Changes:
Using message() instead of writeLines() in
printing functions.
This major release introduces many changes. The three most important ones are (1) changing the settings objects, (2) supporting nesting cohorts, and (3) covariate balance significance testing.
All settings objects have been changed to R6 classes, and are now
used both when calling functions individually and when using
runCmAnalyses(). The main rationale is that this allows third
parties to more easily generate valid settings.
Dropped the cdmVersion argument in
getDbCohortMethodData() and runCmAnalyses().
The version will be identified in the cdm_source
table.
Dropped the trimByIptw() and
trimByPsToEquipoise() functions. Added
equipoiseBounds and maxWeight arguments to
createTrimByPsArgs() so functionality remains.
Dropped the matchOnPsAndCovariates() function and
added a stratificationCovariateIds argument to
createMatchOnPsArgs() so functionality remains.
Dropped the stratifyByPsAndCovariates() function and
added a stratificationCovariateIds argument to
createStratifyByPsArgs() so functionality
remains.
Renamed createStudyPopArgs argument of
createCmAnalysis() to
createStudyPopulationArgs for consistency.
Dropped the deprecated attritionFractionThreshold
argument of createCmDiagnosticThresholds(). The amount of
attrition is not a good measure of generalizability. Use the
generalizability diagnostic instead, which measures the similarity
between the target and analytic cohort characteristics.
Changed the default outcome model type from ‘logistic’ to ‘cox’.
Set the defaults of
createGetDbCohortMethodDataArgs() to those most often
used.
Dropped the firstExposureOnly,
restrictToCommonPeriod, washoutPeriod, and
removeDuplicateSubjects arguments from
CreateStudyPopulationArgs. These were duplicated from
getDbCohortMethodData(), and we’ll keep them only there
from now on.
Added ability to restrict to a nesting cohort (e.g. restricting
drug exposures to a specific indication). See the
nestingCohortId argument in the
createGetDbCohortMethodDataArgs() and
createTargetComparatorOutcomes() functions and the
nestingCohortDatabaseSchema and
nestingCohortTable arguments in the
getDbCohortMethodData() function.
The results schema now includes the
target_comparator table that combines the
target_id, comparator_id, and
nesting_cohort_id into a single unique
target_comparator_id. This new ID is a hash of its
components, allowing results from multiple runs to be combined into a
single database.
In addition to restricting to a nesting cohort, the population can
now also be restricted by age and gender using the minAge,
maxAge, and genderConceptIds arguments of
createGetDbCohortMethodDataArgs().
Added optional significance testing to covariate balance. This avoids failing the balance diagnostic on smaller databases just because of random chance, and was found to be superior in our methods research. This introduces the following changes to the interface:
Added threshold and alpha arguments to
the createComputeCovariateBalanceArgs() function. These do
not impact blinding when running runCmAnalyses() but do add
columns to the balance files, for use when running single studies.
Added sdmAlpha argument to the
createCmDiagnosticThresholds() function.
This adds the sdm_family_wise_min_p and
shared_sdm_family_wise_min_p fields to the
cm_diagnostics_summary table when exporting to CSV. For
now, the default is not to use significance testing, but the family-wise
minimum P can help show whether the diagnostic would have passed when using
it.
Added a new option for the removeDuplicateSubjects
argument: “keep first, truncate to second”. This is similar to “keep
first”, but also truncates the first exposure to stop the day before the
second starts.
Now performing empirical calibration after removing estimates that fail diagnostics. In general this should lead to narrower calibrated confidence intervals.
If high correlation is detected when fitting a propensity model,
but stopOnError = FALSE, the export will show the highly
correlated covariates in the model with extreme coefficients (1e6 *
correlation).
Added the ability to use bootstrap for computing confidence
intervals. See the bootstrapCi and
bootstrapReplicates arguments of
createFitOutcomeModelArgs().
All restrictions on the study populations performed by
getDbCohortMethodData() are now recorded step by step in
the attrition table.
Completely updated all unit tests to increase coverage of functional tests, while also increasing speed.
Renamed the showEquiposeLabel argument of
plotPs() to showEquipoiseLabel.
Updated the trimByPs() function. This now supports
three different trimming functions as described in the
literature.
Added support for grid-with-gradient likelihood profiles. To enable them, use the
following arguments in createFitOutcomeModelArgs():
    profileGrid = seq(log(0.1), log(10), length.out = 8),
    profileBounds = NULL
This adds the gradient field to the
cm_likelihood_profile table when exporting to CSV.
Removed mention of legacy function
grepCovariateNames() from the vignette.
Added citation to the HADES paper to the package.
Dropped insertExportedResultsInSqlite(),
launchResultsViewerUsingSqlite(), and
launchResultsViewer(). The
OhdsiShinyAppBuilder package should be used directly
instead.
Corrected the minDaysAtRisk argument. Days at risk
is now computed as end - start + 1 (end day inclusive).
Added a vignette showing the results schema.
Changed the data type of the
interaction_covariate_id field in the
cm_interaction_result table from INT to
BIGINT.
Fixed trimming by IPTW using trimFraction
argument.
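As a minimal sketch of the corrected days-at-risk rule noted above (end - start + 1, end day inclusive), the base R snippet below uses made-up dates and hypothetical variable names. It does not call CohortMethod, and the >= comparison against minDaysAtRisk is an assumption about how the threshold is applied.

```r
# Hypothetical risk windows for three subjects (illustrative dates only).
riskStart <- as.Date(c("2020-01-02", "2020-01-02", "2020-01-02"))
riskEnd   <- as.Date(c("2020-01-02", "2020-01-03", "2020-01-31"))

# End day inclusive: a window that starts and ends on the same day counts as 1 day.
daysAtRisk <- as.integer(riskEnd - riskStart) + 1L

# Assumed filter: keep windows with at least minDaysAtRisk days.
minDaysAtRisk <- 2L
keep <- daysAtRisk >= minDaysAtRisk

print(data.frame(riskStart, riskEnd, daysAtRisk, keep))
```

Under this rule a same-day window has one day at risk rather than zero, so it is only dropped when minDaysAtRisk is 2 or more.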
Bugfixes:
Reorganized createPs() to be much more efficient for
large study populations (millions of patients).
Reorganized fitOutcomeModel() to be much more
efficient for large study populations (millions of patients).
Bugfixes:
Changes:
createPs() now checks if filtering of the covariate
data is necessary (either because subjects have been removed from the
study population or because excludeCovariateIds or
includeCovariateIds was specified). If no filtering is
required, no extra copy of the covariate data is created, saving IO
time.
Added minimumCaseCount argument to
createCohortMethodDataSimulationProfile().
Preparing for Andromeda 1.0.0: no longer assuming
Andromeda tables are sorted.
Changing dependency from ShinyAppBuilder to
OhdsiShinyAppBuilder.
Bugfixes:
Fixed NA covariate prevalences when calling
createCohortMethodDataSimulationProfile().
Added some optimization to createPs() to prevent
running out of memory for large data objects using Andromeda >=
1.0.0.
Changes:
Updating viewer code to work with newer versions of
OhdsiShinyModules and
ShinyAppBuilder.
Dropped uploadExportedResults() function.
The cohorts argument of
insertExportedResultsInSqlite() has column
cohortId renamed to
cohortDefinitionId.
Added computation of MDRR for logistic models.
Changes:
Bugfixes:
Fix enforceCellCount() applied to covariate balance
when all balance is NA.
Stopping fitting PS model early when either target or comparator is empty. Prevents error when target or comparator is empty, sampling is required, and Cyclops happens to fit a model instead of declaring ILL CONDITIONED.
Message after matching on PS now shows correct number of subjects remaining after matching.
Changes:
runCmAnalyses() with different analyses settings than those
used to create the files. Also cleaning the cache.
Bugfixes:
Fixed bug in parsing covariate filter settings for balance.
Updated vignettes to use latest Capr
functions.
Changes:
The computeCovariateBalance() function now also
computes the standardized difference of means comparing cohorts before and
after PS adjustment, which can inform on generalizability.
Added the getGeneralizabilityTable()
function.
Improved computation of overall standard deviation when computing covariate balance (actually computing the SD instead of taking the mean of the target and comparator). Should produce more accurate balance estimations.
Generated population objects now keep track of likely target
estimator (e.g. ‘ATT’, or ‘ATE’). This informs selection of base
population when calling
getGeneralizabilityTable().
Deprecated the attritionFractionThreshold argument
of the createCmDiagnosticThresholds function, and instead
added the generalizabilitySdmThreshold argument.
The results schema specifications of the
exportToCsv() function have changed:
Removed the attrition_fraction and
attrition_diagnostic fields from the
cm_diagnostics_summary table.
Added the target_estimator field to the
cm_result and cm_interaction_result
tables.
Added the generalizability_max_sdm and
generalizability_diagnostic fields to the
cm_diagnostics_summary table.
Added the mean_before, mean_after,
target_std_diff, comparator_std_diff, and
target_comparator_std_diff fields to both the
cm_covariate_balance and
cm_shared_covariate_balance tables.
Improved speed of covariate balance computation.
Adding one-sided (calibrated) p-values to results summary and results model.
Adding unblind_for_evidence_synthesis field to
cm_diagnostics_summary table.
The cm_diagnostics_summary table now also contains
negative controls.
Bugfixes:
Fixing runCmAnalyses() when using
refitPsForEveryOutcome = TRUE.
Handling edge case when exporting preference distribution and the target or comparator only has 1 subject.
Changes:
Bugfixes:
Fixing matching on PS and other covariates.
Now passing outcome-specific riskWindowEnd argument
in runCmAnalyses() when specified.
Fixed error when calling createStudyPopulation()
with “keep first” when there is only 1 person in the
population.
Changes:
Setting the default Cyclops control object to use
resetCoefficients = TRUE to ensure we always get the exact
same model, irrespective of the number of threads used.
Adding checking of user input to all functions.
Removing deprecated excludeDrugsFromCovariates
argument from getDbCohortMethodData() function.
Removing deprecated oracleTempSchema argument from
getDbCohortMethodData() and runCmAnalyses()
functions.
Removing deprecated addExposureDaysToStart and
addExposureDaysToEnd arguments from
createStudyPopulation() and plotTimeToEvent()
functions.
The removeDuplicateSubjects argument of
getDbCohortMethodData() and
createStudyPopulation() is no longer allowed to be a
boolean.
Adding computeEquipoise() function.
Output likelihood profile as data frame instead of named vector for consistency with other HADES packages.
Added the covariateFilter argument to the
computeCovariateBalance function, to allow balance to be
computed only for a subset of covariates.
Rounding propensity scores to 10 digits to improve reproducibility across operating systems.
Setting covariateCohortDatabaseSchema and
covariateCohortTable of cohort-based covariate builders to
exposureDatabaseSchema and exposureTable,
respectively, if covariateCohortTable is
NULL.
Now computing IPTW in createPs(), and truncating
IPTW can be done in truncateIptw(). The
computeCovariateBalance() function now computes balance
using IPTW if no stratumId column is found in the
population argument.
Removing PS of exactly 0 and exactly 1 when computing the standard deviation of the logit for the matching caliper to allow matching when some subjects have perfectly predictable treatment assignment.
Adding maxRows argument to
computePsAuc() function to improve speed for very large
study populations.
Dropping support for CDM v4.
Major overhaul of the multiple-analyses framework:
Added the createOutcome() function, to be used with
createTargetComparatorOutcomes(). This allows the
priorOutcomeLookback, riskWindowStart,
startAnchor, riskWindowEnd, and
endAnchor arguments to be specified per outcome. These
settings (if provided) will override the settings created using the
createCreateStudyPopulationArgs() function. In addition,
the createOutcome() function has
outcomeOfInterest and trueEffectSize arguments
(see below).
Added the createComputeCovariateBalanceArgs()
function, added the computeSharedCovariateBalance,
computeSharedCovariateBalanceArgs,
computeCovariateBalance, and
computeCovariateBalanceArgs arguments to the
createCmAnalysis() function, and added the
computeSharedBalanceThreads and
computeBalanceThreads arguments to the
runCmAnalyses() function. Together these allow covariate
balance to be computed across a target-comparator-analysis (shared) or for each
target-comparator-analysis-outcome in the runCmAnalyses()
function.
Dropping targetType and comparatorType
options from the createCmAnalysis() function, since the
notion of analysis-specific target and comparator selection strategies
can also be implemented using the analysesToExclude
argument of runCmAnalyses().
Dropping outcomeIdsOfInterest argument of the
runCmAnalyses() function. Instead, the
createOutcome() function now has an
outcomeOfInterest argument.
Settings related to multi-threading are combined into a single
settings object that can be created using the new
createCmMultiThreadingSettings() function.
Dropping prefilterCovariates from
runCmAnalyses(). Prefiltering is now always done when
specific covariates are used in the outcome model.
Removed the summarizeAnalyses() function. Instead,
results are automatically summarized in runCmAnalyses().
The summary can be retrieved using the new
getResultsSummary() and
getInteractionResultsSummary() functions. Empirical
calibration, MDRR, and attrition fraction are automatically
computed.
Changing case in output of getResultsSummary() from
ci95lb and ci95ub to ci95Lb and
ci95Ub.
Added empirical calibration to the
getResultsSummary() function. Controls can be identified by
the trueEffectSize argument in the
createOutcome() function.
Dropping arguments like createPs and
fitOutcomeModel from the createCmAnalysis()
function. Instead, not providing createPsArgs or
fitOutcomeModelArgs is assumed to mean skipping propensity
score creation or outcome model fitting, respectively.
Added the exportToCsv() function for exporting study
results to CSV files that do not contain patient-level information and
can therefore be shared between sites. The
getResultsDataModel() function returns the data model for
these CSV files.
Added the uploadExportedResults() and
insertExportedResultsInSqlite() functions for uploading the
results from the CSV files in a database. The
launchResultsViewer() and
launchResultsViewerUsingSqlite() functions were added for
launching a Shiny app to view the results in the (SQLite)
database.
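The caliper change noted earlier in this list (dropping propensity scores of exactly 0 and exactly 1 before taking the standard deviation of the logit) can be illustrated in base R. This is a sketch with made-up scores. The 0.2 multiplier and all variable names are illustrative assumptions, and the code is not the package's implementation.

```r
# Made-up propensity scores; two subjects have perfectly predictable treatment.
ps <- c(0, 0.12, 0.35, 0.5, 0.64, 0.88, 1)

logit <- function(p) log(p / (1 - p))

# logit(0) is -Inf and logit(1) is Inf, so the SD over all scores is not finite.
sdAll <- sd(logit(ps))

# Dropping exact 0s and 1s first gives a usable SD.
sdFinite <- sd(logit(ps[ps > 0 & ps < 1]))

# Illustrative caliper on the standardized logit scale (0.2 is an assumed value).
caliper <- 0.2 * sdFinite

print(c(sdAll = sdAll, sdFinite = sdFinite, caliper = caliper))
```

Without the filter, a single subject with a score of exactly 0 or 1 would make the caliper undefined and block matching for everyone else.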
Bug fixes:
maxWeight when
performing IPTW.
Changes:
Removed RISCA from the Suggests list. This package was
used for a single unit test, but has a large number of
difficult-to-install dependencies.
Bug fixes:
Cyclops version.
Changes:
Added the analysesToExclude argument to
runCmAnalyses(), allowing users to specify
target-comparator-outcome-analysis combinations to exclude from
execution.
Output of computeCovariateBalance() now also
contains domainId and isBinary
columns.
Added plotCovariatePrevalence() function.
Bug fixes:
Fixed erroneous sample size reported for comparator cohorts when computing covariate balance. (The actual sample size was fine.)
Fixed error when all analyses have
fitOutcomeModel = FALSE.
Fixed attrition counts when using
allowReverseMatch = TRUE
Changes:
Adding highlightExposedEvents and
includePostIndexTime arguments to
plotTimeToEvent().
Adding maxCohortSize argument to the
computeCovariateBalance() function. The target and
comparator cohorts will be downsampled if they are larger, speeding up
computation.
Bug fixes:
plotTimeToEvent() when there are
time periods in plot when nobody is observed.
Changes:
Adding the trimByIptw() function.
Adding the estimator argument to the
fitOutcomeModel() function to select ‘ate’ (average
treatment effect) or ‘att’ (average treatment effect in the treated)
when performing IPTW.
Added the maxWeight argument to the
fitOutcomeModel() function. Weights greater than this value
will be set to this value.
Adding option to use adaptive likelihood profiling, and making this the default.
Adding maxDaysAtRisk argument to the
createStudyPopulation() and
createCreateStudyPopulationArgs() functions.
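To make the estimator and maxWeight entries above concrete, here is a base R sketch of textbook (unstabilized) inverse probability of treatment weights for ATE and ATT, followed by truncation at a maximum weight. The formulas are the standard ones from the IPTW literature and are an assumption here. The package may compute its weights differently (for example, with stabilization), and the data and variable names are made up.

```r
# Made-up data: treatment indicator (1 = target, 0 = comparator) and propensity scores.
treatment <- c(1, 1, 0, 0, 0)
ps <- c(0.8, 0.05, 0.5, 0.9, 0.2)

# ATE: weight everyone by the inverse probability of the treatment actually received.
ateWeights <- ifelse(treatment == 1, 1 / ps, 1 / (1 - ps))

# ATT: treated subjects get weight 1; comparators are weighted by the odds of treatment.
attWeights <- ifelse(treatment == 1, 1, ps / (1 - ps))

# Truncation: weights greater than maxWeight are set to maxWeight.
maxWeight <- 10
ateTruncated <- pmin(ateWeights, maxWeight)

print(data.frame(treatment, ps, ateWeights, attWeights, ateTruncated))
```

In this toy data the second subject (treated despite a propensity score of 0.05) gets an ATE weight of 20, which truncation caps at 10 so that a single unlikely assignment cannot dominate the weighted analysis.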
Bug fixes:
Fixing IPTW.
Fixing error when stratifying and base population is empty (but overall population is not).
Changes:
Dropped insertDbPopulation() function. This didn’t
seem to be used by anyone, and would have required carrying the person
ID throughout the pipeline.
Introducing a new unique person identifier called
personSeqId, generated during data extraction. Person ID is
now downloaded as string to avoid issues with 64-bit integers. Person ID
is not used by CohortMethod, and is provided for reference
only.
Adding log likelihood ratio to outcome model object.
Deprecating oracleTempSchema argument in favor of
tempEmulationSchema in line with new SqlRender
interface.
Bug fixes:
The likelihood profile was still not always included in the outcome model objects.
Fixing issues when IDs are integer64.
Changes:
Bug fixes:
Fixing “argument ‘excludeDrugsFromCovariates’ is missing” error
when calling createGetDbCohortMethodDataArgs() without
deprecated argument excludeDrugsFromCovariates.
More testing and handling of empty exposure cohorts.
Fixing exclusion of covariate IDs when fitting propensity models.
Correct covariate balance computation when covariate values are integers.
Changes:
Bugfixes:
Changes:
Updating documentation: adding literature reference for IPTW, and using new SqlRender interface in vignettes.
Changing default equipoise bounds from 0.25-0.75 to 0.3-0.7 to be consistent with Alec Walker’s original paper.
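For readers unfamiliar with the equipoise bounds mentioned above, the following base R sketch computes preference scores from propensity scores and compares the fraction of subjects in equipoise under the old (0.25-0.75) and new (0.3-0.7) bounds. It assumes the usual preference score definition (the propensity score with the treatment prevalence removed on the logit scale) and inclusive bounds. The data and helper names are made up, and the code does not call CohortMethod.

```r
# Made-up propensity scores and treatment indicator (1 = target, 0 = comparator).
treatment <- c(1, 1, 1, 0, 0, 0, 0, 0)
ps <- c(0.70, 0.55, 0.30, 0.45, 0.35, 0.20, 0.10, 0.05)

logit <- function(p) log(p / (1 - p))
invLogit <- function(x) 1 / (1 + exp(-x))

# Preference score: propensity score adjusted for the overall treatment prevalence.
prevalence <- mean(treatment)
preference <- invLogit(logit(ps) - logit(prevalence))

# Hypothetical helper: TRUE when a preference score lies within the (inclusive) bounds.
inEquipoise <- function(f, bounds) f >= bounds[1] & f <= bounds[2]

oldFraction <- mean(inEquipoise(preference, c(0.25, 0.75)))
newFraction <- mean(inEquipoise(preference, c(0.3, 0.7)))

print(c(oldFraction = oldFraction, newFraction = newFraction))
```

Because the new bounds are narrower, the fraction in equipoise can only stay the same or decrease relative to the old default.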
Bugfixes:
Changes:
Added plotTimeToEvent() function.
Deprecating addExposureDaysToStart and addExposureDaysToEnd arguments, adding new arguments called startAnchor and endAnchor. The hope is this is less confusing.
Fixing random seeds for reproducibility.
Changing default equipoise bounds from 0.25-0.75 to 0.3-0.7 to be consistent with Alec Walker’s original paper.
Bugfixes:
No longer overriding ffmaxbytes and ffbatchbytes in .onLoad. Instead relying on FeatureExtraction to do that. Part of fixing chunk.default error caused by ff package on R v3.6.0 on machines with lots of memory.
Corrected calculation of the original population count when using study end date.
Changes: