The U.S. Food and Drug Administration's Sentinel Initiative "modular programs" have been shown to replicate findings from conventional protocol-driven, custom-programmed studies. One such parallel assessment, comparing dabigatran and warfarin for selected outcomes, produced concordant findings for three of four study outcomes. The effect estimates and confidence intervals for the fourth outcome, acute myocardial infarction, showed more variability than those for the other outcomes. This paper evaluates the potential sources of the variability that led to this unexpected divergence in findings.
We systematically compared the two studies and evaluated programming differences and their potential impact, using a different dataset that allowed more granular data access for investigation. We reviewed the output at each of five main processing steps common to both study programs: cohort identification, propensity score estimation, propensity score matching, patient follow-up, and risk estimation.
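To illustrate one of these steps, the propensity score matching stage can be sketched as greedy 1:1 nearest-neighbor matching on the estimated propensity score within a caliper. This is a hypothetical, minimal implementation for exposition only; the function name, caliper value, and matching heuristic are assumptions, and the actual Sentinel modular programs and custom study code implement matching differently.

```python
def match_1to1(treated, comparators, caliper=0.05):
    """Greedy 1:1 nearest-neighbor propensity score matching.

    treated, comparators: lists of (patient_id, propensity_score).
    Returns a list of (treated_id, comparator_id) matched pairs;
    unmatched treated patients are dropped.
    """
    available = dict(comparators)  # comparator_id -> propensity score
    pairs = []
    # Match treated patients in descending propensity score order,
    # a common heuristic so hard-to-match patients are handled first.
    for tid, tps in sorted(treated, key=lambda x: -x[1]):
        best_id, best_dist = None, caliper
        for cid, cps in available.items():
            dist = abs(tps - cps)
            if dist <= best_dist:
                best_id, best_dist = cid, dist
        if best_id is not None:
            pairs.append((tid, best_id))
            del available[best_id]  # matching without replacement
    return pairs
```

Even in this toy form, the example shows where implementations can diverge: the match order, the caliper width, and whether matching is done with or without replacement all change which pairs are formed.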
Our findings point to several design features that warrant greater investigator attention when performing observational database studies: (a) the handling of recorded events (eg, diagnoses, procedures, and dispensings) that co-occur on the index date of study drug dispensing, both in cohort eligibility criteria and in propensity score estimation, and (b) the construction of treatment episodes for study drugs of interest that have more complex dispensing patterns.
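The episode-construction issue in point (b) can be sketched as collapsing a patient's dispensings into continuous treatment episodes, bridging small gaps between the end of one supply and the next fill. This is a minimal illustration under assumed parameters (the 14-day gap allowance and the handling of overlapping supplies are hypothetical choices, not the rules used by either study program):

```python
from datetime import date, timedelta

def build_episodes(dispensings, gap_days=14):
    """Collapse drug dispensings into continuous treatment episodes.

    dispensings: list of (start_date, days_supplied) tuples.
    A dispensing extends the current episode when its start date falls
    within gap_days of the prior supply's end; otherwise a new episode
    begins. Returns a list of (episode_start, episode_end) tuples.
    """
    if not dispensings:
        return []
    dispensings = sorted(dispensings)
    first_start, first_supply = dispensings[0]
    start = first_start
    end = first_start + timedelta(days=first_supply)
    episodes = []
    for d_start, supply in dispensings[1:]:
        if (d_start - end).days <= gap_days:
            # Contiguous or overlapping fill extends the episode;
            # overlapping days of supply are not stockpiled here.
            end = max(end, d_start + timedelta(days=supply))
        else:
            episodes.append((start, end))
            start, end = d_start, d_start + timedelta(days=supply)
    episodes.append((start, end))
    return episodes
```

Small differences in such rules, for instance whether overlapping supplies are stockpiled or truncated and how large a gap is bridged, can shift episode boundaries and therefore person-time at risk, which is one way two otherwise similar programs can diverge.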
More precise and unambiguous operational definitions of all study parameters will increase transparency and reproducibility in observational database studies.