Pesquisa | Portal Regional da BVS (teste)

1.

Comparing penalization methods for linear models on large observational health data.

Fridgeirsson, Egill A; Williams, Ross; Rijnbeek, Peter; Suchard, Marc A; Reps, Jenna M.

J Am Med Inform Assoc ; 31(7): 1514-1521, 2024 Jun 20.

Artigo em Inglês | MEDLINE | ID: mdl-38767857

RESUMO

OBJECTIVE: This study evaluates regularization variants in logistic regression (L1, L2, ElasticNet, Adaptive L1, Adaptive ElasticNet, Broken adaptive ridge [BAR], and Iterative hard thresholding [IHT]) for discrimination and calibration performance, focusing on both internal and external validation. MATERIALS AND METHODS: We use data from 5 US claims and electronic health record databases and develop models for various outcomes in a major depressive disorder patient population. We externally validate all models in the other databases. We use a train-test split of 75%/25% and evaluate performance with discrimination and calibration. Statistical analysis for difference in performance uses Friedman's test and critical difference diagrams. RESULTS: Of the 840 models we develop, L1 and ElasticNet emerge as superior in both internal and external discrimination, with a notable AUC difference. BAR and IHT show the best internal calibration, without a clear external calibration leader. ElasticNet typically has larger model sizes than L1. Methods like IHT and BAR, while slightly less discriminative, significantly reduce model complexity. CONCLUSION: L1 and ElasticNet offer the best discriminative performance in logistic regression for healthcare predictions, maintaining robustness across validations. For simpler, more interpretable models, L0-based methods (IHT and BAR) are advantageous, providing greater parsimony and calibration with fewer features. This study aids in selecting suitable regularization techniques for healthcare prediction models, balancing performance, complexity, and interpretability.

Assuntos

Transtorno Depressivo Maior , Humanos , Modelos Logísticos , Registros Eletrônicos de Saúde , Modelos Lineares , Bases de Dados Factuais , Estados Unidos

2.

Similar Risk of Kidney Failure among Patients with Blinding Diseases Who Receive Ranibizumab, Aflibercept, and Bevacizumab: An Observational Health Data Sciences and Informatics Network Study.

Cai, Cindy X; Nishimura, Akihiko; Bowring, Mary G; Westlund, Erik; Tran, Diep; Ng, Jia H; Nagy, Paul; Cook, Michael; McLeggon, Jody-Ann; DuVall, Scott L; Matheny, Michael E; Golozar, Asieh; Ostropolets, Anna; Minty, Evan; Desai, Priya; Bu, Fan; Toy, Brian; Hribar, Michelle; Falconer, Thomas; Zhang, Linying; Lawrence-Archer, Laurence; Boland, Michael V; Goetz, Kerry; Hall, Nathan; Shoaibi, Azza; Reps, Jenna; Sena, Anthony G; Blacketer, Clair; Swerdel, Joel; Jhaveri, Kenar D; Lee, Edward; Gilbert, Zachary; Zeger, Scott L; Crews, Deidra C; Suchard, Marc A; Hripcsak, George; Ryan, Patrick B.

Ophthalmol Retina ; 2024 Mar 20.

Artigo em Inglês | MEDLINE | ID: mdl-38519026

RESUMO

PURPOSE: To characterize the incidence of kidney failure associated with intravitreal anti-VEGF exposure; and compare the risk of kidney failure in patients treated with ranibizumab, aflibercept, or bevacizumab. DESIGN: Retrospective cohort study across 12 databases in the Observational Health Data Sciences and Informatics (OHDSI) network. SUBJECTS: Subjects aged ≥ 18 years with ≥ 3 monthly intravitreal anti-VEGF medications for a blinding disease (diabetic retinopathy, diabetic macular edema, exudative age-related macular degeneration, or retinal vein occlusion). METHODS: The standardized incidence proportions and rates of kidney failure while on treatment with anti-VEGF were calculated. For each comparison (e.g., aflibercept versus ranibizumab), patients from each group were matched 1:1 using propensity scores. Cox proportional hazards models were used to estimate the risk of kidney failure while on treatment. A random effects meta-analysis was performed to combine each database's hazard ratio (HR) estimate into a single network-wide estimate. MAIN OUTCOME MEASURES: Incidence of kidney failure while on anti-VEGF treatment, and time from cohort entry to kidney failure. RESULTS: Of the 6.1 million patients with blinding diseases, 37 189 who received ranibizumab, 39 447 aflibercept, and 163 611 bevacizumab were included; the total treatment exposure time was 161 724 person-years. The average standardized incidence proportion of kidney failure was 678 per 100 000 persons (range, 0-2389), and incidence rate 742 per 100 000 person-years (range, 0-2661). The meta-analysis HR of kidney failure comparing aflibercept with ranibizumab was 1.01 (95% confidence interval [CI], 0.70-1.47; P = 0.45), ranibizumab with bevacizumab 0.95 (95% CI, 0.68-1.32; P = 0.62), and aflibercept with bevacizumab 0.95 (95% CI, 0.65-1.39; P = 0.60). CONCLUSIONS: There was no substantially different relative risk of kidney failure between those who received ranibizumab, bevacizumab, or aflibercept. Practicing ophthalmologists and nephrologists should be aware of the risk of kidney failure among patients receiving intravitreal anti-VEGF medications and that there is little empirical evidence to preferentially choose among the specific intravitreal anti-VEGF agents. FINANCIAL DISCLOSURES: Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.

3.

Health-Analytics Data to Evidence Suite (HADES): Open-Source Software for Observational Research.

Schuemie, Martijn; Reps, Jenna; Black, Adam; Defalco, Frank; Evans, Lee; Fridgeirsson, Egill; Gilbert, James P; Knoll, Chris; Lavallee, Martin; Rao, Gowtham A; Rijnbeek, Peter; Sadowski, Katy; Sena, Anthony; Swerdel, Joel; Williams, Ross D; Suchard, Marc.

Stud Health Technol Inform ; 310: 966-970, 2024 Jan 25.

Artigo em Inglês | MEDLINE | ID: mdl-38269952

RESUMO

The Health-Analytics Data to Evidence Suite (HADES) is an open-source software collection developed by Observational Health Data Sciences and Informatics (OHDSI). It executes directly against healthcare data such as electronic health records and administrative claims, that have been converted to the Observational Medical Outcomes Partnership (OMOP) Common Data Model. Using advanced analytics, HADES performs characterization, population-level causal effect estimation, and patient-level prediction, potentially across a federated data network, allowing patient-level data to remain locally while only aggregated statistics are shared. Designed to run across a wide array of technical environments, including different operating systems and database platforms, HADES uses continuous integration with a large set of unit tests to maintain reliability. HADES implements OHDSI best practices, and is used in almost all published OHDSI studies, including some that have directly informed regulatory decisions.

Assuntos

Ciência de Dados , Registros Eletrônicos de Saúde , Humanos , Bases de Dados Factuais , Reprodutibilidade dos Testes , Software , Estudos Observacionais como Assunto

4.

Privacy-Preserving Federated Model Predicting Bipolar Transition in Patients With Depression: Prediction Model Development Study.

Lee, Dong Yun; Choi, Byungjin; Kim, Chungsoo; Fridgeirsson, Egill; Reps, Jenna; Kim, Myoungsuk; Kim, Jihyeong; Jang, Jae-Won; Rhee, Sang Youl; Seo, Won-Woo; Lee, Seunghoon; Son, Sang Joon; Park, Rae Woong.

J Med Internet Res ; 25: e46165, 2023 07 20.

Artigo em Inglês | MEDLINE | ID: mdl-37471130

RESUMO

BACKGROUND: Mood disorder has emerged as a serious concern for public health; in particular, bipolar disorder has a less favorable prognosis than depression. Although prompt recognition of depression conversion to bipolar disorder is needed, early prediction is challenging due to overlapping symptoms. Recently, there have been attempts to develop a prediction model by using federated learning. Federated learning in medical fields is a method for training multi-institutional machine learning models without patient-level data sharing. OBJECTIVE: This study aims to develop and validate a federated, differentially private multi-institutional bipolar transition prediction model. METHODS: This retrospective study enrolled patients diagnosed with the first depressive episode at 5 tertiary hospitals in South Korea. We developed models for predicting bipolar transition by using data from 17,631 patients in 4 institutions. Further, we used data from 4541 patients for external validation from 1 institution. We created standardized pipelines to extract large-scale clinical features from the 4 institutions without any code modification. Moreover, we performed feature selection in a federated environment for computational efficiency and applied differential privacy to gradient updates. Finally, we compared the federated and the 4 local models developed with each hospital's data on internal and external validation data sets. RESULTS: In the internal data set, 279 out of 17,631 patients showed bipolar disorder transition. In the external data set, 39 out of 4541 patients showed bipolar disorder transition. The average performance of the federated model in the internal test (area under the curve [AUC] 0.726) and external validation (AUC 0.719) data sets was higher than that of the other locally developed models (AUC 0.642-0.707 and AUC 0.642-0.699, respectively). In the federated model, classifications were driven by several predictors such as the Charlson index (low scores were associated with bipolar transition, which may be due to younger age), severe depression, anxiolytics, young age, and visiting months (the bipolar transition was associated with seasonality, especially during the spring and summer months). CONCLUSIONS: We developed and validated a differentially private federated model by using distributed multi-institutional psychiatric data with standardized pipelines in a real-world environment. The federated model performed better than models using local data only.

Assuntos

Transtorno Bipolar , Aprendizado de Máquina , Privacidade , Humanos , Transtorno Bipolar/diagnóstico , Depressão/diagnóstico , Transtornos do Humor , Estudos Retrospectivos

5.

Correction to: Adaptation and validation of a coding algorithm for the Charlson Comorbidity Index in administrative claims data using the SNOMED CT standardized vocabulary.

Fortin, Stephen P; Reps, Jenna; Ryan, Patrick.

BMC Med Inform Decis Mak ; 23(1): 109, 2023 Jun 15.

Artigo em Inglês | MEDLINE | ID: mdl-37322462

6.

Does Using a Stacking Ensemble Method to Combine Multiple Base Learners Within a Database Improve Model Transportability?

Yang, Cynthia; Fridgeirsson, Egill A; Kors, Jan A; Reps, Jenna M; Rijnbeek, Peter R; Wong, Jenna; Williams, Ross D.

Stud Health Technol Inform ; 302: 129-130, 2023 May 18.

Artigo em Inglês | MEDLINE | ID: mdl-37203625

RESUMO

We investigated a stacking ensemble method that combines multiple base learners within a database. The results on external validation across four large databases suggest a stacking ensemble could improve model transportability.

Assuntos

Bases de Dados Factuais

7.

The DELPHI Library: Improving Model Validation, Transparency and Dissemination Through a Centralised Library of Prediction Models.

Williams, Ross D; den Otter, Sicco; Reps, Jenna M; Rijnbeek, Peter R.

Stud Health Technol Inform ; 302: 139-140, 2023 May 18.

Artigo em Inglês | MEDLINE | ID: mdl-37203630

RESUMO

The Deposit, Evaluate and Lookup Predictive Healthcare Information (DELPHI) library provides a centralised location for the depositing, exploring and analysing of patient-level prediction models that are compatible with data mapped to the observational medical outcomes partnership common data model.

8.

Machine Learning and Real-World Data to Predict Lung Cancer Risk in Routine Care.

Chandran, Urmila; Reps, Jenna; Yang, Robert; Vachani, Anil; Maldonado, Fabien; Kalsekar, Iftekhar.

Cancer Epidemiol Biomarkers Prev ; 32(3): 337-343, 2023 03 06.

Artigo em Inglês | MEDLINE | ID: mdl-36576991

RESUMO

BACKGROUND: This study used machine learning to develop a 3-year lung cancer risk prediction model with large real-world data in a mostly younger population. METHODS: Over 4.7 million individuals, aged 45 to 65 years with no history of any cancer or lung cancer screening, diagnostic, or treatment procedures, with an outpatient visit in 2013 were identified in Optum's de-identified Electronic Health Record (EHR) dataset. A least absolute shrinkage and selection operator model was fit using all available data in the 365 days prior. Temporal validation was assessed with recent data. External validation was assessed with data from Mercy Health Systems EHR and Optum's de-identified Clinformatics Data Mart Database. Racial inequities in model discrimination were assessed with xAUCs. RESULTS: The model AUC was 0.76. Top predictors included age, smoking, race, ethnicity, and diagnosis of chronic obstructive pulmonary disease. The model identified a high-risk group with lung cancer incidence 9 times the average cohort incidence, representing 10% of patients with lung cancer. Model performed well temporally and externally, while performance was reduced for Asians and Hispanics. CONCLUSIONS: A high-dimensional model trained using big data identified a subset of patients with high lung cancer risk. The model demonstrated transportability to EHR and claims data, while underscoring the need to assess racial disparities when using machine learning methods. IMPACT: This internally and externally validated real-world data-based lung cancer prediction model is available on an open-source platform for broad sharing and application. Model integration into an EHR system could minimize physician burden by automating identification of high-risk patients.

Assuntos

Neoplasias Pulmonares , Doença Pulmonar Obstrutiva Crônica , Humanos , Detecção Precoce de Câncer , Incidência , Aprendizado de Máquina , Registros Eletrônicos de Saúde

9.

External validation of existing dementia prediction models on observational health data.

John, Luis H; Kors, Jan A; Fridgeirsson, Egill A; Reps, Jenna M; Rijnbeek, Peter R.

BMC Med Res Methodol ; 22(1): 311, 2022 12 05.

Artigo em Inglês | MEDLINE | ID: mdl-36471238

RESUMO

BACKGROUND: Many dementia prediction models have been developed, but only few have been externally validated, which hinders clinical uptake and may pose a risk if models are applied to actual patients regardless. Externally validating an existing prediction model is a difficult task, where we mostly rely on the completeness of model reporting in a published article. In this study, we aim to externally validate existing dementia prediction models. To that end, we define model reporting criteria, review published studies, and externally validate three well reported models using routinely collected health data from administrative claims and electronic health records. METHODS: We identified dementia prediction models that were developed between 2011 and 2020 and assessed if they could be externally validated given a set of model criteria. In addition, we externally validated three of these models (Walters' Dementia Risk Score, Mehta's RxDx-Dementia Risk Index, and Nori's ADRD dementia prediction model) on a network of six observational health databases from the United States, United Kingdom, Germany and the Netherlands, including the original development databases of the models. RESULTS: We reviewed 59 dementia prediction models. All models reported the prediction method, development database, and target and outcome definitions. Less frequently reported by these 59 prediction models were predictor definitions (52 models) including the time window in which a predictor is assessed (21 models), predictor coefficients (20 models), and the time-at-risk (42 models). The validation of the model by Walters (development c-statistic: 0.84) showed moderate transportability (0.67-0.76 c-statistic). The Mehta model (development c-statistic: 0.81) transported well to some of the external databases (0.69-0.79 c-statistic). The Nori model (development AUROC: 0.69) transported well (0.62-0.68 AUROC) but performed modestly overall. Recalibration showed improvements for the Walters and Nori models, while recalibration could not be assessed for the Mehta model due to unreported baseline hazard. CONCLUSION: We observed that reporting is mostly insufficient to fully externally validate published dementia prediction models, and therefore, it is uncertain how well these models would work in other clinical settings. We emphasize the importance of following established guidelines for reporting clinical prediction models. We recommend that reporting should be more explicit and have external validation in mind if the model is meant to be applied in different settings.

Assuntos

Demência , Humanos , Reino Unido , Fatores de Risco , Demência/diagnóstico , Demência/epidemiologia , Países Baixos/epidemiologia , Alemanha , Prognóstico

10.

Adaptation and validation of a coding algorithm for the Charlson Comorbidity Index in administrative claims data using the SNOMED CT standardized vocabulary.

Fortin, Stephen P; Reps, Jenna; Ryan, Patrick.

BMC Med Inform Decis Mak ; 22(1): 261, 2022 10 07.

Artigo em Inglês | MEDLINE | ID: mdl-36207711

RESUMO

OBJECTIVES: The Charlson comorbidity index (CCI), the most ubiquitous comorbid risk score, predicts one-year mortality among hospitalized patients and provides a single aggregate measure of patient comorbidity. The Quan adaptation of the CCI revised the CCI coding algorithm for applications to administrative claims data using the International Classification of Diseases (ICD). The purpose of the current study is to adapt and validate a coding algorithm for the CCI using the SNOMED CT standardized vocabulary, one of the most commonly used vocabularies for data collection in healthcare databases in the U.S. METHODS: The SNOMED CT coding algorithm for the CCI was adapted through the direct translation of the Quan coding algorithms followed by manual curation by clinical experts. The performance of the SNOMED CT and Quan coding algorithms were compared in the context of a retrospective cohort study of inpatient visits occurring during the calendar years of 2013 and 2018 contained in two U.S. administrative claims databases. Differences in the CCI or frequency of individual comorbid conditions were assessed using standardized mean differences (SMD). Performance in predicting one-year mortality among hospitalized patients was measured based on the c-statistic of logistic regression models. RESULTS: For each database and calendar year combination, no significant differences in the CCI or frequency of individual comorbid conditions were observed between vocabularies (SMD ≤ 0.10). Specifically, the difference in CCI measured using the SNOMED CT vs. Quan coding algorithms was highest in MDCD in 2013 (3.75 vs. 3.6; SMD = 0.03) and lowest in DOD in 2018 (3.93 vs. 3.86; SMD = 0.02). Similarly, as indicated by the c-statistic, there was no evidence of a difference in the performance between coding algorithms in predicting one-year mortality (SNOMED CT vs. Quan coding algorithms, range: 0.725-0.789 vs. 0.723-0.787, respectively). A total of 700 of 5,348 (13.1%) ICD code mappings were inconsistent between coding algorithms. The most common cause of discrepant codes was multiple ICD codes mapping to a SNOMED CT code (n = 560) of which 213 were deemed clinically relevant thereby leading to information gain. CONCLUSION: The current study repurposed an important tool for conducting observational research to use the SNOMED CT standardized vocabulary.

Assuntos

Systematized Nomenclature of Medicine , Vocabulário , Algoritmos , Comorbidade , Humanos , Classificação Internacional de Doenças , Estudos Retrospectivos

11.

International cohort study indicates no association between alpha-1 blockers and susceptibility to COVID-19 in benign prostatic hyperplasia patients.

Nishimura, Akihiko; Xie, Junqing; Kostka, Kristin; Duarte-Salles, Talita; Fernández Bertolín, Sergio; Aragón, María; Blacketer, Clair; Shoaibi, Azza; DuVall, Scott L; Lynch, Kristine; Matheny, Michael E; Falconer, Thomas; Morales, Daniel R; Conover, Mitchell M; Chan You, Seng; Pratt, Nicole; Weaver, James; Sena, Anthony G; Schuemie, Martijn J; Reps, Jenna; Reich, Christian; Rijnbeek, Peter R; Ryan, Patrick B; Hripcsak, George; Prieto-Alhambra, Daniel; Suchard, Marc A.

Front Pharmacol ; 13: 945592, 2022.

Artigo em Inglês | MEDLINE | ID: mdl-36188566

RESUMO

Purpose: Alpha-1 blockers, often used to treat benign prostatic hyperplasia (BPH), have been hypothesized to prevent COVID-19 complications by minimising cytokine storm release. The proposed treatment based on this hypothesis currently lacks support from reliable real-world evidence, however. We leverage an international network of large-scale healthcare databases to generate comprehensive evidence in a transparent and reproducible manner. Methods: In this international cohort study, we deployed electronic health records from Spain (SIDIAP) and the United States (Department of Veterans Affairs, Columbia University Irving Medical Center, IQVIA OpenClaims, Optum DOD, Optum EHR). We assessed association between alpha-1 blocker use and risks of three COVID-19 outcomes-diagnosis, hospitalization, and hospitalization requiring intensive services-using a prevalent-user active-comparator design. We estimated hazard ratios using state-of-the-art techniques to minimize potential confounding, including large-scale propensity score matching/stratification and negative control calibration. We pooled database-specific estimates through random effects meta-analysis. Results: Our study overall included 2.6 and 0.46 million users of alpha-1 blockers and of alternative BPH medications. We observed no significant difference in their risks for any of the COVID-19 outcomes, with our meta-analytic HR estimates being 1.02 (95% CI: 0.92-1.13) for diagnosis, 1.00 (95% CI: 0.89-1.13) for hospitalization, and 1.15 (95% CI: 0.71-1.88) for hospitalization requiring intensive services. Conclusion: We found no evidence of the hypothesized reduction in risks of the COVID-19 outcomes from the prevalent-use of alpha-1 blockers-further research is needed to identify effective therapies for this novel disease.

12.

Development of multivariable models to predict perinatal depression before and after delivery using patient reported survey responses at weeks 4-10 of pregnancy.

Reps, Jenna M; Wilcox, Marsha; McGee, Beth Ann; Leonte, Marie; LaCross, Lauren; Wildenhaus, Kevin.

BMC Pregnancy Childbirth ; 22(1): 442, 2022 May 26.

Artigo em Inglês | MEDLINE | ID: mdl-35619056

RESUMO

BACKGROUND: Perinatal depression is estimated to affect ~ 12% of pregnancies and is linked to numerous negative outcomes. There is currently no model to predict perinatal depression at multiple time-points during and after pregnancy using variables ascertained early into pregnancy. METHODS: A prospective cohort design where 858 participants filled in a baseline self-reported survey at week 4-10 of pregnancy (that included social economics, health history, various psychiatric measures), with follow-up until 3 months after delivery. Our primary outcome was an Edinburgh Postnatal Depression Score (EPDS) score of 12 or more (a proxy for perinatal depression) assessed during each trimester and again at two time periods after delivery. Five gradient boosting machines were trained to predict the risk of having EPDS score > = 12 at each of the five follow-up periods. The predictors consisted of 21 variables from 3 validated psychometric scales. As a sensitivity analysis, we also investigated different predictor sets that contained: i) 17 of the 21 variables predictors by only including two of the psychometric scales and ii) including 143 additional social economics and health history predictors, resulting in 164 predictors. RESULTS: We developed five prognostic models: PND-T1 (trimester 1), PND-T2 (trimester 2), PND-T3 (trimester 3), PND-A1 (after delivery 1) and PND-A2 (delayed onset after delivery) that calculate personalised risks while only requiring that women be asked 21 questions from 3 validated psychometric scales at weeks 4-10 of pregnancy. C-statistics (also known as AUC) ranged between 0.69 (95% CI 0.65-0.73) and 0.77 (95% CI 0.74-0.80). At 50% sensitivity the positive predictive value ranged between 30%-50% across the models, generally identifying groups of patients with double the average risk. Models trained using the 17 predictors and 164 predictors did not improve model performance compared to the models trained using 21 predictors. CONCLUSIONS: The five models can predict risk of perinatal depression within each trimester and in two post-natal periods using survey responses as early as week 4 of pregnancy with modest performance. The models need to be externally validated and prospectively tested to ensure generalizability to any pregnant patient.

Assuntos

Depressão Pós-Parto , Transtorno Depressivo , Depressão/diagnóstico , Depressão/psicologia , Depressão Pós-Parto/psicologia , Feminino , Humanos , Medidas de Resultados Relatados pelo Paciente , Gravidez , Estudos Prospectivos

13.

Learning patient-level prediction models across multiple healthcare databases: evaluation of ensembles for increasing model transportability.

Reps, Jenna Marie; Williams, Ross D; Schuemie, Martijn J; Ryan, Patrick B; Rijnbeek, Peter R.

BMC Med Inform Decis Mak ; 22(1): 142, 2022 05 25.

Artigo em Inglês | MEDLINE | ID: mdl-35614485

RESUMO

BACKGROUND: Prognostic models that are accurate could help aid medical decision making. Large observational databases often contain temporal medical data for large and diverse populations of patients. It may be possible to learn prognostic models using the large observational data. Often the performance of a prognostic model undesirably worsens when transported to a different database (or into a clinical setting). In this study we investigate different ensemble approaches that combine prognostic models independently developed using different databases (a simple federated learning approach) to determine whether ensembles that combine models developed across databases can improve model transportability (perform better in new data than single database models)? METHODS: For a given prediction question we independently trained five single database models each using a different observational healthcare database. We then developed and investigated numerous ensemble models (fusion, stacking and mixture of experts) that combined the different database models. Performance of each model was investigated via discrimination and calibration using a leave one dataset out technique, i.e., hold out one database to use for validation and use the remaining four datasets for model development. The internal validation of a model developed using the hold out database was calculated and presented as the 'internal benchmark' for comparison. RESULTS: In this study the fusion ensembles generally outperformed the single database models when transported to a previously unseen database and the performances were more consistent across unseen databases. Stacking ensembles performed poorly in terms of discrimination when the labels in the unseen database were limited. Calibration was consistently poor when both ensembles and single database models were applied to previously unseen databases. CONCLUSION: A simple federated learning approach that implements ensemble techniques to combine models independently developed across different databases for the same prediction question may improve the discriminative performance in new data (new database or clinical setting) but will need to be recalibrated using the new data. This could help medical decision making by improving prognostic model performance.

Assuntos

Atenção à Saúde , Calibragem , Bases de Dados Factuais , Humanos , Prognóstico

14.

Applying Machine Learning in Distributed Data Networks for Pharmacoepidemiologic and Pharmacovigilance Studies: Opportunities, Challenges, and Considerations.

Wong, Jenna; Prieto-Alhambra, Daniel; Rijnbeek, Peter R; Desai, Rishi J; Reps, Jenna M; Toh, Sengwee.

Drug Saf ; 45(5): 493-510, 2022 05.

Artigo em Inglês | MEDLINE | ID: mdl-35579813

RESUMO

Increasing availability of electronic health databases capturing real-world experiences with medical products has garnered much interest in their use for pharmacoepidemiologic and pharmacovigilance studies. The traditional practice of having numerous groups use single databases to accomplish similar tasks and address common questions about medical products can be made more efficient through well-coordinated multi-database studies, greatly facilitated through distributed data network (DDN) architectures. Access to larger amounts of electronic health data within DDNs has created a growing interest in using data-adaptive machine learning (ML) techniques that can automatically model complex associations in high-dimensional data with minimal human guidance. However, the siloed storage and diverse nature of the databases in DDNs create unique challenges for using ML. In this paper, we discuss opportunities, challenges, and considerations for applying ML in DDNs for pharmacoepidemiologic and pharmacovigilance studies. We first discuss major types of activities performed by DDNs and how ML may be used. Next, we discuss practical data-related factors influencing how DDNs work in practice. We then combine these discussions and jointly consider how opportunities for ML are affected by practical data-related factors for DDNs, leading to several challenges. We present different approaches for addressing these challenges and highlight efforts that real-world DDNs have taken or are currently taking to help mitigate them. Despite these challenges, the time is ripe for the emerging interest to use ML in DDNs, and the utility of these data-adaptive modeling techniques in pharmacoepidemiologic and pharmacovigilance studies will likely continue to increase in the coming years.

Assuntos

Aprendizado de Máquina , Farmacovigilância , Bases de Dados Factuais , Humanos , Farmacoepidemiologia

15.

Using Iterative Pairwise External Validation to Contextualize Prediction Model Performance: A Use Case Predicting 1-Year Heart Failure Risk in Patients with Diabetes Across Five Data Sources.

Williams, Ross D; Reps, Jenna M; Kors, Jan A; Ryan, Patrick B; Steyerberg, Ewout; Verhamme, Katia M; Rijnbeek, Peter R.

Drug Saf ; 45(5): 563-570, 2022 05.

Artigo em Inglês | MEDLINE | ID: mdl-35579818

RESUMO

INTRODUCTION: External validation of prediction models is increasingly being seen as a minimum requirement for acceptance in clinical practice. However, the lack of interoperability of healthcare databases has been the biggest barrier to this occurring on a large scale. Recent improvements in database interoperability enable a standardized analytical framework for model development and external validation. External validation of a model in a new database lacks context, whereby the external validation can be compared with a benchmark in this database. Iterative pairwise external validation (IPEV) is a framework that uses a rotating model development and validation approach to contextualize the assessment of performance across a network of databases. As a use case, we predicted 1-year risk of heart failure in patients with type 2 diabetes mellitus. METHODS: The method follows a two-step process involving (1) development of baseline and data-driven models in each database according to best practices and (2) validation of these models across the remaining databases. We introduce a heatmap visualization that supports the assessment of the internal and external model performance in all available databases. As a use case, we developed and validated models to predict 1-year risk of heart failure in patients initializing a second pharmacological intervention for type 2 diabetes mellitus. We leveraged the power of the Observational Medical Outcomes Partnership common data model to create an open-source software package to increase the consistency, speed, and transparency of this process. RESULTS: A total of 403,187 patients from five databases were included in the study. We developed five models that, when assessed internally, had a discriminative performance ranging from 0.73 to 0.81 area under the receiver operating characteristic curve with acceptable calibration. When we externally validated these models in a new database, three models achieved consistent performance and in context often performed similarly to models developed in the database itself. The visualization of IPEV provided valuable insights. From this, we identified the model developed in the Commercial Claims and Encounters (CCAE) database as the best performing model overall. CONCLUSION: Using IPEV lends weight to the model development process. The rotation of development through multiple databases provides context to model assessment, leading to improved understanding of transportability and generalizability. The inclusion of a baseline model in all modelling steps provides further context to the performance gains of increasing model complexity. The CCAE model was identified as a candidate for clinical use. The use case demonstrates that IPEV provides a huge opportunity in a new era of standardised data and analytics to improve insight into and trust in prediction models at an unprecedented scale.

Assuntos

Diabetes Mellitus Tipo 2 , Insuficiência Cardíaca , Bases de Dados Factuais , Diabetes Mellitus Tipo 2/epidemiologia , Insuficiência Cardíaca/epidemiologia , Humanos , Software

16.

Logistic regression models for patient-level prediction based on massive observational data: Do we need all data?

John, Luis H; Kors, Jan A; Reps, Jenna M; Ryan, Patrick B; Rijnbeek, Peter R.

Int J Med Inform ; 163: 104762, 2022 07.

Artigo em Inglês | MEDLINE | ID: mdl-35429722

RESUMO

OBJECTIVE: Provide guidance on sample size considerations for developing predictive models by empirically establishing the adequate sample size, which balances the competing objectives of improving model performance and reducing model complexity as well as computational requirements. MATERIALS AND METHODS: We empirically assess the effect of sample size on prediction performance and model complexity by generating learning curves for 81 prediction problems (23 outcomes predicted in a depression cohort, 58 outcomes predicted in a hypertension cohort) in three large observational health databases, requiring training of 17,248 prediction models. The adequate sample size was defined as the sample size for which the performance of a model equalled the maximum model performance minus a small threshold value. RESULTS: The adequate sample size achieves a median reduction of the number of observations of 9.5%, 37.3%, 58.5%, and 78.5% for the thresholds of 0.001, 0.005, 0.01, and 0.02, respectively. The median reduction of the number of predictors in the models was 8.6%, 32.2%, 48.2%, and 68.3% for the thresholds of 0.001, 0.005, 0.01, and 0.02, respectively. DISCUSSION: Based on our results a conservative, yet significant, reduction in sample size and model complexity can be estimated for future prediction work. Though, if a researcher is willing to generate a learning curve a much larger reduction of the model complexity may be possible as suggested by a large outcome-dependent variability. CONCLUSION: Our results suggest that in most cases only a fraction of the available data was sufficient to produce a model close to the performance of one developed on the full data set, but with a substantially reduced model complexity.

Assuntos

Modelos Logísticos , Estudos de Coortes , Humanos , Tamanho da Amostra

17.

DLMM as a lossless one-shot algorithm for collaborative multi-site distributed linear mixed models.

Luo, Chongliang; Islam, Md Nazmul; Sheils, Natalie E; Buresh, John; Reps, Jenna; Schuemie, Martijn J; Ryan, Patrick B; Edmondson, Mackenzie; Duan, Rui; Tong, Jiayi; Marks-Anglin, Arielle; Bian, Jiang; Chen, Zhaoyi; Duarte-Salles, Talita; Fernández-Bertolín, Sergio; Falconer, Thomas; Kim, Chungsoo; Park, Rae Woong; Pfohl, Stephen R; Shah, Nigam H; Williams, Andrew E; Xu, Hua; Zhou, Yujia; Lautenbach, Ebbing; Doshi, Jalpa A; Werner, Rachel M; Asch, David A; Chen, Yong.

Nat Commun ; 13(1): 1678, 2022 03 30.

Artigo em Inglês | MEDLINE | ID: mdl-35354802

RESUMO

Linear mixed models are commonly used in healthcare-based association analyses for analyzing multi-site data with heterogeneous site-specific random effects. Due to regulations for protecting patients' privacy, sensitive individual patient data (IPD) typically cannot be shared across sites. We propose an algorithm for fitting distributed linear mixed models (DLMMs) without sharing IPD across sites. This algorithm achieves results identical to those achieved using pooled IPD from multiple sites (i.e., the same effect size and standard error estimates), hence demonstrating the lossless property. The algorithm requires each site to contribute minimal aggregated data in only one round of communication. We demonstrate the lossless property of the proposed DLMM algorithm by investigating the associations between demographic and clinical characteristics and length of hospital stay in COVID-19 patients using administrative claims from the UnitedHealth Group Clinical Discovery Database. We extend this association study by incorporating 120,609 COVID-19 patients from 11 collaborative data sources worldwide.

Assuntos

COVID-19 , Algoritmos , COVID-19/epidemiologia , Confidencialidade , Bases de Dados Factuais , Humanos , Modelos Lineares

18.

Seek COVER: using a disease proxy to rapidly develop and validate a personalized risk calculator for COVID-19 outcomes in an international network.

Williams, Ross D; Markus, Aniek F; Yang, Cynthia; Duarte-Salles, Talita; DuVall, Scott L; Falconer, Thomas; Jonnagaddala, Jitendra; Kim, Chungsoo; Rho, Yeunsook; Williams, Andrew E; Machado, Amanda Alberga; An, Min Ho; Aragón, María; Areia, Carlos; Burn, Edward; Choi, Young Hwa; Drakos, Iannis; Abrahão, Maria Tereza Fernandes; Fernández-Bertolín, Sergio; Hripcsak, George; Kaas-Hansen, Benjamin Skov; Kandukuri, Prasanna L; Kors, Jan A; Kostka, Kristin; Liaw, Siaw-Teng; Lynch, Kristine E; Machnicki, Gerardo; Matheny, Michael E; Morales, Daniel; Nyberg, Fredrik; Park, Rae Woong; Prats-Uribe, Albert; Pratt, Nicole; Rao, Gowtham; Reich, Christian G; Rivera, Marcela; Seinen, Tom; Shoaibi, Azza; Spotnitz, Matthew E; Steyerberg, Ewout W; Suchard, Marc A; You, Seng Chan; Zhang, Lin; Zhou, Lili; Ryan, Patrick B; Prieto-Alhambra, Daniel; Reps, Jenna M; Rijnbeek, Peter R.

BMC Med Res Methodol ; 22(1): 35, 2022 01 30.

Artigo em Inglês | MEDLINE | ID: mdl-35094685

RESUMO

BACKGROUND: We investigated whether we could use influenza data to develop prediction models for COVID-19 to increase the speed at which prediction models can reliably be developed and validated early in a pandemic. We developed COVID-19 Estimated Risk (COVER) scores that quantify a patient's risk of hospital admission with pneumonia (COVER-H), hospitalization with pneumonia requiring intensive services or death (COVER-I), or fatality (COVER-F) in the 30-days following COVID-19 diagnosis using historical data from patients with influenza or flu-like symptoms and tested this in COVID-19 patients. METHODS: We analyzed a federated network of electronic medical records and administrative claims data from 14 data sources and 6 countries containing data collected on or before 4/27/2020. We used a 2-step process to develop 3 scores using historical data from patients with influenza or flu-like symptoms any time prior to 2020. The first step was to create a data-driven model using LASSO regularized logistic regression, the covariates of which were used to develop aggregate covariates for the second step where the COVER scores were developed using a smaller set of features. These 3 COVER scores were then externally validated on patients with 1) influenza or flu-like symptoms and 2) confirmed or suspected COVID-19 diagnosis across 5 databases from South Korea, Spain, and the United States. Outcomes included i) hospitalization with pneumonia, ii) hospitalization with pneumonia requiring intensive services or death, and iii) death in the 30 days after index date. RESULTS: Overall, 44,507 COVID-19 patients were included for model validation. We identified 7 predictors (history of cancer, chronic obstructive pulmonary disease, diabetes, heart disease, hypertension, hyperlipidemia, kidney disease) which combined with age and sex discriminated which patients would experience any of our three outcomes. The models achieved good performance in influenza and COVID-19 cohorts. For COVID-19 the AUC ranges were, COVER-H: 0.69-0.81, COVER-I: 0.73-0.91, and COVER-F: 0.72-0.90. Calibration varied across the validations with some of the COVID-19 validations being less well calibrated than the influenza validations. CONCLUSIONS: This research demonstrated the utility of using a proxy disease to develop a prediction model. The 3 COVER models with 9-predictors that were developed using influenza data perform well for COVID-19 patients for predicting hospitalization, intensive services, and fatality. The scores showed good discriminatory performance which transferred well to the COVID-19 population. There was some miscalibration in the COVID-19 validations, which is potentially due to the difference in symptom severity between the two diseases. A possible solution for this is to recalibrate the models in each location before use.

Assuntos

COVID-19 , Influenza Humana , Pneumonia , Teste para COVID-19 , Humanos , Influenza Humana/epidemiologia , SARS-CoV-2 , Estados Unidos

19.

90-Day all-cause mortality can be predicted following a total knee replacement: an international, network study to develop and validate a prediction model.

Williams, Ross D; Reps, Jenna M; Rijnbeek, Peter R; Ryan, Patrick B; Prieto-Alhambra, Daniel.

Knee Surg Sports Traumatol Arthrosc ; 30(9): 3068-3075, 2022 Sep.

Artigo em Inglês | MEDLINE | ID: mdl-34870731

RESUMO

PURPOSE: The purpose of this study was to develop and validate a prediction model for 90-day mortality following a total knee replacement (TKR). TKR is a safe and cost-effective surgical procedure for treating severe knee osteoarthritis (OA). Although complications following surgery are rare, prediction tools could help identify high-risk patients who could be targeted with preventative interventions. The aim was to develop and validate a simple model to help inform treatment choices. METHODS: A mortality prediction model for knee OA patients following TKR was developed and externally validated using a US claims database and a UK general practice database. The target population consisted of patients undergoing a primary TKR for knee OA, aged ≥ 40 years and registered for ≥ 1 year before surgery. LASSO logistic regression models were developed for post-operative (90-day) mortality. A second mortality model was developed with a reduced feature set to increase interpretability and usability. RESULTS: A total of 193,615 patients were included, with 40,950 in The Health Improvement Network (THIN) database and 152,665 in Optum. The full model predicting 90-day mortality yielded AUROC of 0.78 when trained in OPTUM and 0.70 when externally validated on THIN. The 12 variable model achieved internal AUROC of 0.77 and external AUROC of 0.71 in THIN. CONCLUSIONS: A simple prediction model based on sex, age, and 10 comorbidities that can identify patients at high risk of short-term mortality following TKR was developed that demonstrated good, robust performance. The 12-feature mortality model is easily implemented and the performance suggests it could be used to inform evidence based shared decision-making prior to surgery and targeting prophylaxis for those at high risk. LEVEL OF EVIDENCE: III.

Assuntos

Artroplastia do Joelho , Osteoartrite do Joelho , Criança , Bases de Dados Factuais , Humanos

20.

Investigating the impact of development and internal validation design when training prognostic models using a retrospective cohort in big US observational healthcare data.

Reps, Jenna M; Ryan, Patrick; Rijnbeek, P R.

BMJ Open ; 11(12): e050146, 2021 12 24.

Artigo em Inglês | MEDLINE | ID: mdl-34952871

RESUMO

OBJECTIVE: The internal validation of prediction models aims to quantify the generalisability of a model. We aim to determine the impact, if any, that the choice of development and internal validation design has on the internal performance bias and model generalisability in big data (n~500 000). DESIGN: Retrospective cohort. SETTING: Primary and secondary care; three US claims databases. PARTICIPANTS: 1 200 769 patients pharmaceutically treated for their first occurrence of depression. METHODS: We investigated the impact of the development/validation design across 21 real-world prediction questions. Model discrimination and calibration were assessed. We trained LASSO logistic regression models using US claims data and internally validated the models using eight different designs: 'no test/validation set', 'test/validation set' and cross validation with 3-fold, 5-fold or 10-fold with and without a test set. We then externally validated each model in two new US claims databases. We estimated the internal validation bias per design by empirically comparing the differences between the estimated internal performance and external performance. RESULTS: The differences between the models' internal estimated performances and external performances were largest for the 'no test/validation set' design. This indicates even with large data the 'no test/validation set' design causes models to overfit. The seven alternative designs included some validation process to select the hyperparameters and a fair testing process to estimate internal performance. These designs had similar internal performance estimates and performed similarly when externally validated in the two external databases. CONCLUSIONS: Even with big data, it is important to use some validation process to select the optimal hyperparameters and fairly assess internal validation using a test set or cross-validation.

Assuntos

Atenção à Saúde , Viés , Humanos , Modelos Logísticos , Prognóstico , Estudos Retrospectivos

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA