
1.

An operational guide to translational clinical machine learning in academic medical centers.

Poddar, Mukund; Marwaha, Jayson S; Yuan, William; Romero-Brufau, Santiago; Brat, Gabriel A.

NPJ Digit Med ; 7(1): 129, 2024 May 17.

Article in English | MEDLINE | ID: mdl-38760407

ABSTRACT

Few published data science tools are ever translated from academia to real-world clinical settings for which they were intended. One dimension of this problem is the software engineering task of turning published academic projects into tools that are usable at the bedside. Given the complexity of the data ecosystem in large health systems, this task often represents a significant barrier to the real-world deployment of data science tools for prospective piloting and evaluation. Many information technology companies have created Machine Learning Operations (MLOps) teams to help with such tasks at scale, but the low penetration of home-grown data science tools in regular clinical practice precludes the formation of such teams in healthcare organizations. Based on experiences deploying data science tools at two large academic medical centers (Beth Israel Deaconess Medical Center, Boston, MA; Mayo Clinic, Rochester, MN), we propose a strategy to facilitate this transition from academic product to operational tool, defining the responsibilities of the principal investigator, data scientist, machine learning engineer, health system IT administrator, and clinician end-user throughout the process. We first enumerate the technical resources and stakeholders needed to prepare for model deployment. We then propose an approach to planning how the final product will work from data extraction and analysis to visualization of model outputs. Finally, we describe how the team should execute on this plan. We hope to guide health systems aiming to deploy minimum viable data science tools and realize their value in clinical practice.

2.

Neurological diagnoses in hospitalized COVID-19 patients associated with adverse outcomes: A multinational cohort study.

Hutch, Meghan R; Son, Jiyeon; Le, Trang T; Hong, Chuan; Wang, Xuan; Shakeri Hossein Abad, Zahra; Morris, Michele; Gutiérrez-Sacristán, Alba; Klann, Jeffrey G; Spiridou, Anastasia; Batugo, Ashley; Bellazzi, Riccardo; Benoit, Vincent; Bonzel, Clara-Lea; Bryant, William A; Chiudinelli, Lorenzo; Cho, Kelly; Das, Priyam; González González, Tomás; Hanauer, David A; Henderson, Darren W; Ho, Yuk-Lam; Loh, Ne Hooi Will; Makoudjou, Adeline; Makwana, Simran; Malovini, Alberto; Moal, Bertrand; Mowery, Danielle L; Neuraz, Antoine; Samayamuthu, Malarkodi Jebathilagam; Sanz Vidorreta, Fernando J; Schriver, Emily R; Schubert, Petra; Talbert, Jeffery; Tan, Amelia L M; Tan, Byorn W L; Tan, Bryce W Q; Tibollo, Valentina; Tippman, Patric; Verdy, Guillaume; Yuan, William; Avillach, Paul; Gehlenborg, Nils; Omenn, Gilbert S; Visweswaran, Shyam; Cai, Tianxi; Luo, Yuan; Xia, Zongqi.

PLOS Digit Health ; 3(4): e0000484, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38620037

ABSTRACT

Few studies examining the patient outcomes of concurrent neurological manifestations during acute COVID-19 leveraged multinational cohorts of adults and children or distinguished between central and peripheral nervous system (CNS vs. PNS) involvement. Using a federated multinational network in which local clinicians and informatics experts curated the electronic health records data, we evaluated the risk of prolonged hospitalization and mortality in hospitalized COVID-19 patients from 21 healthcare systems across 7 countries. For adults, we used a federated learning approach whereby we ran Cox proportional hazard models locally at each healthcare system and performed a meta-analysis on the aggregated results to estimate the overall risk of adverse outcomes across our geographically diverse populations. For children, we reported descriptive statistics separately due to their low frequency of neurological involvement and poor outcomes. Among the 106,229 hospitalized COVID-19 patients (104,031 patients ≥18 years; 2,198 patients <18 years, January 2020-October 2021), 15,101 (14%) had at least one CNS diagnosis, while 2,788 (3%) had at least one PNS diagnosis. After controlling for demographics and pre-existing conditions, adults with CNS involvement had a longer hospital stay (11 versus 6 days), a greater risk of death (hazard ratio = 1.78), and a faster time to death (12 versus 24 days) than patients with no neurological condition (NNC) during acute COVID-19 hospitalization. Adults with PNS involvement also had a longer hospital stay but a lower risk of mortality than the NNC group. Although children had a low frequency of neurological involvement during COVID-19 hospitalization, a substantially higher proportion of children with CNS involvement died compared to those with NNC (6% vs 1%). Overall, patients with concurrent CNS manifestation during acute COVID-19 hospitalization faced greater risks for adverse clinical outcomes than patients without any neurological diagnosis. 
Our global informatics framework using a federated approach (versus a centralized data collection approach) has utility for clinical discovery beyond COVID-19.
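
Note: the federated step described in this abstract (fit a Cox model at each site, then meta-analyze the aggregated results) can be illustrated with a minimal sketch. The abstract does not say which meta-analysis model was used, so the fixed-effect inverse-variance pooling below is an assumption, and the function name and the example inputs are hypothetical rather than taken from the study.

```python
import math

def pool_hazard_ratios(site_estimates):
    """Fixed-effect inverse-variance pooling of per-site hazard ratios.

    site_estimates: list of (hazard_ratio, ci_lower, ci_upper) tuples,
    where the interval is a 95% confidence interval reported by each site.
    Returns (pooled_hr, pooled_ci_lower, pooled_ci_upper).
    """
    z = 1.959963984540054  # 97.5th percentile of the standard normal
    weighted_sum = 0.0
    total_weight = 0.0
    for hr, lower, upper in site_estimates:
        log_hr = math.log(hr)
        # Recover the standard error of log(HR) from the width of the CI.
        se = (math.log(upper) - math.log(lower)) / (2 * z)
        weight = 1.0 / se ** 2
        weighted_sum += weight * log_hr
        total_weight += weight
    pooled_log_hr = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log_hr),
        math.exp(pooled_log_hr - z * pooled_se),
        math.exp(pooled_log_hr + z * pooled_se),
    )

# Hypothetical site-level results; only these summaries leave each site.
example_sites = [(2.0, 1.0, 4.0), (2.0, 1.0, 4.0)]
pooled_hr, pooled_lower, pooled_upper = pool_hazard_ratios(example_sites)
```

Only the per-site estimates and their intervals are shared in this sketch, which is the property that lets patient-level records stay at each healthcare system.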

3.

Potential pitfalls in the use of real-world data for studying long COVID.

Zhang, Harrison G; Honerlaw, Jacqueline P; Maripuri, Monika; Samayamuthu, Malarkodi Jebathilagam; Beaulieu-Jones, Brendin R; Baig, Huma S; L'Yi, Sehi; Ho, Yuk-Lam; Morris, Michele; Panickan, Vidul Ayakulangara; Wang, Xuan; Weber, Griffin M; Liao, Katherine P; Visweswaran, Shyam; Tan, Bryce W Q; Yuan, William; Gehlenborg, Nils; Muralidhar, Sumitra; Ramoni, Rachel B; Kohane, Isaac S; Xia, Zongqi; Cho, Kelly; Cai, Tianxi; Brat, Gabriel A.

Nat Med ; 29(5): 1040-1043, 2023 05.

Article in English | MEDLINE | ID: mdl-37055567

Subject(s)

COVID-19 , Humans , Post-Acute COVID-19 Syndrome , Research

4.

Informative missingness: What can we learn from patterns in missing laboratory data in the electronic health record?

Tan, Amelia L M; Getzen, Emily J; Hutch, Meghan R; Strasser, Zachary H; Gutiérrez-Sacristán, Alba; Le, Trang T; Dagliati, Arianna; Morris, Michele; Hanauer, David A; Moal, Bertrand; Bonzel, Clara-Lea; Yuan, William; Chiudinelli, Lorenzo; Das, Priam; Zhang, Harrison G; Aronow, Bruce J; Avillach, Paul; Brat, Gabriel A; Cai, Tianxi; Hong, Chuan; La Cava, William G; Hooi Will Loh, He; Luo, Yuan; Murphy, Shawn N; Yuan Hgiam, Kee; Omenn, Gilbert S; Patel, Lav P; Jebathilagam Samayamuthu, Malarkodi; Shriver, Emily R; Shakeri Hossein Abad, Zahra; Tan, Byorn W L; Visweswaran, Shyam; Wang, Xuan; Weber, Griffin M; Xia, Zongqi; Verdy, Bertrand; Long, Qi; Mowery, Danielle L; Holmes, John H.

J Biomed Inform ; 139: 104306, 2023 03.

Article in English | MEDLINE | ID: mdl-36738870

ABSTRACT

BACKGROUND: In electronic health records, patterns of missing laboratory test results could capture patients' course of disease as well as reflect clinicians' concerns or worries for possible conditions. These patterns are often understudied and overlooked. This study aims to identify informative patterns of missingness among laboratory data collected across 15 healthcare system sites in three countries for COVID-19 inpatients. METHODS: We collected and analyzed demographic, diagnosis, and laboratory data for 69,939 patients with positive COVID-19 PCR tests across three countries from 1 January 2020 through 30 September 2021. We analyzed missing laboratory measurements across sites, missingness stratification by demographic variables, temporal trends of missingness, correlations between labs based on missingness indicators over time, and clustering of groups of labs based on their missingness/ordering pattern. RESULTS: With these analyses, we identified mapping issues faced in seven out of 15 sites. We also identified nuances in data collection and variable definition for the various sites. Temporal trend analyses may support the use of laboratory test result missingness patterns in identifying severe COVID-19 patients. Lastly, using missingness patterns, we determined relationships between various labs that reflect clinical behaviors. CONCLUSION: In this work, we use computational approaches to relate missingness patterns to hospital treatment capacity and highlight the heterogeneity of looking at COVID-19 over time and at multiple sites, where there might be different phases, policies, etc. Changes in missingness could suggest a change in a patient's condition, and patterns of missingness among laboratory measurements could potentially identify clinical outcomes. This allows sites to consider missing data as informative to analyses and helps researchers identify which sites are better poised to study particular questions.
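
Note: the "correlations between labs based on missingness indicators" mentioned in the METHODS can be sketched as follows. This is an illustrative example only: the lab names, the toy values, and the choice of Pearson correlation are assumptions, not details reported in the abstract.

```python
import math

def missingness_indicator(values):
    """Map each lab value to 1 if missing (None) and 0 if observed."""
    return [1 if v is None else 0 for v in values]

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Undefined when a lab is always missing or never missing.
        return float("nan")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy data: one entry per patient-day; None marks no result.
labs = {
    "crp": [5.1, None, None, 7.2, None, 3.3],
    "ferritin": [310, None, None, 450, None, None],
    "d_dimer": [None, 0.8, 1.1, None, 0.9, 1.4],
}
indicators = {name: missingness_indicator(vals) for name, vals in labs.items()}

# Labs that tend to be ordered together show positively correlated missingness.
crp_ferritin = pearson(indicators["crp"], indicators["ferritin"])
# Labs ordered on different days show negatively correlated missingness.
crp_d_dimer = pearson(indicators["crp"], indicators["d_dimer"])
```

The resulting pairwise correlations could then feed a clustering step to group labs by ordering pattern, in the spirit of the clustering the abstract describes.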

Subject(s)

COVID-19 , Electronic Health Records , Humans , Data Collection , Records , Cluster Analysis

5.

Quantifying the Prognostic Value of Preoperative Surgeon Intuition: Comparing Surgeon Intuition and Clinical Risk Prediction as Derived from the American College of Surgeons NSQIP Risk Calculator.

Marwaha, Jayson S; Beaulieu-Jones, Brendin R; Berrigan, Margaret; Yuan, William; Odom, Stephen R; Cook, Charles H; Scott, Benjamin B; Gupta, Alok; Parsons, Charles S; Seshadri, Anupamaa J; Brat, Gabriel A.

J Am Coll Surg ; 236(6): 1093-1103, 2023 06 01.

Article in English | MEDLINE | ID: mdl-36815715

ABSTRACT

BACKGROUND: Surgical risk prediction models traditionally use patient attributes and measures of physiology to generate predictions about postoperative outcomes. However, the surgeon's assessment of the patient may be a valuable predictor, given the surgeon's ability to detect and incorporate factors that existing models cannot capture. We compare the predictive utility of surgeon intuition and a risk calculator derived from the American College of Surgeons (ACS) NSQIP. STUDY DESIGN: From January 10, 2021 to January 9, 2022, surgeons were surveyed immediately before performing surgery to assess their perception of a patient's risk of developing any postoperative complication. Clinical data were abstracted from ACS NSQIP. Both sources of data were independently used to build models to predict the likelihood of a patient experiencing any 30-day postoperative complication as defined by ACS NSQIP. RESULTS: Preoperative surgeon assessment was obtained for 216 patients. NSQIP data were available for 9,182 patients who underwent general surgery (January 1, 2017 to January 9, 2022). A binomial regression model trained on clinical data alone had an area under the receiver operating characteristic curve (AUC) of 0.83 (95% CI 0.80 to 0.85) in predicting any complication. A model trained on only preoperative surgeon intuition had an AUC of 0.70 (95% CI 0.63 to 0.78). A model trained on surgeon intuition and a subset of clinical predictors had an AUC of 0.83 (95% CI 0.77 to 0.89). CONCLUSIONS: Preoperative surgeon intuition alone is an independent predictor of patient outcomes; however, a risk calculator derived from ACS NSQIP is a more robust predictor of postoperative complication. Combining intuition and clinical data did not strengthen prediction.
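
Note: the AUC values reported above summarize how well each model ranks patients who had a complication above those who did not. A minimal sketch of that quantity, computed directly as a pairwise ranking probability, is given below. The function name and the toy inputs are hypothetical, and this is not the authors' evaluation code.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case; ties count as half.
    scores: predicted risks; labels: 1 for complication, 0 for none.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical predicted risks and observed 30-day complication labels.
example_auc = auc([0.2, 0.6, 0.4, 0.8], [0, 1, 1, 0])
```

An AUC of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking, which is the scale on which the reported 0.70 and 0.83 should be read. The pairwise loop is quadratic in the number of cases, so a rank-based implementation would be preferable for cohorts the size of the NSQIP data set.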

Subject(s)

Intuition , Surgeons , Humans , United States , Prognosis , Risk Assessment , Postoperative Complications/epidemiology , Postoperative Complications/etiology , Postoperative Complications/diagnosis , Risk Factors , Retrospective Studies , Quality Improvement

6.

Acute respiratory distress syndrome after SARS-CoV-2 infection on young adult population: International observational federated study based on electronic health records through the 4CE consortium.

Moal, Bertrand; Orieux, Arthur; Ferté, Thomas; Neuraz, Antoine; Brat, Gabriel A; Avillach, Paul; Bonzel, Clara-Lea; Cai, Tianxi; Cho, Kelly; Cossin, Sébastien; Griffier, Romain; Hanauer, David A; Haverkamp, Christian; Ho, Yuk-Lam; Hong, Chuan; Hutch, Meghan R; Klann, Jeffrey G; Le, Trang T; Loh, Ne Hooi Will; Luo, Yuan; Makoudjou, Adeline; Morris, Michele; Mowery, Danielle L; Olson, Karen L; Patel, Lav P; Samayamuthu, Malarkodi J; Sanz Vidorreta, Fernando J; Schriver, Emily R; Schubert, Petra; Verdy, Guillaume; Visweswaran, Shyam; Wang, Xuan; Weber, Griffin M; Xia, Zongqi; Yuan, William; Zhang, Harrison G; Zöller, Daniela; Kohane, Isaac S; Boyer, Alexandre; Jouhet, Vianney.

PLoS One ; 18(1): e0266985, 2023.

Article in English | MEDLINE | ID: mdl-36598895

ABSTRACT

PURPOSE: In young adults (18 to 49 years old), investigation of the acute respiratory distress syndrome (ARDS) after severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection has been limited. We evaluated the risk factors and outcomes of ARDS following infection with SARS-CoV-2 in a young adult population. METHODS: A retrospective cohort study was conducted between January 1st, 2020 and February 28th, 2021 using patient-level electronic health records (EHR), across 241 United States hospitals and 43 European hospitals participating in the Consortium for Clinical Characterization of COVID-19 by EHR (4CE). To identify the risk factors associated with ARDS, we compared young patients with and without ARDS through a federated analysis. We further compared the outcomes between young and old patients with ARDS. RESULTS: Among the 75,377 hospitalized patients with positive SARS-CoV-2 PCR, 1001 young adults presented with ARDS (7.8% of young hospitalized adults). Their mortality rate at 90 days was 16.2% and they presented with a complication rate for infection similar to that of older adults with ARDS. Peptic ulcer disease, paralysis, obesity, congestive heart failure, valvular disease, diabetes, chronic pulmonary disease and liver disease were associated with a higher risk of ARDS. We described a high prevalence of obesity (53%), hypertension (38%, although not significantly associated with ARDS), and diabetes (32%). CONCLUSION: Through an innovative method, a large international cohort of young adults developing ARDS after SARS-CoV-2 infection has been gathered. It demonstrated the poor outcomes of this population and the associated risk factors.

Subject(s)

COVID-19 , Respiratory Distress Syndrome , Humans , Young Adult , Aged , Adolescent , Adult , Middle Aged , COVID-19/complications , COVID-19/epidemiology , SARS-CoV-2 , Cohort Studies , Retrospective Studies , Electronic Health Records , Respiratory Distress Syndrome/etiology , Respiratory Distress Syndrome/complications , Obesity/complications

7.

Trends in Medical Management of Moderately to Severely Active Ulcerative Colitis: A Nationwide Retrospective Analysis.

Yuan, William; Marwaha, Jayson S; Rakowsky, Shana T; Palmer, Nathan P; Kohane, Isaac S; Rubin, David T; Brat, Gabriel A; Feuerstein, Joseph D.

Inflamm Bowel Dis ; 29(5): 695-704, 2023 05 02.

Article in English | MEDLINE | ID: mdl-35786768

ABSTRACT

BACKGROUND: With an increasing number of therapeutic options available for the management of ulcerative colitis (UC), the variability in treatment and prescribing patterns is not well known. While recent guidelines have provided updates on how these therapeutic options should be used, patterns of long-term use of these drugs over the past 2 decades remain unclear. METHODS: We analyzed a retrospective, nationwide cohort of more than 1.7 million prescriptions for trends in prescribing behaviors and to evaluate practices suggested in guidelines relating to ordering biologics, step-up therapy, and combination therapy. The primary outcome was 30-day steroid-free remission and secondary outcomes included hospitalization, cost, and additional steroid usage. A pipeline was created to identify cohorts of patients under active UC medical management grouped by prescribing strategies to evaluate comparative outcomes between strategies. Cox proportional hazards and multivariate regression models were utilized to assess postexposure outcomes and adjust for confounders. RESULTS: Among 6 major drug categories, we noted major baseline differences in patient characteristics at first exposure corresponding to disease activity. We noted earlier use of biologics in patient trajectories (762 days earlier relative to UC diagnosis, 2018 vs 2008; P < .001) and greater overall use of biologics over time (2.53× more in 2018 vs 2008; P < .00001). Among biologic-naive patients, adalimumab was associated with slightly lower rates of remission compared with infliximab or vedolizumab (odds ratio, 0.92; P < .005). Comparisons of patients with early biologic initiation to patients who transitioned to biologics from 5-aminosalicylic acid suggest lower steroid consumption for early biologic initiation (-761 mg prednisone; P < .001). Combination thiopurine-biologic therapy was associated with higher odds of remission compared with biologic monotherapy (odds ratio, 1.36; P = .01). 
CONCLUSIONS: As biologic drugs have become increasingly available for UC management, they have increasingly been used at earlier stages of disease management. Large-scale analyses of prescribing behaviors provide evidence supporting early use of biologics compared with step-up therapy and use of thiopurine and biologic combination therapy.

Population-scale analysis reveals patterns in prescribing trends for ulcerative colitis management. Findings include (1) earlier use of biologics in patient trajectories, (2) associations of step-up therapy with higher corticosteroid exposure, and (3) association of combination therapy with positive patient outcomes.

Subject(s)

Biological Products , Colitis, Ulcerative , Humans , Colitis, Ulcerative/drug therapy , Retrospective Studies , Infliximab/therapeutic use , Adalimumab/therapeutic use , Biological Factors/therapeutic use , Immunologic Factors/therapeutic use , Biological Products/therapeutic use

8.

Long-term kidney function recovery and mortality after COVID-19-associated acute kidney injury: An international multi-centre observational cohort study.

Tan, Byorn W L; Tan, Bryce W Q; Tan, Amelia L M; Schriver, Emily R; Gutiérrez-Sacristán, Alba; Das, Priyam; Yuan, William; Hutch, Meghan R; García Barrio, Noelia; Pedrera Jimenez, Miguel; Abu-El-Rub, Noor; Morris, Michele; Moal, Bertrand; Verdy, Guillaume; Cho, Kelly; Ho, Yuk-Lam; Patel, Lav P; Dagliati, Arianna; Neuraz, Antoine; Klann, Jeffrey G; South, Andrew M; Visweswaran, Shyam; Hanauer, David A; Maidlow, Sarah E; Liu, Mei; Mowery, Danielle L; Batugo, Ashley; Makoudjou, Adeline; Tippmann, Patric; Zöller, Daniela; Brat, Gabriel A; Luo, Yuan; Avillach, Paul; Bellazzi, Riccardo; Chiovato, Luca; Malovini, Alberto; Tibollo, Valentina; Samayamuthu, Malarkodi Jebathilagam; Serrano Balazote, Pablo; Xia, Zongqi; Loh, Ne Hooi Will; Chiudinelli, Lorenzo; Bonzel, Clara-Lea; Hong, Chuan; Zhang, Harrison G; Weber, Griffin M; Kohane, Isaac S; Cai, Tianxi; Omenn, Gilbert S; Holmes, John H.

EClinicalMedicine ; 55: 101724, 2023 Jan.

Article in English | MEDLINE | ID: mdl-36381999

ABSTRACT

Background: While acute kidney injury (AKI) is a common complication in COVID-19, data on post-AKI kidney function recovery and the clinical factors associated with poor kidney function recovery is lacking. Methods: A retrospective multi-centre observational cohort study comprising 12,891 hospitalized patients aged 18 years or older with a diagnosis of SARS-CoV-2 infection confirmed by polymerase chain reaction from 1 January 2020 to 10 September 2020, and with at least one serum creatinine value 1-365 days prior to admission. Mortality and serum creatinine values were obtained up to 10 September 2021. Findings: Advanced age (HR 2.77, 95%CI 2.53-3.04, p < 0.0001), severe COVID-19 (HR 2.91, 95%CI 2.03-4.17, p < 0.0001), severe AKI (KDIGO stage 3: HR 4.22, 95%CI 3.55-5.00, p < 0.0001), and ischemic heart disease (HR 1.26, 95%CI 1.14-1.39, p < 0.0001) were associated with worse mortality outcomes. AKI severity (KDIGO stage 3: HR 0.41, 95%CI 0.37-0.46, p < 0.0001) was associated with worse kidney function recovery, whereas remdesivir use (HR 1.34, 95%CI 1.17-1.54, p < 0.0001) was associated with better kidney function recovery. In a subset of patients without chronic kidney disease, advanced age (HR 1.38, 95%CI 1.20-1.58, p < 0.0001), male sex (HR 1.67, 95%CI 1.45-1.93, p < 0.0001), severe AKI (KDIGO stage 3: HR 11.68, 95%CI 9.80-13.91, p < 0.0001), and hypertension (HR 1.22, 95%CI 1.10-1.36, p = 0.0002) were associated with post-AKI kidney function impairment. Furthermore, patients with COVID-19-associated AKI had significant and persistent elevations of baseline serum creatinine 125% or more at 180 days (RR 1.49, 95%CI 1.32-1.67) and 365 days (RR 1.54, 95%CI 1.21-1.96) compared to COVID-19 patients with no AKI. Interpretation: COVID-19-associated AKI was associated with higher mortality, and severe COVID-19-associated AKI was associated with worse long-term post-AKI kidney function recovery. 
Funding: Authors are supported by various funders, with full details stated in the acknowledgement section.

9.

Changes in laboratory value improvement and mortality rates over the course of the pandemic: an international retrospective cohort study of hospitalised patients infected with SARS-CoV-2.

Hong, Chuan; Zhang, Harrison G; L'Yi, Sehi; Weber, Griffin; Avillach, Paul; Tan, Bryce W Q; Gutiérrez-Sacristán, Alba; Bonzel, Clara-Lea; Palmer, Nathan P; Malovini, Alberto; Tibollo, Valentina; Luo, Yuan; Hutch, Meghan R; Liu, Molei; Bourgeois, Florence; Bellazzi, Riccardo; Chiovato, Luca; Sanz Vidorreta, Fernando J; Le, Trang T; Wang, Xuan; Yuan, William; Neuraz, Antoine; Benoit, Vincent; Moal, Bertrand; Morris, Michele; Hanauer, David A; Maidlow, Sarah; Wagholikar, Kavishwar; Murphy, Shawn; Estiri, Hossein; Makoudjou, Adeline; Tippmann, Patric; Klann, Jeffery; Follett, Robert W; Gehlenborg, Nils; Omenn, Gilbert S; Xia, Zongqi; Dagliati, Arianna; Visweswaran, Shyam; Patel, Lav P; Mowery, Danielle L; Schriver, Emily R; Samayamuthu, Malarkodi Jebathilagam; Kavuluru, Ramakanth; Lozano-Zahonero, Sara; Zöller, Daniela; Tan, Amelia L M; Tan, Byorn W L; Ngiam, Kee Yuan; Holmes, John H.

BMJ Open ; 12(6): e057725, 2022 06 23.

Article in English | MEDLINE | ID: mdl-35738646

ABSTRACT

OBJECTIVE: To assess changes in international mortality rates and laboratory recovery rates during hospitalisation for patients hospitalised with SARS-CoV-2 between the first wave (1 March to 30 June 2020) and the second wave (1 July 2020 to 31 January 2021) of the COVID-19 pandemic. DESIGN, SETTING AND PARTICIPANTS: This is a retrospective cohort study of 83 178 hospitalised patients admitted between 7 days before or 14 days after PCR-confirmed SARS-CoV-2 infection within the Consortium for Clinical Characterization of COVID-19 by Electronic Health Record, an international multihealthcare system collaborative of 288 hospitals in the USA and Europe. The laboratory recovery rates and mortality rates over time were compared between the two waves of the pandemic. PRIMARY AND SECONDARY OUTCOME MEASURES: The primary outcome was all-cause mortality rate within 28 days after hospitalisation stratified by predicted low, medium and high mortality risk at baseline. The secondary outcome was the average rate of change in laboratory values during the first week of hospitalisation. RESULTS: Baseline Charlson Comorbidity Index and laboratory values at admission were not significantly different between the first and second waves. The improvement in laboratory values over time was faster in the second wave compared with the first. The average C reactive protein rate of change was -4.72 mg/dL vs -4.14 mg/dL per day (p=0.05). The mortality rates within each risk category significantly decreased over time, with the most substantial decrease in the high-risk group (42.3% in March-April 2020 vs 30.8% in November 2020 to January 2021, p<0.001) and a moderate decrease in the intermediate-risk group (21.5% in March-April 2020 vs 14.3% in November 2020 to January 2021, p<0.001). 
CONCLUSIONS: Admission profiles of patients hospitalised with SARS-CoV-2 infection did not differ greatly between the first and second waves of the pandemic, but there were notable differences in laboratory improvement rates during hospitalisation. Mortality risks among patients with similar risk profiles decreased over the course of the pandemic. The improvement in laboratory values and mortality risk was consistent across multiple countries.

Subject(s)

COVID-19 , Pandemics , Hospitalization , Humans , Retrospective Studies , SARS-CoV-2

10.

Distinguishing Admissions Specifically for COVID-19 From Incidental SARS-CoV-2 Admissions: National Retrospective Electronic Health Record Study.

Klann, Jeffrey G; Strasser, Zachary H; Hutch, Meghan R; Kennedy, Chris J; Marwaha, Jayson S; Morris, Michele; Samayamuthu, Malarkodi Jebathilagam; Pfaff, Ashley C; Estiri, Hossein; South, Andrew M; Weber, Griffin M; Yuan, William; Avillach, Paul; Wagholikar, Kavishwar B; Luo, Yuan; Omenn, Gilbert S; Visweswaran, Shyam; Holmes, John H; Xia, Zongqi; Brat, Gabriel A; Murphy, Shawn N.

J Med Internet Res ; 24(5): e37931, 2022 05 18.

Article in English | MEDLINE | ID: mdl-35476727

ABSTRACT

BACKGROUND: Admissions are generally classified as COVID-19 hospitalizations if the patient has a positive SARS-CoV-2 polymerase chain reaction (PCR) test. However, because 35% of SARS-CoV-2 infections are asymptomatic, patients admitted for unrelated indications with an incidentally positive test could be misclassified as a COVID-19 hospitalization. Electronic health record (EHR)-based studies have been unable to distinguish between a hospitalization specifically for COVID-19 versus an incidental SARS-CoV-2 hospitalization. Although the need to improve classification of COVID-19 versus incidental SARS-CoV-2 is well understood, the magnitude of the problems has only been characterized in small, single-center studies. Furthermore, there have been no peer-reviewed studies evaluating methods for improving classification. OBJECTIVE: The aims of this study are to, first, quantify the frequency of incidental hospitalizations over the first 15 months of the pandemic in multiple hospital systems in the United States and, second, to apply electronic phenotyping techniques to automatically improve COVID-19 hospitalization classification. METHODS: From a retrospective EHR-based cohort in 4 US health care systems in Massachusetts, Pennsylvania, and Illinois, a random sample of 1123 SARS-CoV-2 PCR-positive patients hospitalized from March 2020 to August 2021 was manually chart-reviewed and classified as "admitted with COVID-19" (incidental) versus specifically admitted for COVID-19 ("for COVID-19"). EHR-based phenotyping was used to find feature sets to filter out incidental admissions. RESULTS: EHR-based phenotyped feature sets filtered out incidental admissions, which occurred in an average of 26% of hospitalizations (although this varied widely over time, from 0% to 75%). The top site-specific feature sets had 79%-99% specificity with 62%-75% sensitivity, while the best-performing across-site feature sets had 71%-94% specificity with 69%-81% sensitivity. 
CONCLUSIONS: A large proportion of SARS-CoV-2 PCR-positive admissions were incidental. Straightforward EHR-based phenotypes differentiated admissions, which is important to assure accurate public health reporting and research.
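
Note: the sensitivity and specificity figures above compare a phenotype's output against the chart-review label. A minimal sketch of that comparison follows, treating "for COVID-19" as the positive class. That choice of positive class, the function name, and the example labels are assumptions made for illustration; the abstract does not state them.

```python
def sensitivity_specificity(predicted, actual):
    """Compare phenotype predictions with chart-review labels.

    predicted, actual: equal-length lists of booleans, where True means
    the admission was specifically for COVID-19 and False means incidental.
    Returns (sensitivity, specificity).
    """
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case of each chart-review class")
    sensitivity = tp / (tp + fn)  # share of true for-COVID admissions kept
    specificity = tn / (tn + fp)  # share of incidental admissions filtered out
    return sensitivity, specificity

# Hypothetical phenotype output and chart-review labels for five admissions.
phenotype = [True, True, False, False, True]
chart_review = [True, False, False, False, True]
sens, spec = sensitivity_specificity(phenotype, chart_review)
```

Under this convention, high specificity means few incidental admissions are counted as COVID-19 hospitalizations, which is the reporting error the study aims to reduce.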

Subject(s)

COVID-19 , SARS-CoV-2 , COVID-19/diagnosis , COVID-19/epidemiology , Electronic Health Records , Hospitalization , Humans , Retrospective Studies

11.

Distinguishing Admissions Specifically for COVID-19 from Incidental SARS-CoV-2 Admissions: A National EHR Research Consortium Study.

Klann, Jeffrey G; Strasser, Zachary H; Hutch, Meghan R; Kennedy, Chris J; Marwaha, Jayson S; Morris, Michele; Samayamuthu, Malarkodi Jebathilagam; Pfaff, Ashley C; Estiri, Hossein; South, Andrew M; Weber, Griffin M; Yuan, William; Avillach, Paul; Wagholikar, Kavishwar B; Luo, Yuan; Omenn, Gilbert S; Visweswaran, Shyam; Holmes, John H; Xia, Zongqi; Brat, Gabriel A; Murphy, Shawn N.

medRxiv ; 2022 Feb 18.

Article in English | MEDLINE | ID: mdl-35350202

ABSTRACT

Admissions are generally classified as COVID-19 hospitalizations if the patient has a positive SARS-CoV-2 polymerase chain reaction (PCR) test. However, because 35% of SARS-CoV-2 infections are asymptomatic, patients admitted for unrelated indications with an incidentally positive test could be misclassified as a COVID-19 hospitalization. EHR-based studies have been unable to distinguish between a hospitalization specifically for COVID-19 versus an incidental SARS-CoV-2 hospitalization. From a retrospective EHR-based cohort in four US healthcare systems, a random sample of 1,123 SARS-CoV-2 PCR-positive patients hospitalized between 3/2020-8/2021 was manually chart-reviewed and classified as admitted-with-COVID-19 (incidental) vs. specifically admitted for COVID-19 (for-COVID-19). EHR-based phenotyped feature sets filtered out incidental admissions, which occurred in 26%. The top site-specific feature sets had 79-99% specificity with 62-75% sensitivity, while the best performing across-site feature set had 71-94% specificity with 69-81% sensitivity. A large proportion of SARS-CoV-2 PCR-positive admissions were incidental. Straightforward EHR-based phenotypes differentiated admissions, which is important to assure accurate public health reporting and research.

12.

Authorship Correction: International Changes in COVID-19 Clinical Trajectories Across 315 Hospitals and 6 Countries: Retrospective Cohort Study.

Weber, Griffin M; Zhang, Harrison G; L'Yi, Sehi; Bonzel, Clara-Lea; Hong, Chuan; Avillach, Paul; Gutiérrez-Sacristán, Alba; Palmer, Nathan P; Tan, Amelia Li Min; Wang, Xuan; Yuan, William; Gehlenborg, Nils; Alloni, Anna; Amendola, Danilo F; Bellasi, Antonio; Bellazzi, Riccardo; Beraghi, Michele; Bucalo, Mauro; Chiovato, Luca; Cho, Kelly; Dagliati, Arianna; Estiri, Hossein; Follett, Robert W; García Barrio, Noelia; Hanauer, David A; Henderson, Darren W; Ho, Yuk-Lam; Holmes, John H; Hutch, Meghan R; Kavuluru, Ramakanth; Kirchoff, Katie; Klann, Jeffrey G; Krishnamurthy, Ashok K; Le, Trang T; Liu, Molei; Loh, Ne Hooi Will; Lozano-Zahonero, Sara; Luo, Yuan; Maidlow, Sarah; Makoudjou, Adeline; Malovini, Alberto; Martins, Marcelo Roberto; Moal, Bertrand; Morris, Michele; Mowery, Danielle L; Murphy, Shawn N; Neuraz, Antoine; Ngiam, Kee Yuan; Okoshi, Marina P; Omenn, Gilbert S.

J Med Internet Res ; 23(11): e34625, 2021 Nov 30.

Article in English | MEDLINE | ID: mdl-34889759

ABSTRACT

[This corrects the article DOI: 10.2196/31400.].

13.

Integrative multiomics-histopathology analysis for breast cancer classification.

Ektefaie, Yasha; Yuan, William; Dillon, Deborah A; Lin, Nancy U; Golden, Jeffrey A; Kohane, Isaac S; Yu, Kun-Hsing.

NPJ Breast Cancer ; 7(1): 147, 2021 Nov 29.

Article in English | MEDLINE | ID: mdl-34845230

ABSTRACT

Histopathologic evaluation of biopsy slides is a critical step in diagnosing and subtyping breast cancers. However, the connections between histology and multi-omics status have never been systematically explored or interpreted. We developed weakly supervised deep learning models over hematoxylin-and-eosin-stained slides to examine the relations between visual morphological signal, clinical subtyping, gene expression, and mutation status in breast cancer. We first designed fully automated models for tumor detection and pathology subtype classification, with the results validated in independent cohorts (area under the receiver operating characteristic curve ≥ 0.950). Using only visual information, our models achieved strong predictive performance in estrogen/progesterone/HER2 receptor status, PAM50 status, and TP53 mutation status. We demonstrated that these models learned lymphocyte-specific morphological signals to identify estrogen receptor status. Examination of the PAM50 cohort revealed a subset of PAM50 genes whose expression reflects cancer morphology. This work demonstrates the utility of deep learning-based image models in both clinical and research regimes, through its ability to uncover connections between visual morphology and genetic statuses.

14.

International Changes in COVID-19 Clinical Trajectories Across 315 Hospitals and 6 Countries: Retrospective Cohort Study.

Weber, Griffin M; Zhang, Harrison G; L'Yi, Sehi; Bonzel, Clara-Lea; Hong, Chuan; Avillach, Paul; Gutiérrez-Sacristán, Alba; Palmer, Nathan P; Tan, Amelia Li Min; Wang, Xuan; Yuan, William; Gehlenborg, Nils; Alloni, Anna; Amendola, Danilo F; Bellasi, Antonio; Bellazzi, Riccardo; Beraghi, Michele; Bucalo, Mauro; Chiovato, Luca; Cho, Kelly; Dagliati, Arianna; Estiri, Hossein; Follett, Robert W; García Barrio, Noelia; Hanauer, David A; Henderson, Darren W; Ho, Yuk-Lam; Holmes, John H; Hutch, Meghan R; Kavuluru, Ramakanth; Kirchoff, Katie; Klann, Jeffrey G; Krishnamurthy, Ashok K; Le, Trang T; Liu, Molei; Loh, Ne Hooi Will; Lozano-Zahonero, Sara; Luo, Yuan; Maidlow, Sarah; Makoudjou, Adeline; Malovini, Alberto; Martins, Marcelo Roberto; Moal, Bertrand; Morris, Michele; Mowery, Danielle L; Murphy, Shawn N; Neuraz, Antoine; Ngiam, Kee Yuan; Okoshi, Marina P; Omenn, Gilbert S.

J Med Internet Res ; 23(10): e31400, 2021 Oct 11.

Article in English | MEDLINE | ID: mdl-34533459

ABSTRACT

BACKGROUND: Many countries have experienced 2 predominant waves of COVID-19-related hospitalizations. Comparing the clinical trajectories of patients hospitalized in separate waves of the pandemic enables further understanding of the evolving epidemiology, pathophysiology, and health care dynamics of the COVID-19 pandemic. OBJECTIVE: In this retrospective cohort study, we analyzed electronic health record (EHR) data from patients with SARS-CoV-2 infections hospitalized in participating health care systems representing 315 hospitals across 6 countries. We compared hospitalization rates, severe COVID-19 risk, and mean laboratory values between patients hospitalized during the first and second waves of the pandemic. METHODS: Using a federated approach, each participating health care system extracted patient-level clinical data on their first and second wave cohorts and submitted aggregated data to the central site. Data quality control steps were adopted at the central site to correct for implausible values and harmonize units. Statistical analyses were performed by computing individual health care system effect sizes and synthesizing these using random effect meta-analyses to account for heterogeneity. We focused the laboratory analysis on C-reactive protein (CRP), ferritin, fibrinogen, procalcitonin, D-dimer, and creatinine based on their reported associations with severe COVID-19. RESULTS: Data were available for 79,613 patients, of which 32,467 were hospitalized in the first wave and 47,146 in the second wave. The prevalence of male patients and patients aged 50 to 69 years decreased significantly between the first and second waves. Patients hospitalized in the second wave had a 9.9% reduction in the risk of severe COVID-19 compared to patients hospitalized in the first wave (95% CI 8.5%-11.3%). Demographic subgroup analyses indicated that patients aged 26 to 49 years and 50 to 69 years; male and female patients; and black patients had significantly lower risk for severe disease in the second wave than in the first wave. At admission, the mean values of CRP were significantly lower in the second wave than in the first wave. On the seventh hospital day, the mean values of CRP, ferritin, fibrinogen, and procalcitonin were significantly lower in the second wave than in the first wave. In general, countries exhibited variable changes in laboratory testing rates from the first to the second wave. At admission, there was a significantly higher testing rate for D-dimer in France, Germany, and Spain. CONCLUSIONS: Patients hospitalized in the second wave were at significantly lower risk for severe COVID-19. This corresponded to mean laboratory values in the second wave that were more likely to be in typical physiological ranges on the seventh hospital day compared to the first wave. Our federated approach demonstrated the feasibility and power of harmonizing heterogeneous EHR data from multiple international health care systems to rapidly conduct large-scale studies to characterize how COVID-19 clinical trajectories evolve.

Subject(s)

COVID-19 , Pandemics , Adult , Aged , Female , Hospitalization , Hospitals , Humans , Male , Middle Aged , Retrospective Studies , SARS-CoV-2
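
Illustrative note on the METHODS above: the abstract describes pooling per-health-system effect sizes with random-effects meta-analysis. The sketch below shows one common way such pooling can be done, the DerSimonian-Laird estimator. The abstract does not say which estimator the authors used, so the choice of estimator, the function name, and the inputs are all assumptions made for illustration.

```python
import math


def dersimonian_laird(effects, variances):
    """Pool per-site effect sizes with a DerSimonian-Laird random-effects model.

    effects   -- one effect size per site (hypothetical inputs)
    variances -- the within-site variance of each effect size
    Returns (pooled_effect, standard_error, tau_squared).
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two sites with matching variances")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw

    # Cochran's Q measures how much sites disagree beyond sampling error.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw

    # Between-site variance, truncated at zero.
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to every site's variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2
```

When sites agree, the between-site variance is estimated as zero and the result matches a fixed-effect pooled estimate. When they disagree, the weights even out across sites and the standard error widens, which is how heterogeneity is accounted for.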

15.

Illustrating potential effects of alternate control populations on real-world evidence-based statistical analyses.

Huang, Yidi; Yuan, William; Kohane, Isaac S; Beaulieu-Jones, Brett K.

JAMIA Open ; 4(2): ooab045, 2021 Apr.

Article in English | MEDLINE | ID: mdl-34142018

ABSTRACT

OBJECTIVE: Case-control study designs are commonly used in retrospective analyses of real-world evidence (RWE). Due to the increasingly wide availability of RWE, it can be difficult to determine whether findings are robust or the result of testing multiple hypotheses. MATERIALS AND METHODS: We investigate the potential effects of modifying cohort definitions in a case-control association study between depression and type 2 diabetes mellitus. We used a large (>75 million individuals) de-identified administrative claims database to observe the effects of minor changes to the requirements of glucose and hemoglobin A1c tests in the control group. RESULTS: We found that small permutations to the criteria used to define the control population result in significant shifts in both the demographic structure of the identified cohort as well as the odds ratio of association. These differences remain present when testing against age- and sex-matched controls. DISCUSSION: Analyses of RWE need to be carefully designed to avoid issues of multiple testing. Minor changes to control cohorts can lead to significantly different results and have the potential to alter even prospective studies through selection bias. CONCLUSION: We believe this work offers strong support for the need for robust guidelines, best practices, and regulations around the use of observational RWE for clinical or regulatory decision-making.
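
Illustrative note on the RESULTS above: the quantity the abstract reports as shifting with the control definition is the odds ratio from a case-control 2x2 table. The sketch below computes an odds ratio with a Woolf (log-scale) confidence interval. The function name, argument names, and any counts passed to it are hypothetical and are not taken from the study.

```python
import math


def odds_ratio(exposed_cases, unexposed_cases,
               exposed_controls, unexposed_controls, z=1.96):
    """Odds ratio and Woolf confidence interval from a 2x2 table.

    All four counts must be positive; z=1.96 gives an approximate 95% CI.
    Returns (odds_ratio, ci_lower, ci_upper).
    """
    cells = (exposed_cases, unexposed_cases,
             exposed_controls, unexposed_controls)
    if min(cells) <= 0:
        raise ValueError("all four cell counts must be positive")

    a, b, c, d = cells
    estimate = (a * d) / (b * c)

    # Standard error of log(OR) under the Woolf approximation.
    se_log = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper
```

Re-running such a calculation with the same cases but a control group selected under a slightly different rule changes only the control counts, yet the estimate can move noticeably. That is the sensitivity to control definitions the abstract describes.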

16.

Accelerating diagnosis of Parkinson's disease through risk prediction.

Yuan, William; Beaulieu-Jones, Brett; Krolewski, Richard; Palmer, Nathan; Veyrat-Follet, Christine; Frau, Francesca; Cohen, Caroline; Bozzi, Sylvie; Cogswell, Meaghan; Kumar, Dinesh; Coulouvrat, Catherine; Leroy, Bruno; Fischer, Tanya Z; Sardi, S Pablo; Chandross, Karen J; Rubin, Lee L; Wills, Anne-Marie; Kohane, Isaac; Lipnick, Scott L.

BMC Neurol ; 21(1): 201, 2021 May 18.

Article in English | MEDLINE | ID: mdl-34006233

ABSTRACT

BACKGROUND: Characterization of prediagnostic Parkinson's Disease (PD) and early prediction of subsequent development are critical for preventive interventions, risk stratification and understanding of disease pathology. This study aims to characterize the role of the prediagnostic period in PD and, using selected features from this period as novel interception points, construct a prediction model to accelerate the diagnosis in a real-world setting. METHODS: We constructed two sets of machine learning models: a retrospective approach highlighting exposures up to 5 years prior to PD diagnosis, and an alternative model that prospectively predicted future PD diagnosis from all individuals at their first diagnosis of a gait or tremor disorder, these being features that appeared to represent the initiation of a differential diagnostic window. RESULTS: We found many novel features captured by the retrospective models; however, the high accuracy was primarily driven by surrogate diagnoses for PD, such as gait and tremor disorders, suggesting the presence of a distinctive differential diagnostic period when the clinician already suspected PD. The model utilizing a gait/tremor diagnosis as the interception point achieved a validation AUC of 0.874 with potential time compression to a future PD diagnosis of more than 300 days. Comparisons of predictive diagnoses between the prospective and prediagnostic cohorts suggest the presence of distinctive trajectories of PD progression based on comorbidity profiles. CONCLUSIONS: Overall, our machine learning approach allows for both guiding clinical decisions, such as the initiation of neuroprotective interventions, and, importantly, the possibility of earlier diagnosis for clinical trials for disease modifying therapies.

Subject(s)

Parkinson Disease/diagnosis , Gait/physiology , Gait Analysis , Humans , Machine Learning , Retrospective Studies , Risk Assessment , Tremor

17.

Machine learning for patient risk stratification: standing on, or looking over, the shoulders of clinicians?

Beaulieu-Jones, Brett K; Yuan, William; Brat, Gabriel A; Beam, Andrew L; Weber, Griffin; Ruffin, Marshall; Kohane, Isaac S.

NPJ Digit Med ; 4(1): 62, 2021 Mar 30.

Article in English | MEDLINE | ID: mdl-33785839

ABSTRACT

Machine learning can help clinicians to make individualized patient predictions only if researchers demonstrate models that contribute novel insights, rather than learning the most likely next step in a set of actions a clinician will take. We trained deep learning models using only clinician-initiated, administrative data for 42.9 million admissions using three subsets of data: demographic data only, demographic data and information available at admission, and the previous data plus charges recorded during the first day of admission. Models trained on charges during the first day of admission achieve performance close to published full EMR-based benchmarks for inpatient outcomes: in-hospital mortality (0.89 AUC), prolonged length of stay (0.82 AUC), and 30-day readmission rate (0.71 AUC). Similar performance between models trained with only clinician-initiated data and those trained with full EMR data purporting to include information about patient state and physiology should raise concern in the deployment of these models. Furthermore, these models exhibited significant declines in performance when evaluated over only myocardial infarction (MI) patients relative to models trained over MI patients alone, highlighting the importance of physician diagnosis in the prognostic performance of these models. These results provide a benchmark for predictive accuracy trained only on prior clinical actions and indicate that models with similar performance may derive their signal by looking over clinicians' shoulders, using clinical behavior as the expression of preexisting intuition and suspicion to generate a prediction. For models to guide clinicians in individual decisions, performance exceeding these benchmarks is necessary.

18.

Temporal bias in case-control design: preventing reliable predictions of the future.

Yuan, William; Beaulieu-Jones, Brett K; Yu, Kun-Hsing; Lipnick, Scott L; Palmer, Nathan; Loscalzo, Joseph; Cai, Tianxi; Kohane, Isaac S.

Nat Commun ; 12(1): 1107, 2021 Feb 17.

Article in English | MEDLINE | ID: mdl-33597541

ABSTRACT

One of the primary tools that researchers use to predict risk is the case-control study. We identify a flaw, temporal bias, that is specific to and uniquely associated with these studies that occurs when the study period is not representative of the data that clinicians have during the diagnostic process. Temporal bias acts to undermine the validity of predictions by over-emphasizing features close to the outcome of interest. We examine the impact of temporal bias across the medical literature, and highlight examples of exaggerated effect sizes, false-negative predictions, and replication failure. Given the ubiquity and practical advantages of case-control studies, we discuss strategies for estimating the influence of and preventing temporal bias where it exists.

Subject(s)

Biomedical Research/standards , Clinical Trials as Topic/standards , Patient Selection , Research Design/standards , Bias , Biomedical Research/methods , Biomedical Research/trends , Case-Control Studies , Clinical Trials as Topic/methods , Forecasting , Humans , Reproducibility of Results

19.

Development of a colorimetric α-ketoglutarate detection assay for prolyl hydroxylase domain (PHD) proteins.

Wong, Samantha J; Ringel, Alison E; Yuan, William; Paulo, Joao A; Yoon, Haejin; Currie, Mark A; Haigis, Marcia C.

J Biol Chem ; 296: 100397, 2021.

Article in English | MEDLINE | ID: mdl-33571527

ABSTRACT

Since the discovery of the prolyl hydroxylase domain (PHD) proteins and their canonical hypoxia-inducible factor (HIF) substrate two decades ago, a number of in vitro hydroxylation (IVH) assays for PHD activity have been developed to measure the PHD-HIF interaction. However, most of these assays either require complex proteomics mass spectrometry methods that rely on the specific PHD-HIF interaction or require the handling of radioactive material, as seen in the most commonly used assay measuring [14C]O2 release from labeled [14C]α-ketoglutarate. Here, we report an alternative rapid, cost-effective assay in which the consumption of α-ketoglutarate is monitored by its derivatization with 2,4-dinitrophenylhydrazine (2,4-DNPH) followed by treatment with concentrated base. We extensively optimized this 2,4-DNPH α-ketoglutarate assay to maximize the signal-to-noise ratio and demonstrated that it is robust enough to obtain kinetic parameters of the well-characterized PHD2 isoform comparable with those in published literature. We further showed that it is also sensitive enough to detect and measure the IC50 values of pan-PHD inhibitors and several PHD2 inhibitors in clinical trials for chronic kidney disease (CKD)-induced anemia. Given the efficiency of this assay coupled with its multiwell format, the 2,4-DNPH α-KG assay may be adaptable to explore non-HIF substrates of PHDs and potentially to high-throughput assays.

Subject(s)

Colorimetry/methods , Hypoxia-Inducible Factor-Proline Dioxygenases/analysis , Ketoglutaric Acids/analysis , Phenylhydrazines/chemistry , Enzyme Assays/methods , Humans , Hydroxylation , Hypoxia-Inducible Factor 1, alpha Subunit/metabolism , Hypoxia-Inducible Factor-Proline Dioxygenases/metabolism , Ketoglutaric Acids/chemistry , Kinetics , Substrate Specificity

20.

Examining the Use of Real-World Evidence in the Regulatory Process.

Beaulieu-Jones, Brett K; Finlayson, Samuel G; Yuan, William; Altman, Russ B; Kohane, Isaac S; Prasad, Vinay; Yu, Kun-Hsing.

Clin Pharmacol Ther ; 107(4): 843-852, 2020 Apr.

Article in English | MEDLINE | ID: mdl-31562770

ABSTRACT

The 21st Century Cures Act passed by the United States Congress mandates the US Food and Drug Administration to develop guidance to evaluate the use of real-world evidence (RWE) to support the regulatory process. RWE has generated important medical discoveries, especially in areas where traditional clinical trials would be unethical or infeasible. However, RWE suffers from several issues that hinder its ability to provide proof of treatment efficacy at a level comparable to randomized controlled trials. In this review article, we summarized the advantages and limitations of RWE, identified the key opportunities for RWE, and pointed the way forward to maximize the potential of RWE for regulatory purposes.

Subject(s)

Clinical Trials as Topic/legislation & jurisprudence , Evidence-Based Medicine/legislation & jurisprudence , United States Food and Drug Administration/legislation & jurisprudence , Clinical Trials as Topic/methods , Clinical Trials as Topic/statistics & numerical data , Decision Making , Evidence-Based Medicine/methods , Evidence-Based Medicine/statistics & numerical data , Humans , United States
