De-black-boxing health AI: demonstrating reproducible machine learning computable phenotypes using the N3C-RECOVER Long COVID model in the All of Us data repository.
Pfaff, Emily R; Girvin, Andrew T; Crosskey, Miles; Gangireddy, Srushti; Master, Hiral; Wei, Wei-Qi; Kerchberger, V Eric; Weiner, Mark; Harris, Paul A; Basford, Melissa; Lunt, Chris; Chute, Christopher G; Moffitt, Richard A; Haendel, Melissa.
  • Pfaff ER; Department of Medicine, University of North Carolina at Chapel Hill School of Medicine, Chapel Hill, North Carolina, USA.
  • Girvin AT; Palantir Technologies, Denver, Colorado, USA.
  • Crosskey M; CoVar Applied Technologies, Durham, North Carolina, USA.
  • Gangireddy S; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA.
  • Master H; Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, Tennessee, USA.
  • Wei WQ; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA.
  • Kerchberger VE; Department of Medicine, Division of Allergy, Pulmonary & Critical Care Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA.
  • Weiner M; Department of Medicine, Weill Cornell Medicine, New York, USA.
  • Harris PA; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA.
  • Basford M; Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, Tennessee, USA.
  • Lunt C; National Institutes of Health, Bethesda, Maryland, USA.
  • Chute CG; Johns Hopkins Schools of Medicine, Public Health, and Nursing, Baltimore, Maryland, USA.
  • Moffitt RA; Departments of Hematology and Medical Oncology and Biomedical Informatics, Emory University, Atlanta, Georgia, USA.
  • Haendel M; Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Denver, Colorado, USA.
J Am Med Inform Assoc; 30(7): 1305-1312, 2023 06 20.
Article in English | MEDLINE | ID: covidwho-2325541
ABSTRACT
Machine learning (ML)-driven computable phenotypes are among the most challenging to share and reproduce. Despite this difficulty, the urgent public health considerations around Long COVID make it especially important to ensure the rigor and reproducibility of Long COVID phenotyping algorithms such that they can be made available to a broad audience of researchers. As part of the NIH Researching COVID to Enhance Recovery (RECOVER) Initiative, researchers with the National COVID Cohort Collaborative (N3C) devised and trained an ML-based phenotype to identify patients highly probable to have Long COVID. Supported by RECOVER, N3C and NIH's All of Us study partnered to reproduce the output of N3C's trained model in the All of Us data enclave, demonstrating model extensibility in multiple environments. This case study in ML-based phenotype reuse illustrates how open-source software best practices and cross-site collaboration can de-black-box phenotyping algorithms, prevent unnecessary rework, and promote open science in informatics.

Full text: Available
Collection: International databases
Database: MEDLINE
Main subject: Boxing / Population Health / COVID-19
Type of study: Cohort study / Observational study / Prognostic study / Randomized controlled trials
Topics: Long Covid
Limits: Humans
Language: English
Journal: J Am Med Inform Assoc
Journal subject: Medical Informatics
Year: 2023
Document Type: Article
Affiliation country: Jamia
