NeoGx: Machine-Recommended Rapid Genome Sequencing for Neonates.

Antoniou, Austin A; McGinley, Regan; Metzler, Marina; Chaudhari, Bimal P.

medRxiv ; 2024 Jun 25.

Article in English | MEDLINE | ID: mdl-38978650

ABSTRACT

Background: Genetic disease is common in the Level IV Neonatal Intensive Care Unit (NICU), but neonatology providers are not always able to identify the need for genetic evaluation. We trained a machine learning (ML) algorithm to predict the need for genetic testing within the first 18 months of life using health record phenotypes. Methods: For a decade of NICU patients, we extracted Human Phenotype Ontology (HPO) terms from clinical text with Natural Language Processing tools. Considering multiple feature sets, classifier architectures, and hyperparameters, we selected a classifier and made predictions on a validation cohort of 2,241 Level IV NICU admits born 2020-2021. Results: Our classifier had ROC AUC of 0.87 and PR AUC of 0.73 when making predictions during the first week in the Level IV NICU. We simulated testing policies under which subjects begin testing at the time of first ML prediction, estimating diagnostic odyssey length both with and without the additional benefit of pursuing rGS at this time. Just by using ML to accelerate initial genetic testing (without changing the tests ordered), the median time to first genetic test dropped from 10 days to 1 day, and the number of diagnostic odysseys resolved within 14 days of NICU admission increased by a factor of 1.8. By additionally requiring rGS at the time of positive ML prediction, the number of diagnostic odysseys resolved within 14 days was 3.8 times higher than the baseline. Conclusions: ML predictions of genetic testing need, together with the application of the right rapid testing modality, can help providers accelerate genetics evaluation and bring about earlier and better outcomes for patients.
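The two headline metrics reported above, ROC AUC and PR AUC, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names are hypothetical, the PR AUC is approximated by average precision (one common convention, which the abstract does not confirm), and tied scores are not treated specially in the average-precision ranking.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # Count pairwise wins; ties count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: mean of precision@rank taken at each positive example."""
    # Rank subjects from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive label")
    return total / hits


# Toy data (illustrative only, not from the study):
# label 1 = subject needed genetic testing, score = classifier output.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.1]
print(roc_auc(toy_labels, toy_scores))
print(average_precision(toy_labels, toy_scores))
```

PR AUC is often the more informative of the two when positives are a minority of the cohort, because it does not reward the classifier for correctly ranking the large number of true negatives.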

CNVoyant: A Highly Performant and Explainable Multi-Classifier Machine Learning Approach for Determining the Clinical Significance of Copy Number Variants.

Schuetz, Robert J; Ceyhan, Defne; Antoniou, Austin A; Chaudhari, Bimal P; White, Peter.

Res Sq ; 2024 Apr 30.

Article in English | MEDLINE | ID: mdl-38746157

ABSTRACT

The precise classification of copy number variants (CNVs) presents a significant challenge in genomic medicine, primarily due to the complex nature of CNVs and their diverse impact on genetic disorders. This complexity is compounded by the limitations of existing methods in accurately distinguishing between benign, uncertain, and pathogenic CNVs. To address this gap, we introduce CNVoyant, a machine learning-based multi-class framework designed to enhance the clinical significance classification of CNVs. Trained on a comprehensive dataset of 52,176 ClinVar entries across pathogenic, uncertain, and benign classifications, CNVoyant incorporates a broad spectrum of genomic features, including genome position, disease-gene annotations, dosage sensitivity, and conservation scores. Models to predict the clinical significance of copy number gains and losses were trained independently. Final models were selected after testing 29 machine learning architectures and 10,000 hyperparameter combinations each for deletions and duplications via 5-fold cross-validation. We validate CNVoyant's performance on a comprehensive set of 21,574 CNVs from the DECIPHER database, a highly regarded resource known for its extensive catalog of chromosomal imbalances linked to clinical outcomes. Compared to alternative approaches, CNVoyant shows marked improvements in precision-recall and ROC AUC metrics for binary pathogenic classifications, and goes one step further by offering multi-class classification of clinical significance with corresponding SHAP explainability plots. This large-scale validation demonstrates CNVoyant's superior accuracy and underscores its potential to aid genomic researchers and clinical geneticists in interpreting the clinical implications of real CNVs.
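The model-selection step described above (comparing many candidate configurations via 5-fold cross-validation) can be sketched generically as follows. This is an illustration under stated assumptions, not the CNVoyant pipeline: `k_fold_indices`, `select_config`, and the `fit_and_score` callback are hypothetical names, and the toy scoring function stands in for actually training and evaluating a classifier.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k shuffled, disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


def select_config(configs, fit_and_score, n_samples, k=5, seed=0):
    """Return (best_config, mean_cv_score) over the candidate configurations.

    fit_and_score(config, train_idx, test_idx) is a caller-supplied (hypothetical)
    callback that trains on train_idx and returns a score on test_idx.
    """
    # Reuse the same splits for every configuration so scores are comparable.
    splits = list(k_fold_indices(n_samples, k, seed))
    best = None
    for config in configs:
        cv_score = mean(fit_and_score(config, tr, te) for tr, te in splits)
        if best is None or cv_score > best[1]:
            best = (config, cv_score)
    return best


# Toy usage: a stand-in scorer that prefers depth 3 (illustrative only).
candidates = [{"depth": d} for d in (1, 2, 3, 4, 5)]
toy_scorer = lambda cfg, train_idx, test_idx: -abs(cfg["depth"] - 3)
print(select_config(candidates, toy_scorer, n_samples=100))
```

Holding the fold assignment fixed across configurations keeps the comparison fair, since every candidate is scored on the same held-out subsets.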
