CovRNN - A recurrent neural network model for predicting outcomes of COVID-19 patients: model development and validation using EHR data

Laila Rasmy; Masayuki Nigo; Bijun Sai Kannadath; Ziqian Xie; Bingyu Mao; Khush Patel; Yujia Zhou; Wanheng Zhang; Angela M. Ross; Hua Xu; Degui Zhi

This article is a Preprint

Preprints are preliminary research reports that have not been certified by peer review. They should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Posting preprints online allows authors to receive rapid feedback and lets the entire scientific community appraise the work and respond. Comments are posted alongside each preprint for anyone to read and serve as a post-publication assessment.


Affiliations

Laila Rasmy; School of Biomedical Informatics, University of Texas Health Science Center at Houston
Masayuki Nigo; McGovern Medical School, University of Texas Health Science Center at Houston
Bijun Sai Kannadath; College of Medicine, University of Arizona - Phoenix
Ziqian Xie; School of Biomedical Informatics, University of Texas Health Science Center at Houston
Bingyu Mao; School of Biomedical Informatics, University of Texas Health Science Center at Houston
Khush Patel; School of Biomedical Informatics, University of Texas Health Science Center at Houston
Yujia Zhou; School of Biomedical Informatics, University of Texas Health Science Center at Houston
Wanheng Zhang; School of Public Health, University of Texas Health Science Center at Houston
Angela M. Ross; School of Biomedical Informatics, University of Texas Health Science Center at Houston
Hua Xu; School of Biomedical Informatics, University of Texas Health Science Center at Houston
Degui Zhi; School of Biomedical Informatics, University of Texas Health Science Center at Houston

Preprint in English | medRxiv | ID: ppmedrxiv-21264121

ABSTRACT


Background: Predicting outcomes of COVID-19 patients at an early stage is critical for optimized clinical care and resource management, especially during a pandemic. Although multiple machine learning models have been proposed to address this issue, they require extensive data pre-processing and feature engineering, and as a result they have not been validated or implemented outside of the original study site.

Methods: In this study, we propose CovRNN, a set of recurrent neural network (RNN)-based models that predict COVID-19 patients' outcomes from the electronic health record (EHR) data available at admission, without the need for specific feature selection or missing data imputation. CovRNN is designed to predict three outcomes: in-hospital mortality, need for mechanical ventilation, and long length of stay (LOS > 7 days). Predictions are produced as time-to-event risk scores (survival prediction) and all-time risk scores (binary prediction). Our models were trained and validated using heterogeneous, de-identified data of 247,960 COVID-19 patients from 87 healthcare systems, derived from the Cerner(R) Real-World Dataset (CRWD). External validation was performed using three test sets (approximately 53,000 patients). Further, the transferability of CovRNN was validated using de-identified data of 36,140 patients derived from the Optum(R) de-identified COVID-19 Electronic Health Record v. 1015 dataset (2007-2020).

Findings: CovRNN outperformed traditional models. It achieved an area under the receiver operating characteristic curve (AUROC) of 93% for mortality and mechanical ventilation predictions on the CRWD test set (vs. 91.5% and 90% for light gradient boosting machine (LGBM) and logistic regression (LR), respectively) and 86.5% for prediction of LOS > 7 days (vs. 81.7% and 80% for LGBM and LR, respectively). For survival prediction, CovRNN achieved a C-index of 86% for mortality and 92.6% for mechanical ventilation. External validation confirmed AUROCs in similar ranges.

Interpretation: Trained on a large heterogeneous real-world dataset, our CovRNN model showed high prediction accuracy, good calibration, and transferability through consistently good performance on multiple external datasets. Our results demonstrate the feasibility of a COVID-19 predictive model that delivers high accuracy without the need for complex feature engineering.
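For readers unfamiliar with the two metrics reported in the findings, the following is a minimal sketch of how AUROC (for binary prediction) and a concordance index (for survival prediction) can be computed. It is not the authors' evaluation code: the function names and toy data are hypothetical, and Harrell's formulation of the C-index is assumed because the abstract does not say which variant was used.

```python
from itertools import combinations


def auroc(labels, scores):
    """Probability that a random positive case is scored above a random
    negative case; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def c_index(times, events, risks):
    """Harrell-style concordance (assumed variant): among comparable pairs,
    the fraction in which the patient with the earlier observed event
    received the higher risk score; tied risks count as half."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Skip tied times and pairs whose earlier patient was censored.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical, for illustration only).
binary_auc = auroc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6])
survival_c = c_index([2, 5, 7, 9], [1, 1, 0, 1], [0.8, 0.3, 0.5, 0.1])
```

Both metrics measure ranking quality rather than calibration: a value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is why the abstract reports calibration separately.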

License

CC BY-NC-ND


Full text: Available | Collection: Preprints | Database: medRxiv | Type of study: Prognostic study | Language: English | Year: 2021 | Document type: Preprint
