Supervised Pretraining through Contrastive Categorical Positive Samplings to Improve COVID-19 Mortality Prediction.
ACM BCB; 2022; 2022 Aug.
Article in English | MEDLINE | ID: covidwho-1993099
ABSTRACT
Clinical EHR data are naturally heterogeneous and contain abundant sub-phenotypes. This diversity creates challenges for outcome prediction with machine learning models because it leads to high intra-class variance. To address this issue, we propose a supervised pre-training model with a unique embedded k-nearest-neighbor positive sampling strategy. We demonstrate the performance benefit of this framework theoretically and show that it yields highly competitive experimental results in predicting patient mortality on real-world COVID-19 EHR data from over 7,000 patients admitted to a large, urban health system. Our method achieves an AUROC of 0.872, outperforming alternative pre-training models and traditional machine learning methods. Additionally, our method performs much better when the training data size is small (345 training instances).
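The abstract names the positive sampling strategy but does not describe how it is computed. The sketch below illustrates one plausible reading: for each anchor patient, positives are restricted to the k nearest neighbors in embedding space that share the anchor's outcome label, rather than all same-label patients. Everything here is an assumption made for illustration, including the function names `knn_positive_indices` and `contrastive_loss`, the squared Euclidean distance, the cosine-similarity loss, and the `temperature` value. It is not the authors' implementation.

```python
import numpy as np

def knn_positive_indices(embeddings, labels, k):
    """For each anchor, return indices of its k nearest same-label neighbors.

    Hypothetical helper: the distance metric (squared Euclidean) is an assumption.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    n = len(labels)
    # Pairwise squared Euclidean distances, shape (n, n).
    diffs = embeddings[:, None, :] - embeddings[None, :, :]
    dist = (diffs ** 2).sum(axis=-1)
    positives = []
    for i in range(n):
        # Candidates share the anchor's label; the anchor itself is excluded.
        candidates = np.flatnonzero((labels == labels[i]) & (np.arange(n) != i))
        order = np.argsort(dist[i, candidates], kind="stable")
        positives.append(candidates[order[:k]])
    return positives

def contrastive_loss(embeddings, positives, temperature=0.1):
    """Supervised-contrastive-style loss using only the sampled positives.

    Assumed form: average negative log-softmax of the scaled cosine similarity
    between each anchor and its positives, taken over all other samples.
    """
    z = np.asarray(embeddings, dtype=float)
    norms = np.maximum(np.linalg.norm(z, axis=1, keepdims=True), 1e-12)
    z = z / norms
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)  # an anchor never competes with itself
    # Row-wise log-softmax computed with the usual max-shift for stability.
    row_max = sim.max(axis=1, keepdims=True)
    log_norm = row_max + np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    log_prob = sim - log_norm
    per_anchor = [-log_prob[i, pos].mean() for i, pos in enumerate(positives) if len(pos) > 0]
    return float(np.mean(per_anchor))

# Toy data: six "patients" with 2-D embeddings and binary outcome labels.
toy_embeddings = np.array([[1.0, 0.0], [1.1, 0.0], [5.0, 5.0],
                           [1.0, 0.3], [5.0, 5.1], [1.2, 0.1]])
toy_labels = np.array([1, 1, 1, 0, 0, 1])
toy_positives = knn_positive_indices(toy_embeddings, toy_labels, k=2)
toy_loss = contrastive_loss(toy_embeddings, toy_positives)
```

Restricting positives to nearby same-label samples is one way to avoid pulling together patients from different sub-phenotypes that merely share an outcome, which is the intra-class variance problem the abstract raises.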
Full text:
Available
Collection:
International databases
Database:
MEDLINE
Type of study:
Prognostic study
Language:
English
Year:
2022
Document Type:
Article
Affiliation country:
3535508.3545541