Forecasting dominance of SARS-CoV-2 lineages by anomaly detection using deep AutoEncoders (preprint)
bioRxiv; 2023.
Preprint in English | bioRxiv | ID: ppzbmed-10.1101.2023.10.24.563721
ABSTRACT
The COVID-19 pandemic exemplified the need for a rapid, effective genomic surveillance system able to predict emerging SARS-CoV-2 variants and lineages. Traditional molecular epidemiology methods, which leverage public health surveillance or integrated sequence data repositories, can characterize the evolutionary history of infection waves and the genetic evolution of the virus, but they fall short in promptly anticipating future viral genetic alterations. To bridge this gap, we introduce DeepAutoCoV, a novel deep learning, autoencoder-based method for anomaly detection in SARS-CoV-2. Trained and updated on the public global SARS-CoV-2 GISAID database, DeepAutoCoV identifies Future Dominant Lineages (FDLs), defined as lineages comprising at least 25% of the SARS-CoV-2 genomes added in a given week, on a weekly basis, using the Spike (S) protein. Our algorithm is grounded in unsupervised anomaly detection, which is necessary because FDLs can be known only a posteriori (i.e., after they have become dominant). We developed two concurrent approaches (a linear unsupervised one and an a posteriori supervised one) as comparators to evaluate DeepAutoCoV performance. DeepAutoCoV identifies FDLs with a median lead time of 31 weeks on global data and achieves a positive predictive value ~7x and 23% higher than those of the other approaches. Furthermore, it predicts vaccine-related FDLs up to 17 months in advance. Finally, DeepAutoCoV is not only predictive but also interpretable, since it can pinpoint specific mutations within FDLs, generating hypotheses on potential increases in the virulence or transmissibility of a lineage. By integrating genomic surveillance with artificial intelligence, our work marks a transformative step that may provide valuable insights for the optimization of public health prevention and intervention strategies.
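The FDL definition in the abstract (a lineage comprising at least 25% of the genomes added in a given week) can be illustrated with a minimal Python sketch. This is not the authors' implementation and does not cover the autoencoder itself; the function name `dominant_lineages`, the `(week, lineage)` record layout, and the toy counts are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict

# Share threshold taken from the abstract's FDL definition (25%).
FDL_THRESHOLD = 0.25


def dominant_lineages(records, threshold=FDL_THRESHOLD):
    """Return {lineage: first week in which its weekly share >= threshold}.

    `records` is an iterable of (week, lineage) pairs, one pair per genome
    added to the database in that week. This layout is hypothetical.
    """
    per_week = defaultdict(Counter)
    for week, lineage in records:
        per_week[week][lineage] += 1

    first_dominant = {}
    for week in sorted(per_week):
        counts = per_week[week]
        total = sum(counts.values())
        for lineage, n in counts.items():
            # Record only the first week a lineage reaches the threshold.
            if n / total >= threshold and lineage not in first_dominant:
                first_dominant[lineage] = week
    return first_dominant


# Invented toy data: lineage "B" rises from 10% to 40% between weeks 1 and 2.
toy_records = (
    [(1, "A")] * 9 + [(1, "B")] * 1
    + [(2, "A")] * 6 + [(2, "B")] * 4
)
print(dominant_lineages(toy_records))
```

Because this label depends on observed weekly shares, it is only available after the fact, which is consistent with the abstract's point that FDLs can be known only a posteriori.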

Full text: Available | Collection: Preprints | Database: bioRxiv | Main subject: Abnormalities, Drug-Induced / COVID-19 | Language: English | Year: 2023 | Document Type: Preprint
