Metaviromic identification of discriminative genomic features in SARS-CoV-2 using machine learning.
Patterns (N Y); 3(2): 100407, 2022 Feb 11.
Article in English | MEDLINE | ID: covidwho-1521457
ABSTRACT
The COVID-19 pandemic caused by SARS-CoV-2 has become a major threat across the globe. Here, we developed machine learning approaches to identify key pathogenic regions in coronavirus genomes. We trained and evaluated 7,562,625 models on 3,665 genomes including SARS-CoV-2, MERS-CoV, SARS-CoV, and other coronaviruses of human and animal origins to return quantitative and biologically interpretable signatures at nucleotide and amino acid resolutions. We identified hotspots across the SARS-CoV-2 genome, including previously unappreciated features in spike, RdRp, and other proteins. Finally, we integrated pathogenicity genomic profiles with B cell and T cell epitope predictions for enrichment of sequence targets to help guide vaccine development. These results provide a systematic map of predicted pathogenicity in SARS-CoV-2 that incorporates sequence, structural, and immunologic features, providing an unbiased collection of genetic elements for functional studies. This metavirome-based framework can also be applied for rapid characterization of new coronavirus strains or emerging pathogenic viruses.
Full text: Available
Collection: International databases
Database: MEDLINE
Type of study: Experimental Studies / Prognostic study / Systematic review/Meta Analysis
Topics: Vaccines
Language: English
Journal: Patterns (N Y)
Year: 2022
Document Type: Article
Affiliation country: J.patter.2021.100407