Understanding the mutational frequency in SARS-CoV-2 proteome using structural features.
Comput Biol Med; 147: 105708, 2022 08.
Article in English | MEDLINE | ID: covidwho-1944684
ABSTRACT
The prolonged transmission of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in the human population has led to demographic divergence and the emergence of several location-specific clusters of viral strains. Although the effect of mutations on the severity and survival of the virus is still unclear, it is evident that certain sites in the viral proteome are more prone to mutation than others. Millions of SARS-CoV-2 sequences collected all over the world have provided a unique opportunity to understand viral protein mutations and to develop novel computational approaches for predicting mutational patterns. In this study, we classified mutation sites into low and high mutability classes based on the number of viral isolates containing mutations at each site. Physicochemical and structural analysis of the SARS-CoV-2 proteins showed that features including residue type, surface accessibility, residue bulkiness, stability and sequence conservation at the mutation site were able to distinguish low mutability sites from high mutability sites. We further developed machine learning models using the above-mentioned features to predict low and high mutability sites at different selection thresholds (ranging from 5% to 30% of the most and least mutated sites) and observed that performance improved as the selection threshold was reduced (prediction accuracy ranging from 65% to 77%). The analysis will be useful for early detection of SARS-CoV-2 variants of concern, and the approach can also be applied to other existing and emerging viruses to help prevent future pandemics.
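The class assignment described in the abstract can be illustrated with a minimal sketch. It assumes that sites are ranked by the number of isolates carrying a mutation at that site and that the same fraction is taken from each end of the ranking; the function name `label_mutability`, the site numbering and the toy counts are hypothetical and are not taken from the study.

```python
from typing import Dict


def label_mutability(isolate_counts: Dict[int, int], fraction: float) -> Dict[int, str]:
    """Label the least and most mutated `fraction` of sites as 'low' or 'high'.

    isolate_counts maps a site position to the number of isolates that
    carry a mutation at that site (hypothetical input format).
    Sites between the two tails are left unlabeled.
    """
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must be in (0, 0.5]")
    # Rank sites from fewest to most isolates carrying a mutation.
    ranked = sorted(isolate_counts, key=lambda site: isolate_counts[site])
    k = int(len(ranked) * fraction)
    labels: Dict[int, str] = {}
    for site in ranked[:k]:
        labels[site] = "low"
    # Slice from an explicit index so that k == 0 yields an empty tail.
    for site in ranked[len(ranked) - k:]:
        labels[site] = "high"
    return labels


# Hypothetical toy data: site position -> isolates with a mutation there.
toy_counts = {1: 3, 2: 950, 3: 12, 4: 40000, 5: 7,
              6: 210, 7: 1, 8: 88, 9: 15000, 10: 30}
labels = label_mutability(toy_counts, 0.2)
print(labels)
```

With a 20% threshold on these ten toy sites, the two least mutated sites (7 and 1) are labeled low and the two most mutated sites (9 and 4) are labeled high. Lowering the fraction keeps only the more extreme sites in each class, which is one plausible reading of why the reported accuracy rises as the threshold is reduced.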
Full text: Available
Collection: International databases
Database: MEDLINE
Main subject: SARS-CoV-2 / COVID-19
Type of study: Prognostic study
Topics: Variants
Limits: Humans
Language: English
Journal: Comput Biol Med
Year: 2022
Document Type: Article