This article is a Preprint
Motif Analysis in k-mer Networks: An Approach towards Understanding SARS-CoV-2 Geographical Shifts (preprint)
bioRxiv; 2020.
Preprint
in English
| bioRxiv | ID: ppzbmed-10.1101.2020.10.04.325662
ABSTRACT
With the number of available SARS-CoV-2 sequences increasing day by day, new genomic information continues to emerge. As SARS-CoV-2 sequences show wide variation across samples, we aim to explore whether these changes reveal the geographical origin of the corresponding samples. The k-mer distributions, denoting normalized frequency counts of all possible combinations of nucleotides of size up to k, are often helpful for exploring sequence-level patterns. Given that the SARS-CoV-2 sequences are highly imbalanced by geographical origin (with a relatively higher number of samples collected from the USA), we observe that, with proper under-sampling, k-mer distributions in the SARS-CoV-2 sequences predict their geographical origin with more than 90% accuracy. The experiments are performed on samples collected from the six countries with the maximum number of sequences available as of July 07, 2020: Australia, USA, China, India, Greece and France. Moreover, we demonstrate that the changes in genomic sequences characterize the continents as a whole. We also highlight that the network motifs present in the sequence similarity networks differ significantly across these countries. Taken together, this approach is capable of predicting the geographical shift of SARS-CoV-2.
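The k-mer distribution and under-sampling steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `kmer_distribution` and `undersample` are hypothetical, and two details are assumptions the abstract does not specify: frequencies are normalized separately for each k-mer length, and under-sampling draws every country down to the size of the smallest class. Windows containing ambiguous bases such as N are ignored.

```python
from collections import Counter
from itertools import product
import random

ALPHABET = "ACGT"


def kmer_distribution(seq, k):
    """Return normalized frequencies of all k-mers of length 1..k over ACGT.

    Assumption: counts are normalized per k-mer length, so the features
    for each length sum to 1 (or are all 0 if no valid window exists).
    """
    seq = seq.upper()
    features = []
    for size in range(1, k + 1):
        # Count every overlapping window of the current length.
        counts = Counter(seq[i:i + size] for i in range(len(seq) - size + 1))
        # Enumerate all 4**size possible k-mers in a fixed, sorted order.
        kmers = ["".join(p) for p in product(ALPHABET, repeat=size)]
        # Windows with non-ACGT characters are excluded from the total.
        total = sum(counts[m] for m in kmers)
        features.extend(counts[m] / total if total else 0.0 for m in kmers)
    return features


def undersample(samples, seed=0):
    """Randomly reduce every class to the size of the smallest class.

    `samples` is a list of (label, sequence) pairs. This balancing rule
    is an assumption, one common form of under-sampling.
    """
    rng = random.Random(seed)
    by_label = {}
    for label, seq in samples:
        by_label.setdefault(label, []).append(seq)
    smallest = min(len(seqs) for seqs in by_label.values())
    balanced = []
    for label in sorted(by_label):
        for seq in rng.sample(by_label[label], smallest):
            balanced.append((label, seq))
    return balanced
```

For a given k, each sequence maps to a vector of 4 + 4^2 + ... + 4^k features, which could then be passed to any standard classifier after the samples have been balanced.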
Full text:
Available
Collection:
Preprints
Database:
bioRxiv
Language:
English
Year:
2020
Document Type:
Preprint