Alignment-free sequence comparison for virus genomes based on location correlation coefficient.
He, Lily; Sun, Siyang; Zhang, Qianyue; Bao, Xiaona; Li, Peter K.
  • He L; School of Science, Beijing University of Civil Engineering and Architecture, Beijing 102616, PR China. Electronic address: lilyhe6@163.com.
  • Sun S; The High School Affiliated to Renmin University of China, Beijing 100080, PR China.
  • Zhang Q; The High School Affiliated to Renmin University of China, Beijing 100080, PR China.
  • Bao X; School of Science, Beijing University of Civil Engineering and Architecture, Beijing 102616, PR China.
  • Li PK; School of Life Sciences, Tsinghua University, Beijing 100084, PR China. Electronic address: peter_wondrrt@outlook.com.
Infect Genet Evol; 96: 105106, 2021 Dec.
Article in English | MEDLINE | ID: covidwho-1506080
ABSTRACT
Coronaviruses, SARS-CoV-2 in particular, mutate rapidly and spread widely. Because these characteristics readily lead to global pandemics, studying the evolutionary relationships between viruses is essential for clinical diagnosis. DNA sequencing has played an important role in evolutionary analysis. Recent alignment-free methods can overcome the main drawback of traditional alignment-based methods, namely their high cost in both time and memory. This paper proposes a novel alignment-free method called the correlation coefficient feature vector (CCFV), which defines a correlation measure between a nucleotide's location in the original DNA sequence and its location after an L-step delay. The resulting feature is a 16×L-dimensional numerical vector describing how nucleotide positions are distributed in a DNA sequence. The proposed L-step delay correlation measure is related to certain types of L+1 spaced mers. Unlike traditional gene comparison, our method avoids the computational complexity of multiple sequence alignment and hence speeds up sequence comparison. We apply the method to the evolutionary analysis of common human viruses, including SARS-CoV-2, Dengue virus, Hepatitis B virus, and human rhinovirus, and it achieves results equal to or better than those of alignment-based methods. For SARS-CoV-2 in particular, our method also confirms that bats are potential intermediate hosts of SARS-CoV-2.
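The abstract does not state the exact CCFV formula, so the following is only an illustrative sketch of one plausible reading, not the authors' implementation. It assumes that, for every ordered nucleotide pair (a, b) and every delay k = 1..L, the feature is the Pearson correlation between "a occurs at position i" and "b occurs at position i + k", which yields 16×L values and ties each feature to occurrences of a and b separated by k − 1 arbitrary positions. The names `ccfv_sketch`, `pearson`, and `euclidean` are hypothetical.

```python
from math import sqrt

NUCLEOTIDES = "ACGT"


def pearson(a, b):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumption: undefined correlations are reported as 0
    return cov / sqrt(var_a * var_b)


def ccfv_sketch(seq, max_lag):
    """Return a 16 * max_lag feature vector (assumed reading of CCFV)."""
    seq = seq.upper()
    if len(seq) <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    # 0/1 indicator sequence for each nucleotide
    ind = {base: [1.0 if c == base else 0.0 for c in seq] for base in NUCLEOTIDES}
    features = []
    for lag in range(1, max_lag + 1):
        for a in NUCLEOTIDES:
            for b in NUCLEOTIDES:
                # correlate a at position i with b at position i + lag
                features.append(pearson(ind[a][:-lag], ind[b][lag:]))
    return features


def euclidean(u, v):
    """Distance between two feature vectors (an assumed choice of metric)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))
```

Under these assumptions, two genomes of different lengths map to vectors of the same dimension, so a distance matrix for tree building can be computed with `euclidean(ccfv_sketch(s1, L), ccfv_sketch(s2, L))` without any alignment step.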

Full text: Available
Collection: International databases
Database: MEDLINE
Main subject: Phylogeny / Genome, Viral / Sequence Analysis, DNA
Type of study: Prognostic study / Randomized controlled trials
Limits: Humans
Language: English
Journal: Infect Genet Evol
Journal subject: Biology / Communicable Diseases / Genetics
Year: 2021
Document Type: Article
