Stability of SARS-CoV-2 phylogenies.
Turakhia, Yatish; De Maio, Nicola; Thornlow, Bryan; Gozashti, Landen; Lanfear, Robert; Walker, Conor R; Hinrichs, Angie S; Fernandes, Jason D; Borges, Rui; Slodkowicz, Greg; Weilguny, Lukas; Haussler, David; Goldman, Nick; Corbett-Detig, Russell.
  • Turakhia Y; Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, United States of America.
  • De Maio N; Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, United States of America.
  • Thornlow B; European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge, United Kingdom.
  • Gozashti L; Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, United States of America.
  • Lanfear R; Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, United States of America.
  • Walker CR; Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, United States of America.
  • Hinrichs AS; Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, United States of America.
  • Fernandes JD; Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, United States of America.
  • Borges R; Department of Ecology and Evolution, Research School of Biology, Australian National University, Canberra, ACT, Australia.
  • Slodkowicz G; European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge, United Kingdom.
  • Weilguny L; Department of Genetics, University of Cambridge, Cambridge, United Kingdom.
  • Haussler D; Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, United States of America.
  • Goldman N; Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, United States of America.
  • Corbett-Detig R; Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, United States of America.
PLoS Genet; 16(11): e1009175, 2020 Nov.
Article in English | MEDLINE | ID: covidwho-1388878
ABSTRACT
The SARS-CoV-2 pandemic has led to unprecedented, nearly real-time genetic tracing due to the rapid community sequencing response. Researchers immediately leveraged these data to infer the evolutionary relationships among viral samples and to study key biological questions, including whether host viral genome editing and recombination are features of SARS-CoV-2 evolution. This global sequencing effort is inherently decentralized and must rely on data collected by many labs using a wide variety of molecular and bioinformatic techniques. There is thus a strong possibility that systematic errors associated with lab- or protocol-specific practices affect some sequences in the repositories. We find that some recurrent mutations in reported SARS-CoV-2 genome sequences have been observed predominantly or exclusively by single labs, co-localize with commonly used primer binding sites, and are more likely to affect the protein-coding sequences than other similarly recurrent mutations. We show that their inclusion can affect phylogenetic inference on scales relevant to local lineage tracing, and make it appear as though there has been an excess of recurrent mutation or recombination among viral lineages. We suggest how samples can be screened and problematic variants removed, and we plan to regularly inform the scientific community with our updated results as more SARS-CoV-2 genome sequences are shared (https://virological.org/t/issues-with-sars-cov-2-sequencing-data/473 and https://virological.org/t/masking-strategies-for-sars-cov-2-alignments/480). We also develop tools for comparing and visualizing differences among very large phylogenies and we show that consistent clade- and tree-based comparisons can be made between phylogenies produced by different groups. These will facilitate evolutionary inferences and comparisons among phylogenies produced for a wide array of purposes.
Building on the SARS-CoV-2 Genome Browser at UCSC, we present a toolkit to compare, analyze and combine SARS-CoV-2 phylogenies, find and remove potential sequencing errors and establish a widely shared, stable clade structure for a more accurate scientific inference and discourse.
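The screening idea described in the abstract, flagging mutations that are reported predominantly or exclusively by a single lab, can be illustrated with a minimal Python sketch. This is not the authors' toolkit or method. The function name, the input layout (one `(mutation, lab)` pair per sample carrying the mutation), and both thresholds are assumptions made for illustration, and raw sample counts stand in for the tree-based notion of recurrence used in the paper.

```python
from collections import Counter, defaultdict


def flag_lab_specific_mutations(observations, min_count=3, min_fraction=0.8):
    """Return mutations whose observations come mostly from one lab.

    observations: iterable of (mutation, lab) pairs, one pair per sample
        that carries the mutation (hypothetical input layout).
    min_count: ignore mutations seen in fewer samples than this (assumed value).
    min_fraction: share of samples from the top lab needed to flag (assumed value).
    """
    # Count, for every mutation, how many samples each lab contributed.
    labs_by_mutation = defaultdict(Counter)
    for mutation, lab in observations:
        labs_by_mutation[mutation][lab] += 1

    flagged = {}
    for mutation, lab_counts in labs_by_mutation.items():
        total = sum(lab_counts.values())
        if total < min_count:
            continue  # too few observations to say anything
        top_lab, top_count = lab_counts.most_common(1)[0]
        fraction = top_count / total
        if fraction >= min_fraction:
            flagged[mutation] = (top_lab, fraction)
    return flagged
```

A flagged mutation is only a candidate for closer inspection or masking. Whether it is a true artifact would still depend on evidence such as overlap with primer binding sites, which this sketch does not check.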

Full text: Available | Collection: International databases | Database: MEDLINE | Main subject: Phylogeny / Genome, Viral / SARS-CoV-2 | Type of study: Experimental Studies / Randomized controlled trials / Systematic review/Meta Analysis | Topics: Variants | Limits: Humans | Language: English | Journal: PLoS Genet | Journal subject: Genetics | Year: 2020 | Document Type: Article | Affiliation country: JOURNAL.PGEN.1009175
