Validation of Variant Assembly Using HAPHPIPE with Next-Generation Sequence Data from Viruses.
Gibson, Keylie M; Steiner, Margaret C; Rentia, Uzma; Bendall, Matthew L; Pérez-Losada, Marcos; Crandall, Keith A.
  • Gibson KM; Computational Biology Institute, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA.
  • Steiner MC; Computational Biology Institute, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA.
  • Rentia U; Computational Biology Institute, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA.
  • Bendall ML; Computational Biology Institute, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA.
  • Pérez-Losada M; Computational Biology Institute, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA.
  • Crandall KA; Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA.
Viruses; 12(7), 2020-07-14.
Article in English | MEDLINE | ID: covidwho-1389516
ABSTRACT
Next-generation sequencing (NGS) offers a powerful opportunity to identify low-abundance, intra-host viral sequence variants, yet the focus of many bioinformatic tools on consensus sequence construction has precluded a thorough analysis of intra-host diversity. To take full advantage of the resolution of NGS data, we developed HAplotype PHylodynamics PIPEline (HAPHPIPE), an open-source tool for the de novo and reference-based assembly of viral NGS data, with both consensus sequence assembly and a focus on the quantification of intra-host variation through haplotype reconstruction. We validate and compare the consensus sequence assembly methods of HAPHPIPE to those of two alternative software packages, HyDRA and Geneious, using simulated HIV and empirical HIV, HCV, and SARS-CoV-2 datasets. Our validation methods included read mapping, genetic distance, and genetic diversity metrics. In simulated NGS data, HAPHPIPE generated pol consensus sequences significantly closer to the true consensus sequence than those produced by HyDRA and Geneious and performed comparably to Geneious for HIV gp120 sequences. Furthermore, using empirical data from multiple viruses, we demonstrate that HAPHPIPE can analyze larger sequence datasets due to its greater computational speed. Therefore, we contend that HAPHPIPE provides a more user-friendly platform for users with and without bioinformatics experience to implement current best practices for viral NGS assembly than other currently available options.

Full text: Available | Collection: International databases | Database: MEDLINE | Main subject: Viruses / Computational Biology / High-Throughput Nucleotide Sequencing | Type of study: Prognostic study | Topics: Variants | Limits: Humans | Language: English | Year: 2020 | Document Type: Article | Affiliation country: V12070758
