ABSTRACT
In viral infections, multiple related viral strains are often present, due to coinfection or within-host evolution. We describe Haploflow, a de Bruijn graph-based assembler for de novo genome assembly of viral strains from mixed sequence samples using a novel flow algorithm. We assessed Haploflow across multiple benchmark data sets of increasing complexity, showing that it is faster and more accurate than viral haplotype assemblers and than generic metagenome assemblers that do not aim to reconstruct strains. Haploflow reconstructed high-quality strain-resolved assemblies from clinical HCMV samples, and SARS-CoV-2 genomes from wastewater metagenomes that were identical to genomes from clinical isolates.
ABSTRACT
The complete genome sequences of two asparagus virus 1 (AV-1) isolates differing in their ability to cause systemic infection in Nicotiana benthamiana were determined. Each genome comprised 9,741 nucleotides excluding the 3'-terminal poly(A) tail and encoded a polyprotein of 3,112 amino acids; the two genomes shared 99.6 % nucleotide sequence identity. They differed at 37 nucleotide positions and at 15 amino acid positions (99.5 % amino acid sequence identity) scattered over the polyprotein. The closest relatives of AV-1 in amino acid sequence identity were plum pox virus (54 %) and turnip mosaic virus (53 %), corroborating the classification of AV-1 as a member of a distinct species in the genus Potyvirus.
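
The identity percentages reported above can be checked from the stated sequence lengths and difference counts. The following is a minimal sketch, assuming that the two isolates align without gaps over their full length, so that every difference is a substitution; the function name `percent_identity` is hypothetical and is not taken from the study.

```python
def percent_identity(length: int, mismatches: int) -> float:
    """Percent identity over an ungapped alignment of `length` positions.

    Assumption: all differences are substitutions (no insertions or deletions).
    """
    if length <= 0 or not 0 <= mismatches <= length:
        raise ValueError("length must be positive and 0 <= mismatches <= length")
    return 100.0 * (length - mismatches) / length


# Values taken from the abstract: genome length and polyprotein length,
# with the number of differing positions between the two isolates.
nt_identity = percent_identity(9741, 37)
aa_identity = percent_identity(3112, 15)

print(f"nucleotide identity: {nt_identity:.1f} %")
print(f"amino acid identity: {aa_identity:.1f} %")
```

Under this assumption the results round to 99.6 % and 99.5 %, in agreement with the figures given in the abstract.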