Highly accurate whole-genome imputation of SARS-CoV-2 from partial or low-quality sequences.
Ortuño, Francisco M; Loucera, Carlos; Casimiro-Soriguer, Carlos S; Lepe, Jose A; Camacho Martinez, Pedro; Merino Diaz, Laura; de Salazar, Adolfo; Chueca, Natalia; García, Federico; Perez-Florido, Javier; Dopazo, Joaquin.
  • Ortuño FM; Clinical Bioinformatics Area, Fundación Progreso y Salud (FPS), CDCA, Hospital Virgen del Rocio, 41013 Sevilla, Spain.
  • Loucera C; Computational Systems Medicine, Institute of Biomedicine of Seville (IBIS), Hospital Virgen del Rocio, 41013 Sevilla, Spain.
  • Casimiro-Soriguer CS; Clinical Bioinformatics Area, Fundación Progreso y Salud (FPS), CDCA, Hospital Virgen del Rocio, 41013 Sevilla, Spain.
  • Lepe JA; Computational Systems Medicine, Institute of Biomedicine of Seville (IBIS), Hospital Virgen del Rocio, 41013 Sevilla, Spain.
  • Camacho Martinez P; Clinical Bioinformatics Area, Fundación Progreso y Salud (FPS), CDCA, Hospital Virgen del Rocio, 41013 Sevilla, Spain.
  • Merino Diaz L; Computational Systems Medicine, Institute of Biomedicine of Seville (IBIS), Hospital Virgen del Rocio, 41013 Sevilla, Spain.
  • de Salazar A; Unidad Clínica Enfermedades Infecciosas, Microbiología y Medicina Preventiva, Hospital Universitario Virgen del Rocío, 41013 Sevilla, Spain.
  • Chueca N; Unidad Clínica Enfermedades Infecciosas, Microbiología y Medicina Preventiva, Hospital Universitario Virgen del Rocío, 41013 Sevilla, Spain.
  • García F; Unidad Clínica Enfermedades Infecciosas, Microbiología y Medicina Preventiva, Hospital Universitario Virgen del Rocío, 41013 Sevilla, Spain.
  • Perez-Florido J; Servicio de Microbiología, Hospital Universitario San Cecilio, 18016 Granada, Spain.
  • Dopazo J; Servicio de Microbiología, Hospital Universitario San Cecilio, 18016 Granada, Spain.
Gigascience ; 10(12)2021 12 02.
Article in English | MEDLINE | ID: covidwho-1550549
ABSTRACT

BACKGROUND:

The current SARS-CoV-2 pandemic has emphasized the utility of viral whole-genome sequencing in the surveillance and control of the pathogen. An unprecedented ongoing global initiative is producing hundreds of thousands of sequences worldwide. However, the complex circumstances in which viruses are sequenced, along with the demand for urgent results, cause a high rate of incomplete and, therefore, useless sequences. Viral sequences evolve in the context of a complex phylogeny and different positions along the genome are in linkage disequilibrium. Therefore, an imputation method would be able to predict missing positions from the available sequencing data.
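
The reasoning above, that correlated positions allow missing ones to be predicted from observed ones, can be illustrated with a toy sketch. This is not the impuSARS method, whose algorithm is not described in this abstract; it is a deliberately simple nearest-reference approach, and the function names, the `N` missing-base marker, and the three-sequence panel are all assumptions made for illustration.

```python
from typing import List

MISSING = "N"  # assumed marker for an unsequenced position


def mismatches_on_observed(query: str, ref: str) -> int:
    """Count differences between query and ref, ignoring missing query positions."""
    return sum(1 for q, r in zip(query, ref) if q != MISSING and q != r)


def impute(query: str, panel: List[str]) -> str:
    """Fill missing positions in query from the closest reference in the panel.

    Toy nearest-reference imputation: pick the panel sequence that agrees best
    with the observed positions, then copy its bases into the gaps.
    """
    if not panel:
        raise ValueError("reference panel is empty")
    if any(len(ref) != len(query) for ref in panel):
        raise ValueError("all sequences must be aligned to the same length")
    best = min(panel, key=lambda ref: mismatches_on_observed(query, ref))
    # Observed bases are kept as-is; only missing positions are filled.
    return "".join(r if q == MISSING else q for q, r in zip(query, best))


# Hypothetical, already-aligned toy panel (real panels hold full genomes).
panel = ["ACGTACGT", "ACGTTCGA", "TCGAACGT"]
print(impute("ACNNTCNA", panel))  # prints ACGTTCGA
```

In this toy case the observed positions match the second panel sequence exactly, so its bases fill the three gaps. A realistic imputation tool would rely on a statistical model over a large reference panel rather than a single nearest match.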

RESULTS:

We have developed the impuSARS application, which takes advantage of the enormous number of SARS-CoV-2 genomes available, using a reference panel containing 239,301 sequences, to produce missing data imputation in viral genomes. ImpuSARS was tested in a wide range of conditions (continuous fragments, amplicons or sparse individual positions missing), showing great fidelity when reconstructing the original sequences, recovering the lineage with 100% precision for almost all the lineages, even in very poorly covered genomes (<20%).

CONCLUSIONS:

Imputation can improve the pace of SARS-CoV-2 sequencing production by recovering many incomplete or low-quality sequences that would otherwise be discarded. ImpuSARS can be incorporated into any primary data processing pipeline for SARS-CoV-2 whole-genome sequencing.
Full text: Available Collection: International databases Database: MEDLINE Main subject: Genome, Viral / SARS-CoV-2 Type of study: Prognostic study / Randomized controlled trials Language: English Year: 2021 Document Type: Article Affiliation country: Gigascience