Search | Portal Regional da BVS (test)

1.

Recommendations for Uniform Variant Calling of SARS-CoV-2 Genome Sequence across Bioinformatic Workflows.

Connor, Ryan; Shakya, Migun; Yarmosh, David A; Maier, Wolfgang; Martin, Ross; Bradford, Rebecca; Brister, J Rodney; Chain, Patrick S G; Copeland, Courtney A; di Iulio, Julia; Hu, Bin; Ebert, Philip; Gunti, Jonathan; Jin, Yumi; Katz, Kenneth S; Kochergin, Andrey; LaRosa, Tré; Li, Jiani; Li, Po-E; Lo, Chien-Chi; Rashid, Sujatha; Maiorova, Evguenia S; Xiao, Chunlin; Zalunin, Vadim; Purcell, Lisa; Pruitt, Kim D.

Viruses ; 16(3), 2024 Mar 11.

Article in English | MEDLINE | ID: mdl-38543795

ABSTRACT

Genomic sequencing of clinical samples to identify emerging variants of SARS-CoV-2 has been a key public health tool for curbing the spread of the virus. As a result, an unprecedented number of SARS-CoV-2 genomes were sequenced during the COVID-19 pandemic, which allowed for rapid identification of genetic variants, enabling the timely design and testing of therapies and deployment of new vaccine formulations to combat the new variants. However, despite the technological advances of deep sequencing, the analysis of the raw sequence data generated globally is neither standardized nor consistent, leading to vastly disparate sequences that may impact identification of variants. Here, we show that for both Illumina and Oxford Nanopore sequencing platforms, downstream bioinformatic protocols used by industry, government, and academic groups resulted in different virus sequences from the same sample. These bioinformatic workflows produced consensus genomes that differed in single nucleotide polymorphisms and in the inclusion or exclusion of insertions and/or deletions, despite using the same raw sequence datasets as input. We compare and characterize such discrepancies and propose a specific suite of parameters and protocols that should be adopted across the field. Consistent results from bioinformatic workflows are fundamental to SARS-CoV-2 and future pathogen surveillance efforts, including pandemic preparation, to allow for a data-driven and timely public health response.

Subjects

COVID-19, SARS-CoV-2, Humans, SARS-CoV-2/genetics, COVID-19/epidemiology, Pandemics, Workflow, Computational Biology

2.

The ATCC genome portal: 3,938 authenticated microbial reference genomes.

Nguyen, Scott V; Puthuveetil, Nikhita P; Petrone, Joseph R; Kirkland, Jade L; Gaffney, Kaitlyn; Tabron, Corina L; Wax, Noah; Duncan, James; King, Stephen; Marlow, Robert; Reese, Amy L; Yarmosh, David A; McConnell, Hannah H; Fernandes, Ana S; Bagnoli, John; Benton, Briana; Jacobs, Jonathan L.

Microbiol Resour Announc ; 13(2): e0104523, 2024 Feb 15.

Article in English | MEDLINE | ID: mdl-38289057

ABSTRACT

The ATCC Genome Portal (AGP, https://genomes.atcc.org/) is a database of authenticated genomes for bacteria, fungi, protists, and viruses held in ATCC's biorepository. It now includes 3,938 assemblies (253% increase) produced under ISO 9000 by ATCC. Here, we present new features and content added to the AGP for the research community.

3.

Development and Optimization of an Unbiased, Metagenomics-Based Pathogen Detection Workflow for Infectious Disease and Biosurveillance Applications.

Parker, Kyle; Wood, Hillary; Russell, Joseph A; Yarmosh, David; Shteyman, Alan; Bagnoli, John; Knight, Brittany; Aspinwall, Jacob R; Jacobs, Jonathan; Werking, Kristine; Winegar, Richard.

Trop Med Infect Dis ; 8(2), 2023 Feb 15.

Article in English | MEDLINE | ID: mdl-36828537

ABSTRACT

Rapid, specific, and sensitive identification of microbial pathogens is critical to infectious disease diagnosis and surveillance. Classical culture-based methods can be applied to a broad range of pathogens but have long turnaround times. Molecular methods, such as PCR, are time-effective but are not comprehensive and may not detect novel strains. Metagenomic shotgun next-generation sequencing (NGS) promises specific identification and characterization of any pathogen (viruses, bacteria, fungi, and protozoa) in a less biased way. Despite its great potential, NGS has yet to be widely adopted by clinical microbiology laboratories due in part to the absence of standardized workflows. Here, we describe a sample-to-answer workflow called PanGIA (Pan-Genomics for Infectious Agents) that includes simplified, standardized wet-lab procedures and data analysis with an easy-to-use bioinformatics tool. PanGIA is an end-to-end, multi-use workflow that can be used for pathogen detection and related applications, such as biosurveillance and biothreat detection. We performed a comprehensive survey and assessment of current, commercially available wet-lab technologies and open-source bioinformatics tools for each workflow component. The workflow includes total nucleic acid extraction from clinical human whole blood and environmental microbial forensic swabs as sample inputs, host nucleic acid depletion, dual DNA and RNA library preparation, shotgun sequencing on an Illumina MiSeq, and sequencing data analysis. The PanGIA workflow can be completed within 24 h and is currently compatible with bacteria and viruses. Here, we present data from the development and application of the clinical and environmental workflows, enabling the specific detection of pathogens associated with bloodstream infections and environmental biosurveillance, without the need for targeted assay development.

4.

Towards increased accuracy and reproducibility in SARS-CoV-2 next generation sequence analysis for public health surveillance.

Connor, Ryan; Yarmosh, David A; Maier, Wolfgang; Shakya, Migun; Martin, Ross; Bradford, Rebecca; Brister, J Rodney; Chain, Patrick S G; Copeland, Courtney A; di Iulio, Julia; Hu, Bin; Ebert, Philip; Gunti, Jonathan; Jin, Yumi; Katz, Kenneth S; Kochergin, Andrey; LaRosa, Tré; Li, Jiani; Li, Po-E; Lo, Chien-Chi; Rashid, Sujatha; Maiorova, Evguenia S; Xiao, Chunlin; Zalunin, Vadim; Pruitt, Kim D.

bioRxiv ; 2022 Nov 03.

Article in English | MEDLINE | ID: mdl-36380755

ABSTRACT

During the COVID-19 pandemic, SARS-CoV-2 surveillance efforts integrated genome sequencing of clinical samples to identify emergent viral variants and to support rapid experimental examination of genome-informed vaccine and therapeutic designs. Given the broad range of methods applied to generate new viral genomes, it is critical that consensus and variant calling tools yield consistent results across disparate pipelines. Here we examine the impact of sequencing technologies (Illumina and Oxford Nanopore) and 7 different downstream bioinformatic protocols on SARS-CoV-2 variant calling as part of the NIH Accelerating COVID-19 Therapeutic Interventions and Vaccines (ACTIV) Tracking Resistance and Coronavirus Evolution (TRACE) initiative, a public-private partnership established to address the COVID-19 outbreak. Our results indicate that bioinformatic workflows can yield consensus genomes with different single nucleotide polymorphisms, insertions, and/or deletions even when using the same raw sequence input datasets. We introduce the use of a specific suite of parameters and protocols that greatly improves the agreement among pipelines developed by diverse organizations. Such consistency among bioinformatic pipelines is fundamental to SARS-CoV-2 and future pathogen surveillance efforts. The application of analysis standards is necessary to more accurately document phylogenomic trends and support data-driven public health responses.

5.

Comparative Analysis and Data Provenance for 1,113 Bacterial Genome Assemblies.

Yarmosh, David A; Lopera, Juan G; Puthuveetil, Nikhita P; Combs, Patrick Ford; Reese, Amy L; Tabron, Corina; Pierola, Amanda E; Duncan, James; Greenfield, Samuel R; Marlow, Robert; King, Stephen; Riojas, Marco A; Bagnoli, John; Benton, Briana; Jacobs, Jonathan L.

mSphere ; 7(3): e0007722, 2022 Jun 29.

Article in English | MEDLINE | ID: mdl-35491842

ABSTRACT

The availability of public genomics data has become essential for modern life sciences research, yet the quality, traceability, and curation of these data have significant impacts on a broad range of microbial genomics research. While microbial genome databases such as NCBI's RefSeq database leverage the scalability of crowd sourcing for growth, genomics data provenance and authenticity of the source materials used to produce data are not strict requirements. Here, we describe the de novo assembly of 1,113 bacterial genome references produced from authenticated materials sourced from the American Type Culture Collection (ATCC), each with full genomics data provenance relating to bioinformatics methods, quality control, and passage history. Comparative genomics analysis of ATCC standard reference genomes (ASRGs) revealed significant issues with regard to NCBI's RefSeq bacterial genome assemblies related to completeness, mutations, structure, strain metadata, and gaps in traceability to the original biological source materials. Nearly half of RefSeq assemblies lack details on sample source information, sequencing technology, or bioinformatics methods. Deep curation of these records is not within the scope of NCBI's core mission in supporting open science, which aims to collect sequence records that are submitted by the public. Nonetheless, we propose that gaps in metadata accuracy and data provenance represent an "elephant in the room" for microbial genomics research. Effectively addressing these issues will require raising the level of accountability for data depositors and acknowledging the need for higher expectations of quality among the researchers whose research depends on accurate and attributable reference genome data.

IMPORTANCE The traceability of microbial genomics data to authenticated physical biological materials is not a requirement for depositing these data into public genome databases. This creates significant risks for the reliability and data provenance of these important genomics research resources, the impact of which is not well understood. We sought to investigate this by carrying out a comparative genomics study of 1,113 ATCC standard reference genomes (ASRGs) produced by ATCC from authenticated and traceable materials using the latest sequencing technologies. We found widespread discrepancies in genome assembly quality, genetic variability, and the quality and completeness of the associated metadata among hundreds of reference genomes for ATCC strains found in NCBI's RefSeq database. We present a comparative analysis of de novo-assembled ASRGs, their respective metadata, and variant analysis using RefSeq genomes as a reference. Although assembly quality in RefSeq has generally improved over time, we found that significant quality issues remain, especially as related to genomic data and metadata provenance. Our work highlights the importance of data authentication and provenance for the microbial genomics community, and underscores the risks of ignoring this issue in the future.

Subjects

Genetic Databases, Genomics, Bacterial Genome, Microbial Genome, Reproducibility of Results

6.

Propagation of SARS-CoV-2 in Calu-3 Cells to Eliminate Mutations in the Furin Cleavage Site of Spike.

Baczenas, John James; Andersen, Hanne; Rashid, Sujatha; Yarmosh, David; Puthuveetil, Nikhita; Parker, Michael; Bradford, Rebecca; Florence, Clint; Stemple, Kimberly J; Lewis, Mark G; O'Connor, Shelby L.

Viruses ; 13(12), 2021 Dec 04.

Article in English | MEDLINE | ID: mdl-34960703

ABSTRACT

SARS-CoV-2 pathogenesis, vaccine, and therapeutic studies rely on the use of animals challenged with highly pathogenic virus stocks produced in cell cultures. Ideally, these virus stocks should be genetically and functionally similar to the original clinical isolate, retaining wild-type properties to be reliably used in animal model studies. It is well-established that SARS-CoV-2 isolates serially passaged on Vero cell lines accumulate mutations and deletions in the furin cleavage site; however, these can be eliminated when passaged on Calu-3 lung epithelial cell lines, as presented in this study. As numerous stocks of SARS-CoV-2 variants of concern are being grown in cell cultures with the intent for use in animal models, it is essential that propagation methods generate virus stocks that are pathogenic in vivo. Here, we found that the propagation of a B.1.351 SARS-CoV-2 stock on Calu-3 cells eliminated viruses that previously accumulated mutations in the furin cleavage site. Notably, there were alternative variants that accumulated at the same nucleotide positions in virus populations grown on Calu-3 cells at multiple independent facilities. When a Calu-3-derived B.1.351 virus stock was used to infect hamsters, the virus remained pathogenic and the Calu-3-specific variants persisted in the population. These results suggest that Calu-3-derived virus stocks are pathogenic but care should still be taken to evaluate virus stocks for newly arising mutations during propagation.

Subjects

SARS-CoV-2/growth & development, Serial Passage/methods, Coronavirus Spike Glycoprotein/genetics, Animals, COVID-19/virology, Tumor Cell Line, Chlorocebus aethiops, Cricetinae, Furin/metabolism, Humans, Mutation, SARS-CoV-2/genetics, SARS-CoV-2/pathogenicity, Vero Cells

7.

The ATCC Genome Portal: Microbial Genome Reference Standards with Data Provenance.

Benton, Briana; King, Stephen; Greenfield, Samuel R; Puthuveetil, Nikhita; Reese, Amy L; Duncan, James; Marlow, Robert; Tabron, Corina; Pierola, Amanda E; Yarmosh, David A; Combs, Patrick Ford; Riojas, Marco A; Bagnoli, John; Jacobs, Jonathan L.

Microbiol Resour Announc ; 10(47): e0081821, 2021 Nov 24.

Article in English | MEDLINE | ID: mdl-34817215

ABSTRACT

Lack of data provenance negatively impacts scientific reproducibility and the reliability of genomic data. The ATCC Genome Portal (https://genomes.atcc.org) addresses this by providing data provenance information for microbial whole-genome assemblies originating from authenticated biological materials. To date, we have sequenced 1,579 complete genomes, including 466 type strains and 1,156 novel genomes.
