2.
Nat Microbiol ; 7(1): 108-119, 2022 Jan.
Article in English | MEDLINE | ID: mdl-34907347

ABSTRACT

The global spread and continued evolution of SARS-CoV-2 has driven an unprecedented surge in viral genomic surveillance. Amplicon-based sequencing methods provide a sensitive, low-cost and rapid approach but suffer a high potential for contamination, which can undermine laboratory processes and results. This challenge will increase with the expanding global production of sequences across a variety of laboratories for epidemiological and clinical interpretation, as well as for genomic surveillance of emerging diseases in future outbreaks. We present SDSI + AmpSeq, an approach that uses 96 synthetic DNA spike-ins (SDSIs) to track samples and detect inter-sample contamination throughout the sequencing workflow. We apply SDSIs to the ARTIC Consortium's amplicon design, demonstrate their utility and efficiency in a real-time investigation of a suspected hospital cluster of SARS-CoV-2 cases and validate them across 6,676 diagnostic samples at multiple laboratories. We establish that SDSI + AmpSeq provides increased confidence in genomic data by detecting and correcting for relatively common, yet previously unobserved modes of error, including spillover and sample swaps, without impacting genome recovery.
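
The tracking idea in this abstract can be illustrated with a small sketch. It assumes each sample receives one known spike-in and that reads per spike-in have already been counted. The function name, the input layout, and the 1% threshold are hypothetical and are not taken from the paper.

```python
def check_spike_ins(expected, counts, max_foreign_fraction=0.01):
    """Classify one sample from its spike-in read counts.

    expected: ID of the spike-in added to this sample (hypothetical naming).
    counts: dict mapping spike-in ID -> reads observed in this sample.
    max_foreign_fraction: illustrative threshold, not a published value.
    """
    total = sum(counts.values())
    if total == 0:
        return "no spike-in reads"
    # The dominant spike-in should be the one that was added to the sample.
    top = max(counts, key=counts.get)
    if top != expected:
        return "possible sample swap"
    # Reads from any other spike-in suggest carry-over from another sample.
    foreign = total - counts.get(expected, 0)
    if foreign / total > max_foreign_fraction:
        return "possible contamination"
    return "ok"
```

A real workflow would also need to account for sequencing depth and index hopping; this sketch only shows the basic comparison of expected and observed spike-ins.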


Subjects
DNA Primers/standards , SARS-CoV-2/genetics , Sequence Analysis/standards , COVID-19/diagnosis , DNA Primers/chemical synthesis , Genome, Viral/genetics , Humans , Quality Control , RNA, Viral/genetics , Reproducibility of Results , Sequence Analysis/methods , Whole Genome Sequencing , Workflow
3.
BMC Genomics ; 21(1): 863, 2020 Dec 04.
Article in English | MEDLINE | ID: mdl-33276717

ABSTRACT

BACKGROUND: The global COVID-19 pandemic has led to an urgent need for scalable methods for clinical diagnostics and viral tracking. Next generation sequencing technologies have enabled large-scale genomic surveillance of SARS-CoV-2 as thousands of isolates are being sequenced around the world and deposited in public data repositories. A number of methods using both short- and long-read technologies are currently being applied for SARS-CoV-2 sequencing, including amplicon approaches, metagenomic methods, and sequence capture or enrichment methods. Given the small genome size, the ability to sequence SARS-CoV-2 at scale is limited by the cost and labor associated with making sequencing libraries. RESULTS: Here we describe a low-cost, streamlined, all amplicon-based method for sequencing SARS-CoV-2, which bypasses costly and time-consuming library preparation steps. We benchmark this tailed amplicon method against both the ARTIC amplicon protocol and sequence capture approaches and show that an optimized tailed amplicon approach achieves comparable amplicon balance, coverage metrics, and variant calls to the ARTIC v3 approach. CONCLUSIONS: The tailed amplicon method we describe represents a cost-effective and highly scalable method for SARS-CoV-2 sequencing.


Subjects
COVID-19 Nucleic Acid Testing/methods , COVID-19/virology , Genome, Viral/genetics , SARS-CoV-2/genetics , Benchmarking , COVID-19/diagnosis , COVID-19/epidemiology , COVID-19 Nucleic Acid Testing/standards , Humans , Molecular Epidemiology , Mutation , RNA, Viral/genetics , SARS-CoV-2/isolation & purification , Sequence Analysis/methods , Sequence Analysis/standards
4.
Gigascience ; 9(3), 2020 Mar 01.
Article in English | MEDLINE | ID: mdl-32170312

ABSTRACT

BACKGROUND: Over the past few years the variety of experimental designs and protocols for sequencing experiments increased greatly. To ensure the wide usability of the produced data beyond an individual project, rich and systematic annotation of the underlying experiments is crucial. FINDINGS: We first developed an annotation structure that captures the overall experimental design as well as the relevant details of the steps from the biological sample to the library preparation, the sequencing procedure, and the sequencing and processed files. Through various design features, such as controlled vocabularies and different field requirements, we ensured a high annotation quality, comparability, and ease of annotation. The structure can be easily adapted to a large variety of species. We then implemented the annotation strategy in a user-hosted web platform with data import, query, and export functionality. CONCLUSIONS: We present here an annotation structure and user-hosted platform for sequencing experiment data, suitable for lab-internal documentation, collaborations, and large-scale annotation efforts.


Subjects
Molecular Sequence Annotation/methods , Sequence Analysis/methods , Software , Molecular Sequence Annotation/standards , Sequence Analysis/standards
5.
Genes (Basel) ; 10(9), 2019 Aug 28.
Article in English | MEDLINE | ID: mdl-31466373

ABSTRACT

Shotgun metagenomics using next generation sequencing (NGS) is a promising technique to analyze both DNA and RNA microbial material from patient samples. Mostly used in a research setting, it is now increasingly being used in the clinical realm as well, notably to support diagnosis of viral infections, thereby calling for quality control and the implementation of ring trials (RT) to benchmark pipelines and ensure comparable results. The Swiss NGS clinical virology community therefore decided to conduct a RT in 2018, in order to benchmark current metagenomic workflows used at Swiss clinical virology laboratories, and thereby contribute to the definition of common best practices. The RT consisted of two parts (increments), in order to disentangle the variability arising from the experimental compared to the bioinformatics parts of the laboratory pipeline. In addition, the RT was also designed to assess the impact of databases compared to bioinformatics algorithms on the final results, by asking participants to perform the bioinformatics analysis with a common database, in addition to using their own in-house database. Five laboratories participated in the RT (seven pipelines were tested). We observed that the algorithms had a stronger impact on the overall performance than the choice of the reference database. Our results also suggest that differences in sample preparation can lead to significant differences in the performance, and that laboratories should aim for at least 5-10 Mio reads per sample and use depth of coverage in addition to other interpretation metrics such as the percent of coverage. Performance was generally lower when increasing the number of viruses per sample. The lessons learned from this pilot study will be useful for the development of larger-scale RTs to serve as regular quality control tests for laboratories performing NGS analyses of viruses in a clinical setting.
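
The two interpretation metrics named in this abstract, depth of coverage and percent of coverage, can be computed from per-position read depths. The sketch below assumes those depths are already available as a list; the function name and the `min_depth` parameter are illustrative choices, not part of the ring trial.

```python
def coverage_metrics(depths, min_depth=1):
    """Return (mean depth, percent of positions covered).

    depths: per-position read depths along one reference genome.
    min_depth: depth at which a position counts as covered (assumed default).
    """
    if not depths:
        raise ValueError("depths must not be empty")
    mean_depth = sum(depths) / len(depths)
    covered = sum(1 for d in depths if d >= min_depth)
    percent_covered = 100.0 * covered / len(depths)
    return mean_depth, percent_covered
```

For example, `coverage_metrics([0, 0, 10, 30])` gives a mean depth of 10.0 with only 50.0% of positions covered, which is why the two metrics are best read together.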


Subjects
Clinical Laboratory Services/standards , Genome, Viral , Laboratory Proficiency Testing/methods , Metagenome , Metagenomics/standards , Sequence Analysis/standards , Genome, Human , Humans , Metagenomics/methods , Sequence Analysis/methods , Switzerland
6.
Plant Dis ; 103(9): 2199-2203, 2019 Sep.
Article in English | MEDLINE | ID: mdl-31322493

ABSTRACT

Viral diseases are a limiting factor to wheat production. Viruses are difficult to diagnose in the early stages of disease development and are often confused with nutrient deficiencies or other abiotic problems. Immunological methods are useful to identify viruses, but specific antibodies may not be available or require high virus titer for detection. In 2015 and 2017, wheat plants containing Wheat streak mosaic virus (WSMV) resistance gene, Wsm2, were found to have symptoms characteristic of WSMV. Serologically, WSMV was detected in all four samples. Additionally, High Plains wheat mosaic virus (HPWMoV) was also detected in one of the samples. Barley yellow dwarf virus (BYDV) was not detected, and a detection kit was not readily available for Triticum mosaic virus (TriMV). Initially, cDNA cloning and Sanger sequencing were used to determine the presence of WSMV; however, the process was time-consuming and expensive. Subsequently, cDNA from infected wheat tissue was sequenced with single-strand, Oxford Nanopore sequencing technology (ONT). ONT was able to confirm the presence of WSMV. Additionally, TriMV was found in all of the samples and BYDV in three of the samples. Deep coverage sequencing of full-length, single-strand WSMV revealed variation compared with the WSMV Sidney-81 reference strain and may represent new variants which overcome Wsm2. These results demonstrate that ONT can more accurately identify causal virus agents and has sufficient resolution to provide evidence of causal variants.


Subjects
Plant Diseases , Plant Viruses , Sequence Analysis , Triticum , Bunyaviridae/classification , Bunyaviridae/genetics , Luteovirus/classification , Luteovirus/genetics , Nanopores , Plant Diseases/virology , Plant Viruses/classification , Plant Viruses/genetics , Potyviridae/classification , Potyviridae/genetics , Sequence Analysis/standards , Triticum/virology
7.
Breast ; 45: 29-35, 2019 Jun.
Article in English | MEDLINE | ID: mdl-30822622

ABSTRACT

Multigene panel testing for breast and ovarian cancer predisposition diagnosis is a useful tool as it makes it possible to sequence a considerable number of genes in a large number of individuals. More than 200 different multigene panels in which the two major BRCA1 and BRCA2 breast cancer predisposing genes are included are proposed by public or commercial laboratories. We review the clinical validity and clinical utility of the 26 genes most often included in these panels. Because clinical validity and utility are not established for all genes and due to the heterogeneity of tumour risk levels, there is a substantial difficulty in the routine use of multigene panels if management guidelines and recommendations for testing relatives are not previously defined for each gene. Besides, the classification of variants of unknown significance (VUS) is a particular limitation and challenge. Efforts to classify VUSs and also to identify factors that modify cancer risks are now needed to produce personalised risk estimates. The complexity of information, the capacity to come back to patients when VUSs are reclassified as pathogenic, and the expected large increase in the number of individuals to be tested, especially when the aim of multigene panel testing is not only prevention but also treatment, are challenging both for physicians and patients. Quality of tests, interpretation of results, information and accompaniment of patients must be at the heart of the guidelines of multigene panel testing.


Subjects
Breast Neoplasms/genetics , Early Detection of Cancer/standards , Genetic Predisposition to Disease , Genetic Testing/standards , Sequence Analysis/standards , Biomarkers, Tumor/genetics , Early Detection of Cancer/methods , Female , Genes, BRCA1 , Genes, BRCA2 , Genetic Testing/methods , Genetic Variation , Humans , Ovarian Neoplasms/genetics , Reproducibility of Results , Sequence Analysis/methods
8.
Nucleic Acids Res ; 47(D1): D721-D728, 2019 Jan 08.
Article in English | MEDLINE | ID: mdl-30289549

ABSTRACT

One of the most fundamental questions in biology is what types of cells form different tissues and organs in a functionally coordinated fashion. Larger-scale single-cell sequencing and biology experiment studies are now rapidly opening up new ways to tackle this question by revealing substantial cell markers for distinguishing different cell types in tissues. Here, we developed the CellMarker database (http://biocc.hrbmu.edu.cn/CellMarker/ or http://bio-bigdata.hrbmu.edu.cn/CellMarker/), aiming to provide a comprehensive and accurate resource of cell markers for various cell types in tissues of human and mouse. By manually curating over 100 000 published papers, 4124 entries including the cell marker information, tissue type, cell type, cancer information and source, were recorded. In total, 13 605 cell markers of 467 cell types in 158 human tissues/sub-tissues and 9148 cell markers of 389 cell types in 81 mouse tissues/sub-tissues were collected and deposited in CellMarker. CellMarker provides a user-friendly interface for browsing, searching and downloading markers of diverse cell types of different tissues. Furthermore, a summarized marker prevalence in each cell type is graphically and intuitively presented through a vivid statistical graph. We believe that CellMarker is a comprehensive and valuable resource for cell research in precisely identifying and characterizing cells, especially at the single-cell level.


Subjects
Databases, Genetic , Sequence Analysis/methods , Single-Cell Analysis/methods , Software , Animals , Humans , Mice , Sequence Analysis/standards , Single-Cell Analysis/standards
9.
Gigascience ; 6(8): 1-11, 2017 Aug 01.
Article in English | MEDLINE | ID: mdl-28637310

ABSTRACT

Metagenomics data analyses from independent studies can only be compared if the analysis workflows are described in a harmonized way. In this overview, we have mapped the landscape of data standards available for the description of essential steps in metagenomics: (i) material sampling, (ii) material sequencing, (iii) data analysis, and (iv) data archiving and publishing. Taking examples from marine research, we summarize essential variables used to describe material sampling processes and sequencing procedures in a metagenomics experiment. These aspects of metagenomics dataset generation have been to some extent addressed by the scientific community, but greater awareness and adoption is still needed. We emphasize the lack of standards relating to reporting how metagenomics datasets are analysed and how the metagenomics data analysis outputs should be archived and published. We propose best practice as a foundation for a community standard to enable reproducibility and better sharing of metagenomics datasets, leading ultimately to greater metagenomics data reuse and repurposing.


Subjects
Computational Biology/methods , Computational Biology/standards , Metagenomics/methods , Metagenomics/standards , Data Mining/methods , Data Mining/standards , Databases, Genetic , Metagenome , Sequence Analysis/methods , Sequence Analysis/standards , Workflow
11.
ACS Synth Biol ; 5(6): 449-51, 2016 Jun 17.
Article in English | MEDLINE | ID: mdl-27267452

ABSTRACT

Research is communicated more effectively and reproducibly when articles depict genetic designs consistently and fully disclose the complete sequences of all reported constructs. ACS Synthetic Biology is now providing authors with updated guidance and piloting a new tool and publication workflow that facilitate compliance with these recommended practices and standards for visual representation and data exchange.


Subjects
Genetics/standards , Publishing/standards , Research/standards , Sequence Analysis/standards , Synthetic Biology/standards , Humans , Workflow
12.
Methods Mol Biol ; 1418: 3-17, 2016.
Article in English | MEDLINE | ID: mdl-27008007

ABSTRACT

A next-generation sequencing experiment can generate billions of short reads for each sample, and processing of the raw reads adds more information. Various file formats have been introduced or developed in order to store and manipulate this information. This chapter presents an overview of the file formats, including FASTQ, FASTA, SAM/BAM, GFF/GTF, BED, and VCF, that are commonly used in the analysis of next-generation sequencing data.
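
As an illustration of the first format listed, the sketch below reads FASTQ records. It assumes the common four-lines-per-record layout and Phred+33 quality encoding; the function name is hypothetical and the code is not taken from the chapter.

```python
def parse_fastq(lines):
    """Yield (read_id, sequence, qualities) from an iterable of FASTQ lines.

    Assumes four lines per record and Phred+33 quality encoding.
    """
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        seq = next(it, None)
        plus = next(it, None)
        qual = next(it, None)
        if seq is None or plus is None or qual is None:
            raise ValueError("truncated FASTQ record")
        seq, plus, qual = seq.rstrip("\n"), plus.rstrip("\n"), qual.rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record")
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ")
        # Phred+33: quality score = ASCII code minus 33.
        yield header[1:], seq, [ord(c) - 33 for c in qual]
```

The function accepts any iterable of lines, so an open file handle can be passed directly.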


Subjects
Molecular Sequence Data , Sequence Analysis/methods , Sequence Analysis/standards , Computational Biology/methods , Computational Biology/standards , Genomics/methods , Sequence Alignment/methods , Sequence Alignment/standards
13.
Methods Mol Biol ; 1418: 39-66, 2016.
Article in English | MEDLINE | ID: mdl-27008009

ABSTRACT

Once a biochemical method has been devised to sample RNA or DNA of interest, sequencing can be used to identify the sampled molecules with high fidelity and low bias. High-throughput sequencing has therefore become the primary data acquisition method for many genomics studies and is being used more and more to address molecular biology questions. By applying principles of statistical experimental design, sequencing experiments can be made more sensitive to the effects under study as well as more biologically sound, hence more replicable.


Subjects
High-Throughput Nucleotide Sequencing , Research Design , Sequence Analysis , Animals , High-Throughput Nucleotide Sequencing/methods , High-Throughput Nucleotide Sequencing/standards , Humans , Sequence Analysis/methods , Sequence Analysis/standards , Sequence Analysis, DNA/methods , Sequence Analysis, DNA/standards , Sequence Analysis, RNA/methods , Sequence Analysis, RNA/standards
14.
Genome Med ; 7: 121, 2015 Nov 20.
Article in English | MEDLINE | ID: mdl-26589402

ABSTRACT

High-throughput sequencing of B-cell immunoglobulin repertoires is increasingly being applied to gain insights into the adaptive immune response in healthy individuals and in those with a wide range of diseases. Recent applications include the study of autoimmunity, infection, allergy, cancer and aging. As sequencing technologies continue to improve, these repertoire sequencing experiments are producing ever larger datasets, with tens- to hundreds-of-millions of sequences. These data require specialized bioinformatics pipelines to be analyzed effectively. Numerous methods and tools have been developed to handle different steps of the analysis, and integrated software suites have recently been made available. However, the field has yet to converge on a standard pipeline for data processing and analysis. Common file formats for data sharing are also lacking. Here we provide a set of practical guidelines for B-cell receptor repertoire sequencing analysis, starting from raw sequencing reads and proceeding through pre-processing, determination of population structure, and analysis of repertoire properties. These include methods for unique molecular identifiers and sequencing error correction, V(D)J assignment and detection of novel alleles, clonal assignment, lineage tree construction, somatic hypermutation modeling, selection analysis, and analysis of stereotyped or convergent responses. The guidelines presented here highlight the major steps involved in the analysis of B-cell repertoire sequencing data, along with recommendations on how to avoid common pitfalls.


Subjects
Receptors, Antigen, B-Cell/genetics , Sequence Analysis/methods , Sequence Analysis/standards , Computational Biology/methods , Computational Biology/standards , Guidelines as Topic , High-Throughput Nucleotide Sequencing/methods , High-Throughput Nucleotide Sequencing/standards , Humans
15.
PLoS One ; 10(3): e0119123, 2015.
Article in English | MEDLINE | ID: mdl-25741706

ABSTRACT

Next generation sequencing technologies, like ultra-deep pyrosequencing (UDPS), allow detailed investigation of complex populations, like RNA viruses, but their utility is limited by errors introduced during sample preparation and sequencing. By tagging each individual cDNA molecule with barcodes, referred to as Primer IDs, before PCR and sequencing, these errors could theoretically be removed. Here we evaluated the Primer ID methodology on 257,846 UDPS reads generated from a HIV-1 SG3Δenv plasmid clone and plasma samples from three HIV-infected patients. The Primer ID consisted of 11 randomized nucleotides, 4,194,304 combinations, in the primer for cDNA synthesis that introduced a unique sequence tag into each cDNA molecule. Consensus template sequences were constructed for reads with Primer IDs that were observed three or more times. Despite high numbers of input template molecules, the number of consensus template sequences was low. With 10,000 input molecules for the clone, as few as 97 consensus template sequences were obtained due to highly skewed frequency of resampling. Furthermore, the number of sequenced templates was overestimated due to PCR errors in the Primer IDs. Finally, some consensus template sequences were erroneous due to hotspots for UDPS errors. The Primer ID methodology has the potential to provide highly accurate deep sequencing. However, it is important to be aware that there are remaining challenges with the methodology. In particular, it is important to find ways to obtain a more even frequency of resampling of template molecules as well as to identify and remove artefactual consensus template sequences that have been generated by PCR errors in the Primer IDs.
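
The arithmetic and the consensus step described in this abstract can be sketched as follows. The 4,194,304 figure is 4 to the power of 11. The consensus function is a simplified per-column majority vote that assumes reads sharing a Primer ID are already aligned to equal length; the function names are hypothetical and this is not the authors' pipeline.

```python
from collections import Counter, defaultdict


def primer_id_space(length=11):
    """Number of distinct Primer IDs for a given count of random nucleotides."""
    return 4 ** length


def consensus_by_primer_id(reads, min_reads=3):
    """Build one consensus sequence per Primer ID seen at least min_reads times.

    reads: iterable of (primer_id, sequence) pairs; sequences sharing a
    Primer ID are assumed to be aligned and of equal length.
    """
    groups = defaultdict(list)
    for primer_id, seq in reads:
        groups[primer_id].append(seq)
    consensus = {}
    for primer_id, seqs in groups.items():
        if len(seqs) < min_reads:
            continue  # too few resampled reads to call a consensus
        # Majority base per column; ties go to the base seen first.
        consensus[primer_id] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus
```

This sketch does not address the problems the abstract reports, such as PCR errors inside the Primer ID itself, which would split one template across several groups.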


Subjects
Sequence Analysis/methods , Base Sequence , DNA Primers , HIV-1/genetics , Molecular Sequence Data , Polymerase Chain Reaction , Sequence Analysis/standards , Sequence Homology, Nucleic Acid
16.
PLoS One ; 9(8): e104579, 2014.
Article in English | MEDLINE | ID: mdl-25144537

ABSTRACT

The wide availability of whole-genome sequencing (WGS) and an abundance of open-source software have made detection of single-nucleotide polymorphisms (SNPs) in bacterial genomes an increasingly accessible and effective tool for comparative analyses. Thus, ensuring that real nucleotide differences between genomes (i.e., true SNPs) are detected at high rates and that the influences of errors (such as false positive SNPs, ambiguously called sites, and gaps) are mitigated is of utmost importance. The choices researchers make regarding the generation and analysis of WGS data can greatly influence the accuracy of short-read sequence alignments and, therefore, the efficacy of such experiments. We studied the effects of some of these choices, including: i) depth of sequencing coverage, ii) choice of reference-guided short-read sequence assembler, iii) choice of reference genome, and iv) whether to perform read-quality filtering and trimming, on our ability to detect true SNPs and on the frequencies of errors. We performed benchmarking experiments, during which we assembled simulated and real Listeria monocytogenes strain 08-5578 short-read sequence datasets of varying quality with four commonly used assemblers (BWA, MOSAIK, Novoalign, and SMALT), using reference genomes of varying genetic distances, and with or without read pre-processing (i.e., quality filtering and trimming). We found that assemblies of at least 50-fold coverage provided the most accurate results. In addition, MOSAIK yielded the fewest errors when reads were aligned to a nearly identical reference genome, while using SMALT to align reads against a reference sequence that is ∼0.82% distant from 08-5578 at the nucleotide level resulted in the detection of the greatest numbers of true SNPs and the fewest errors. Finally, we show that whether read pre-processing improves SNP detection depends upon the choice of reference sequence and assembler. 
In total, this study demonstrates that researchers should test a variety of conditions to achieve optimal results.
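
The notion of fold coverage used in this abstract follows the usual estimate of reads times read length divided by genome size. The sketch below applies that formula; the 3 Mb genome size and 150 bp read length in the usage note are round illustrative figures, not values from the study.

```python
import math


def fold_coverage(num_reads, read_length, genome_size):
    """Expected average coverage: (reads * read length) / genome size."""
    return num_reads * read_length / genome_size


def reads_for_coverage(target_fold, read_length, genome_size):
    """Smallest number of reads expected to reach the target coverage."""
    return math.ceil(target_fold * genome_size / read_length)
```

Under these assumptions, `reads_for_coverage(50, 150, 3_000_000)` suggests about one million reads for 50-fold coverage. Actual coverage is uneven across a genome, so this is a planning estimate only.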


Subjects
Listeria monocytogenes/genetics , Sequence Analysis/standards , Genome, Bacterial/genetics , Polymorphism, Single Nucleotide/genetics , Software
17.
PLoS One ; 9(5): e97038, 2014.
Article in English | MEDLINE | ID: mdl-24810421

ABSTRACT

The bird cherry-oat aphid (Rhopalosiphum padi), an important pest of cereal crops, not only directly sucks sap from plants, but also transmits a number of plant viruses, collectively the yellow dwarf viruses (YDVs). For quantifying changes in gene expression in vector aphids, reverse transcription-quantitative polymerase chain reaction (RT-qPCR) is a touchstone method, but the selection and validation of housekeeping genes (HKGs) as reference genes to normalize the expression level of endogenous genes of the vector and for exogenous genes of the virus in the aphids is critical to obtaining valid results. Such an assessment has not been done, however, for R. padi and YDVs. Here, we tested three algorithms (GeNorm, NormFinder and BestKeeper) to assess the suitability of candidate reference genes (EF-1α, ACT1, GAPDH, 18S rRNA) in 6 combinations of YDV and vector aphid morph. EF-1α and ACT1 together or in combination with GAPDH or with GAPDH and 18S rRNA could confidently be used to normalize virus titre and expression levels of endogenous genes in winged or wingless R. padi infected with Barley yellow dwarf virus isolates (BYDV)-PAV and BYDV-GAV. The use of only one reference gene, whether the most stably expressed (EF-1α) or the least stably expressed (18S rRNA), was not adequate for obtaining valid relative expression data from the RT-qPCR. Because of discrepancies among values for changes in relative expression obtained using 3 regions of the same gene, different regions of an endogenous aphid gene, including each terminus and the middle, should be analyzed at the same time with RT-qPCR. Our results highlight the necessity of choosing the best reference genes to obtain valid experimental data and provide several HKGs for relative quantification of virus titre in YDV-viruliferous aphids.
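
Normalizing against several reference genes, as this abstract recommends, is often done with a delta-delta-Ct calculation. The sketch below uses that standard formula and assumes roughly 100% PCR efficiency. It is not an implementation of GeNorm, NormFinder, or BestKeeper, and the function name is hypothetical.

```python
def relative_expression(ct_target, ct_refs, ct_target_cal, ct_refs_cal):
    """Fold change of a target gene relative to a calibrator sample.

    ct_target, ct_refs: Ct of the target and of each reference gene in the sample.
    ct_target_cal, ct_refs_cal: the same measurements in the calibrator.
    Averaging reference Ct values corresponds to taking the geometric mean
    of the reference quantities. Assumes about 100% amplification efficiency.
    """
    if not ct_refs or not ct_refs_cal:
        raise ValueError("at least one reference gene is required")
    delta_sample = ct_target - sum(ct_refs) / len(ct_refs)
    delta_cal = ct_target_cal - sum(ct_refs_cal) / len(ct_refs_cal)
    return 2.0 ** -(delta_sample - delta_cal)
```

For example, `relative_expression(22.0, [18.0, 20.0], 24.0, [18.0, 20.0])` returns 4.0, a fourfold increase over the calibrator.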


Subjects
Aphids/genetics , Aphids/virology , Gene Expression Profiling/standards , Genes, Insect/genetics , Luteovirus/physiology , Reverse Transcriptase Polymerase Chain Reaction/standards , Sequence Analysis/standards , Algorithms , Animals , Genes, Essential/genetics , Reference Standards , Reproducibility of Results , Viral Load
18.
J Mol Diagn ; 16(3): 283-7, 2014 May.
Article in English | MEDLINE | ID: mdl-24650895

ABSTRACT

This Perspectives article describes methods-based proficiency testing (MBPT), the benefits and limitations of MBPT, why the time is right for MBPT in molecular diagnostics, and how MBPT for next-generation sequencing is being developed by the College of American Pathologists.


Subjects
Clinical Laboratory Services , Genetic Testing/methods , Laboratory Proficiency Testing/methods , Clinical Laboratory Services/standards , Gene Library , Genetic Testing/standards , Humans , Laboratory Proficiency Testing/standards , Sequence Analysis/methods , Sequence Analysis/standards , Validation Studies as Topic , Workflow
20.
BMC Evol Biol ; 13: 161, 2013 Aug 01.
Article in English | MEDLINE | ID: mdl-23914788

ABSTRACT

The intention of this editorial is to steer researchers through methodological choices in molecular evolution, drawing on the combined expertise of the authors. Our aim is not to review the most advanced methods for a specific task. Rather, we define several general guidelines to help with methodology choices at different stages of a typical phylogenetic 'pipeline'. We are not able to provide exhaustive citation of a literature that is vast and plentiful, but we point the reader to a set of classical textbooks that reflect the state-of-the-art. We do not wish to appear overly critical of outdated methodology but rather provide some practical guidance on the sort of issues which should be considered. We stress that a reported study should be well-motivated and evaluate a specific hypothesis or scientific question. However, a publishable study should not be merely a compilation of available sequences for a protein family of interest followed by some standard analyses, unless it specifically addresses a scientific hypothesis or question. The rapid pace at which sequence data accumulate quickly outdates such publications. Clearly, though, discoveries stemming from data mining, reports of new tools and databases, and review papers are also desirable.


Subjects
Classification/methods , Phylogeny , Genetics, Population , Sequence Analysis/standards