1.
Brief Bioinform ; 25(3), 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38747283

ABSTRACT

The analysis and comparison of gene neighborhoods is a powerful approach for exploring microbial genome structure, function, and evolution. Although numerous tools exist for genome visualization and comparison, genome exploration across large genomic databases or user-generated datasets remains a challenge. Here, we introduce AnnoView, a web server designed for interactive exploration of gene neighborhoods across the bacterial and archaeal tree of life. Our server offers users the ability to identify, compare, and visualize gene neighborhoods of interest from 30 238 bacterial genomes and 1672 archaeal genomes, through integration with the comprehensive Genome Taxonomy Database and AnnoTree databases. Identified gene neighborhoods can be visualized using pre-computed functional annotations from different sources such as KEGG, Pfam and TIGRFAM, or clustered based on similarity. Alternatively, users can upload and explore their own custom genomic datasets in GBK, GFF or CSV format, or use AnnoView as a genome browser for relatively small genomes (e.g. viruses and plasmids). Ultimately, we anticipate that AnnoView will catalyze biological discovery by enabling user-friendly search, comparison, and visualization of genomic data. AnnoView is available at http://annoview.uwaterloo.ca.


Subject(s)
Software , Databases, Genetic , Genome, Bacterial , Genome, Archaeal , Genomics/methods , Archaea/genetics , Genes, Microbial/genetics , Computational Biology/methods , Bacteria/genetics , Bacteria/classification
2.
Microb Genom ; 10(5), 2024 May.
Article in English | MEDLINE | ID: mdl-38809778

ABSTRACT

The Genome Taxonomy Database (GTDB) provides a species to domain classification of publicly available genomes based on average nucleotide identity (ANI) (for species) and a concatenated gene phylogeny normalized by evolutionary rates (for genus to phylum), which has been widely adopted by the scientific community. Here, we use the Genome UNClutterer (GUNC) software to identify putatively contaminated genomes in GTDB release 07-RS207. We found that GUNC reported 35,723 genomes as putatively contaminated, comprising 11.25 % of the 317,542 genomes in GTDB release 07-RS207. To assess the impact of this high level of inferred contamination on the delineation of taxa, we created 'clean' versions of the 34,846 putatively contaminated bacterial genomes by removing the most contaminated half. For each clean half, we re-calculated the ANI and concatenated gene phylogeny and found that only 77 (0.22 %) of the genomes were not consistent with their original classification. We conclude that the delineation of taxa in GTDB is robust to the putative contamination detected by GUNC.


Subject(s)
Bacteria , Genome, Bacterial , Phylogeny , Bacteria/genetics , Bacteria/classification , Software , Databases, Genetic , DNA Contamination
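The percentages quoted in the abstract above can be re-derived from the genome counts it reports. The snippet below is only an arithmetic check on those published figures; the variable names and the `percent` helper are illustrative assumptions and are not part of GUNC or GTDB.

```python
# Counts quoted in the abstract (GTDB release 07-RS207, GUNC results).
total_genomes = 317_542      # all genomes in the release
flagged_genomes = 35_723     # genomes GUNC reported as putatively contaminated
flagged_bacterial = 34_846   # flagged bacterial genomes that were "cleaned"
reclassified = 77            # cleaned genomes inconsistent with their original classification


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals (hypothetical helper)."""
    return round(100 * part / whole, 2)


flagged_pct = percent(flagged_genomes, total_genomes)
reclassified_pct = percent(reclassified, flagged_bacterial)

print(f"putatively contaminated: {flagged_pct} %")
print(f"classification changed after cleaning: {reclassified_pct} %")
```

Both values agree with the 11.25 % and 0.22 % stated in the abstract.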
5.
FEMS Microbiol Lett ; 370, 2023 01 17.
Article in English | MEDLINE | ID: mdl-37480240

ABSTRACT

The Genome Taxonomy Database (GTDB) is a taxonomic framework that defines prokaryotic taxa as monophyletic groups in concatenated protein reference trees according to systematic criteria. This has resulted in a substantial number of changes to existing classifications (https://gtdb.ecogenomic.org). In the case of union of taxa, GTDB names were applied based on the priority of publication. The division of taxa or change in rank led to the formation of new Latin names above the rank of genus that were only made publicly available via the GTDB website without associated published taxonomic descriptions. This has sometimes led to confusion in the literature and databases. A number of the provisional GTDB names were later published in other studies, while many still lack authorships. To reduce further confusion, here we propose names and descriptions for 329 GTDB-defined prokaryotic taxa, 223 of which are suitable for validation under the International Code of Nomenclature of Prokaryotes (ICNP) and 49 under the Code of Nomenclature of Prokaryotes described from Sequence Data (SeqCode). For the latter, we designated 23 genomes as type material. An additional 57 taxa that do not currently satisfy the validation criteria of either code are proposed as Candidatus.


Subject(s)
Authorship , Prokaryotic Cells , Databases, Factual
6.
Nat Methods ; 20(8): 1203-1212, 2023 08.
Article in English | MEDLINE | ID: mdl-37500759

ABSTRACT

Advances in sequencing technologies and bioinformatics tools have dramatically increased the recovery rate of microbial genomes from metagenomic data. Assessing the quality of metagenome-assembled genomes (MAGs) is a critical step before downstream analysis. Here, we present CheckM2, an improved method of predicting genome quality of MAGs using machine learning. Using synthetic and experimental data, we demonstrate that CheckM2 outperforms existing tools in both accuracy and computational speed. In addition, CheckM2's database can be rapidly updated with new high-quality reference genomes, including taxa represented only by a single genome. We also show that CheckM2 accurately predicts genome quality for MAGs from novel lineages, even for those with reduced genome size (for example, Patescibacteria and the DPANN superphylum). CheckM2 provides accurate genome quality predictions across bacterial and archaeal lineages, giving increased confidence when inferring biological conclusions from MAGs.


Subject(s)
Bacteria , Genome, Microbial , Bacteria/genetics , Metagenome , Metagenomics/methods , Machine Learning
7.
Nat Biotechnol ; 2023 Jul 27.
Article in English | MEDLINE | ID: mdl-37500913

ABSTRACT

Studies using 16S rRNA and shotgun metagenomics typically yield different results, usually attributed to PCR amplification biases. We introduce Greengenes2, a reference tree that unifies genomic and 16S rRNA databases in a consistent, integrated resource. By inserting sequences into a whole-genome phylogeny, we show that 16S rRNA and shotgun metagenomic data generated from the same samples agree in principal coordinates space, taxonomy and phenotype effect size when analyzed with the same tree.

8.
Bioinformatics ; 38(23): 5315-5316, 2022 11 30.
Article in English | MEDLINE | ID: mdl-36218463

ABSTRACT

SUMMARY: The Genome Taxonomy Database (GTDB) and associated taxonomic classification toolkit (GTDB-Tk) have been widely adopted by the microbiology community. However, the growing size of the GTDB bacterial reference tree has resulted in GTDB-Tk requiring substantial amounts of memory (∼320 GB) which limits its adoption and ease of use. Here, we present an update to GTDB-Tk that uses a divide-and-conquer approach where user genomes are initially placed into a bacterial reference tree with family-level representatives followed by placement into an appropriate class-level subtree comprising species representatives. This substantially reduces the memory requirements of GTDB-Tk while having minimal impact on classification. AVAILABILITY AND IMPLEMENTATION: GTDB-Tk is implemented in Python and licenced under the GNU General Public Licence v3.0. Source code and documentation are available at: https://github.com/ecogenomics/gtdbtk. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Documentation , Software
9.
Syst Appl Microbiol ; 45(5): 126305, 2022 Sep.
Article in English | MEDLINE | ID: mdl-36049255

ABSTRACT

Over the last fifteen years, genomics has become fully integrated into prokaryotic systematics. The genomes of most type strains have been sequenced, genome sequence similarity is widely used for delineation of species, and phylogenomic methods are commonly used for classification of higher taxonomic ranks. Additionally, environmental genomics has revealed a vast diversity of as-yet-uncultivated taxa. In response to these developments, a new code of nomenclature, the Code of Nomenclature of Prokaryotes Described from Sequence Data (SeqCode), has been developed over the last two years to allow naming of Archaea and Bacteria using DNA sequences as the nomenclatural types. The SeqCode also allows naming of cultured organisms, including fastidious prokaryotes that cannot be deposited into culture collections. Several simplifications relative to the International Code of Nomenclature of Prokaryotes (ICNP) are implemented to make nomenclature more accessible, easier to apply and more readily communicated. By simplifying nomenclature with the goal of a unified classification, inclusive of both cultured and uncultured taxa, the SeqCode will facilitate the naming of taxa in every biome on Earth, encourage the isolation and characterization of as-yet-uncultivated taxa, and promote synergies between the ecological, environmental, physiological, biochemical, and molecular biological disciplines to more fully describe prokaryotes.


Subject(s)
Archaea , Bacteria , Archaea/genetics , Bacteria/genetics , Base Sequence , Phylogeny , RNA, Ribosomal, 16S
10.
Nat Microbiol ; 7(10): 1702-1708, 2022 10.
Article in English | MEDLINE | ID: mdl-36123442

ABSTRACT

Most prokaryotes are not available as pure cultures and therefore ineligible for naming under the rules and recommendations of the International Code of Nomenclature of Prokaryotes (ICNP). Here we summarize the development of the SeqCode, a code of nomenclature under which genome sequences serve as nomenclatural types. This code enables valid publication of names of prokaryotes based upon isolate genome, metagenome-assembled genome or single-amplified genome sequences. Otherwise, it is similar to the ICNP with regard to the formation of names and rules of priority. It operates through the SeqCode Registry ( https://seqco.de/ ), a registration portal through which names and nomenclatural types are registered, validated and linked to metadata. We describe the two paths currently available within SeqCode to register and validate names, including Candidatus names, and provide examples for both. Recommendations on minimal standards for DNA sequences are provided. Thus, the SeqCode provides a reproducible and objective framework for the nomenclature of all prokaryotes regardless of cultivability and facilitates communication across microbiological disciplines.


Subject(s)
Metagenome , Prokaryotic Cells
11.
ISME J ; 16(11): 2525-2534, 2022 Nov.
Article in English | MEDLINE | ID: mdl-35915168

ABSTRACT

Heterotrophic bacterial diazotrophs (HBDs) are ubiquitous in the pelagic ocean, where they have been predicted to carry out the anaerobic process of nitrogen fixation within low-oxygen microenvironments associated with marine pelagic particles. However, the mechanisms enabling particle colonization by HBDs are unknown. We hypothesized that HBDs use chemotaxis to locate and colonize suitable microenvironments, and showed that a cultivated marine HBD is chemotactic toward amino acids and phytoplankton-derived DOM. Using an in situ chemotaxis assay, we also discovered that diverse HBDs at a coastal site are motile and chemotactic toward DOM from various phytoplankton taxa and, indeed, that the proportion of diazotrophs was up to seven times higher among the motile fraction of the bacterial community compared to the bulk seawater community. Finally, three of four HBD isolates and 16 of 17 HBD metagenome assembled genomes, recovered from major ocean basins and locations along the Australian coast, each encoded >85% of proteins affiliated with the bacterial chemotaxis pathway. These results document the widespread capacity for chemotaxis in diverse and globally relevant marine HBDs. We suggest that HBDs could use chemotaxis to seek out and colonize low-oxygen microenvironments suitable for nitrogen fixation, such as those formed on marine particles. Chemotaxis in HBDs could therefore affect marine nitrogen and carbon biogeochemistry by facilitating nitrogen fixation within otherwise oxic waters, while also altering particle degradation and the efficiency of the biological pump.


Subject(s)
Cyanobacteria , Nitrogen Fixation , Amino Acids/metabolism , Australia , Carbon/metabolism , Chemotaxis , Cyanobacteria/metabolism , Membrane Transport Proteins/metabolism , Nitrogen/metabolism , Oceans and Seas , Oxygen/metabolism , Phytoplankton/metabolism , Seawater/microbiology
12.
Nature ; 605(7908): 132-138, 2022 05.
Article in English | MEDLINE | ID: mdl-35444277

ABSTRACT

The capacity of planktonic marine microorganisms to actively seek out and exploit microscale chemical hotspots has been widely theorized to affect ocean-basin scale biogeochemistry (refs. 1-3), but has never been examined comprehensively in situ among natural microbial communities. Here, using a field-based microfluidic platform to quantify the behavioural responses of marine bacteria and archaea, we observed significant levels of chemotaxis towards microscale hotspots of phytoplankton-derived dissolved organic matter (DOM) at a coastal field site across multiple deployments, spanning several months. Microscale metagenomics revealed that a wide diversity of marine prokaryotes, spanning 27 bacterial and 2 archaeal phyla, displayed chemotaxis towards microscale patches of DOM derived from ten globally distributed phytoplankton species. The distinct DOM composition of each phytoplankton species attracted phylogenetically and functionally discrete populations of bacteria and archaea, with 54% of chemotactic prokaryotes displaying highly specific responses to the DOM derived from only one or two phytoplankton species. Prokaryotes exhibiting chemotaxis towards phytoplankton-derived compounds were significantly enriched in the capacity to transport and metabolize specific phytoplankton-derived chemicals, and displayed enrichment in functions conducive to symbiotic relationships, including genes involved in the production of siderophores, B vitamins and growth-promoting hormones. Our findings demonstrate that the swimming behaviour of natural prokaryotic assemblages is governed by specific chemical cues, which dictate important biogeochemical transformation processes and the establishment of ecological interactions that structure the base of the marine food web.


Subject(s)
Chemotaxis , Microbiota , Bacteria , Dissolved Organic Matter , Oceans and Seas , Phytoplankton/metabolism , Seawater/microbiology
13.
Nucleic Acids Res ; 50(D1): D785-D794, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34520557

ABSTRACT

The Genome Taxonomy Database (GTDB; https://gtdb.ecogenomic.org) provides a phylogenetically consistent and rank normalized genome-based taxonomy for prokaryotic genomes sourced from the NCBI Assembly database. GTDB R06-RS202 spans 254 090 bacterial and 4316 archaeal genomes, a 270% increase since the introduction of the GTDB in November, 2017. These genomes are organized into 45 555 bacterial and 2339 archaeal species clusters which is a 200% increase since the integration of species clusters into the GTDB in June, 2019. Here, we explore prokaryotic diversity from the perspective of the GTDB and highlight the importance of metagenome-assembled genomes in expanding available genomic representation. We also discuss improvements to the GTDB website which allow tracking of taxonomic changes, easy assessment of genome assembly quality, and identification of genomes assembled from type material or used as species representatives. Methodological updates and policy changes made since the inception of the GTDB are then described along with the procedure used to update species clusters in the GTDB. We conclude with a discussion on the use of average nucleotide identities as a pragmatic approach for delineating prokaryotic species.


Subject(s)
Archaea/classification , Bacteria/classification , Databases, Genetic , Genome, Archaeal , Genome, Bacterial , Software , Archaea/genetics , Bacteria/genetics , Base Sequence , Internet , Metagenome , Phylogeny , Prokaryotic Cells/classification , Prokaryotic Cells/cytology , Prokaryotic Cells/metabolism
14.
Microbiome ; 9(1): 199, 2021 10 06.
Article in English | MEDLINE | ID: mdl-34615557

ABSTRACT

BACKGROUND: Microbial communities in both natural and applied settings reliably carry out myriads of functions, yet how stable these taxonomically diverse assemblages can be and what causes them to transition between states remains poorly understood. We studied monthly activated sludge (AS) samples collected over 9 years from a full-scale wastewater treatment plant to answer how complex AS communities evolve in the long term and how the community functions change when there is a disturbance in operational parameters. RESULTS: Here, we show that a microbial community in an activated sludge (AS) system fluctuated around a stable average for 3 years but was then abruptly pushed into an alternative stable state by a simple transient disturbance (bleaching). While the taxonomic composition rapidly turned into a new state following the disturbance, the metabolic profile of the community and system performance remained remarkably stable. A total of 920 metagenome-assembled genomes (MAGs), representing approximately 70% of the community in the studied AS ecosystem, were recovered from the 97 monthly AS metagenomes. Comparative genomic analysis revealed an increased ability to aggregate in the cohorts of MAGs with correlated dynamics that are dominant after the bleaching event. Fine-scale analysis of dynamics also revealed cohorts that dominated during different periods and showed successional dynamics on seasonal and longer time scales due to temperature fluctuation and gradual changes in mean residence time in the reactor, respectively. CONCLUSIONS: Our work highlights that communities can assume different stable states under highly similar environmental conditions and that a specific disturbance threshold may lead to a rapid shift in community composition.


Subject(s)
Microbiota , Sewage , Bacteria/genetics , Bioreactors , Metagenome , Microbiota/genetics
15.
Nat Commun ; 12(1): 5815, 2021 10 05.
Article in English | MEDLINE | ID: mdl-34611153

ABSTRACT

Northern post-glacial lakes are significant, increasing sources of atmospheric carbon through ebullition (bubbling) of microbially-produced methane (CH4) from sediments. Ebullitive CH4 flux correlates strongly with temperature, reflecting that solar radiation drives emissions. However, here we show that the slope of the temperature-CH4 flux relationship differs spatially across two post-glacial lakes in Sweden. We compared these CH4 emission patterns with sediment microbial (metagenomic and amplicon), isotopic, and geochemical data. The temperature-associated increase in CH4 emissions was greater in lake middles (where methanogens were more abundant) than edges, and sediment communities were distinct between edges and middles. Microbial abundances, including those of CH4-cycling microorganisms and syntrophs, were predictive of porewater CH4 concentrations. Results suggest that deeper lake regions, which currently emit less CH4 than shallower edges, could add substantially to CH4 emissions in a warmer Arctic and that CH4 emission predictions may be improved by accounting for spatial variations in sediment microbiota.


Subject(s)
Methane/analysis , Arctic Regions , Geologic Sediments/analysis , Lakes , Temperature
16.
Nat Microbiol ; 6(7): 946-959, 2021 07.
Article in English | MEDLINE | ID: mdl-34155373

ABSTRACT

The accrual of genomic data from both cultured and uncultured microorganisms provides new opportunities to develop systematic taxonomies based on evolutionary relationships. Previously, we established a bacterial taxonomy through the Genome Taxonomy Database. Here, we propose a standardized archaeal taxonomy that is derived from a 122-concatenated-protein phylogeny that resolves polyphyletic groups and normalizes ranks based on relative evolutionary divergence. The resulting archaeal taxonomy, which forms part of the Genome Taxonomy Database, is stable for a range of phylogenetic variables including marker gene selection, inference methods, corrections for rate heterogeneity and compositional bias, tree rooting scenarios and expansion of the genome database. Rank normalization is shown to robustly correct for substitution rates varying up to 30-fold using simulated datasets. Taxonomic curation follows the rules of the International Code of Nomenclature of Prokaryotes while taking into account proposals to formally recognize the rank of phylum and to use genome sequences as type material. This taxonomy is based on 2,392 archaeal genomes, 93.3% of which required one or more changes to their existing taxonomy, mainly owing to incomplete classification. We identify 16 archaeal phyla and reclassify 3 major monophyletic units from the former Euryarchaeota and one phylum that unites the Thaumarchaeota-Aigarchaeota-Crenarchaeota-Korarchaeota (TACK) superphylum into a single phylum.


Subject(s)
Archaea/classification , Databases, Genetic , Genome, Archaeal , Archaea/genetics , Databases, Genetic/standards , Evolution, Molecular , Genomics , Phylogeny , Reference Standards
17.
Front Microbiol ; 12: 643682, 2021.
Article in English | MEDLINE | ID: mdl-33959106

ABSTRACT

A fundamental goal of microbial ecology is to accurately determine the species composition in a given microbial ecosystem. In the context of the human microbiome, this is important for establishing links between microbial species and disease states. Here we benchmark the Microba Community Profiler (MCP) against other metagenomic classifiers using 140 moderate to complex in silico microbial communities and a standardized reference genome database. MCP generated accurate relative abundance estimates and made substantially fewer false positive predictions than other classifiers while retaining a high recall rate. We further demonstrated that the accuracy of species classification was substantially increased using the Microba Genome Database, which is more comprehensive than reference datasets used by other classifiers and illustrates the importance of including genomes of uncultured taxa in reference databases. Consequently, MCP classifies appreciably more reads than other classifiers when using their recommended reference databases. These results establish MCP as best-in-class with the ability to produce comprehensive and accurate species profiles of human gastrointestinal samples.

18.
ISME J ; 15(7): 1879-1892, 2021 07.
Article in English | MEDLINE | ID: mdl-33824426

ABSTRACT

The classification of life forms into a hierarchical system (taxonomy) and the application of names to this hierarchy (nomenclature) is at a turning point in microbiology. The unprecedented availability of genome sequences means that a taxonomy can be built upon a comprehensive evolutionary framework, a longstanding goal of taxonomists. However, there is resistance to adopting a single framework to preserve taxonomic freedom, and ever increasing numbers of genomes derived from uncultured prokaryotes threaten to overwhelm current nomenclatural practices, which are based on characterised isolates. The challenge ahead then is to reach a consensus on the taxonomic framework and to adapt and scale the existing nomenclatural code, or create a new code, to systematically incorporate uncultured taxa into the chosen framework.


Subject(s)
Genome , Prokaryotic Cells , Biological Evolution
19.
ISME Commun ; 1(1): 14, 2021 May 05.
Article in English | MEDLINE | ID: mdl-37938632

ABSTRACT

The ability to preserve microbial communities in faecal samples is essential as increasing numbers of studies seek to use the gut microbiome to identify biomarkers of disease. Here we use shotgun metagenomics to rigorously evaluate the technical and compositional reproducibility of five room temperature (RT) microbial stabilisation methods compared to the best practice of flash-freezing. These methods included RNAlater, OMNIgene-GUT, a dry BBL swab, LifeGuard, and a novel method for preserving faecal samples, a Copan FLOQSwab in an active drying tube (FLOQSwab-ADT). Each method was assessed using six replicate faecal samples from five participants, totalling 180 samples. The FLOQSwab-ADT performed best for both technical and compositional reproducibility, followed by RNAlater and OMNIgene-GUT. LifeGuard and the BBL swab had unpredictable outgrowth of Escherichia species in at least one replicate from each participant. We further evaluated the FLOQSwab-ADT in an additional 239 samples across 10 individuals after storage at -20 °C, RT, and 50 °C for four weeks compared to fresh controls. The FLOQSwab-ADT maintained its performance across all temperatures, indicating this method is an excellent alternative to existing RT stabilisation methods.

20.
Nat Biotechnol ; 39(1): 105-114, 2021 01.
Article in English | MEDLINE | ID: mdl-32690973

ABSTRACT

Comprehensive, high-quality reference genomes are required for functional characterization and taxonomic assignment of the human gut microbiota. We present the Unified Human Gastrointestinal Genome (UHGG) collection, comprising 204,938 nonredundant genomes from 4,644 gut prokaryotes. These genomes encode >170 million protein sequences, which we collated in the Unified Human Gastrointestinal Protein (UHGP) catalog. The UHGP more than doubles the number of gut proteins in comparison to those present in the Integrated Gene Catalog. More than 70% of the UHGG species lack cultured representatives, and 40% of the UHGP lack functional annotations. Intraspecies genomic variation analyses revealed a large reservoir of accessory genes and single-nucleotide variants, many of which are specific to individual human populations. The UHGG and UHGP collections will enable studies linking genotypes to phenotypes in the human gut microbiome.


Subject(s)
Databases, Genetic , Gastrointestinal Microbiome/genetics , Genome, Bacterial/genetics , Metagenome/genetics , Bacteria/classification , Bacteria/genetics , Humans , Metagenomics , Phenotype , Phylogeny