Results 1 - 20 of 1,438
1.
Nat Commun ; 15(1): 5734, 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38977664

ABSTRACT

Metagenomic sequencing has provided great advantages in the characterisation of microbiomes, but currently available analysis tools lack the ability to combine subspecies-level taxonomic resolution and accurate abundance estimation with functional profiling of assembled genomes. To define the microbiome and its associations with human health, improved tools are needed to enable comprehensive understanding of the microbial composition and elucidation of the phylogenetic and functional relationships between the microbes. Here, we present MAGinator, a freely available tool, tailored for profiling of shotgun metagenomics datasets. MAGinator provides de novo identification of subspecies-level microbes and accurate abundance estimates of metagenome-assembled genomes (MAGs). MAGinator utilises the information from both gene- and contig-based methods yielding insight into both taxonomic profiles and the origin of genes and genetic content, used for inference of functional content of each sample by host organism. Additionally, MAGinator facilitates the reconstruction of phylogenetic relationships between the MAGs, providing a framework to identify clade-level differences.


Subjects
Metagenome , Metagenomics , Microbiota , Phylogeny , Metagenomics/methods , Metagenome/genetics , Humans , Microbiota/genetics , Software , Bacteria/genetics , Bacteria/classification , Genome, Bacterial/genetics
2.
BMC Bioinformatics ; 25(1): 237, 2024 Jul 12.
Article in English | MEDLINE | ID: mdl-38997633

ABSTRACT

BACKGROUND: With the emergence of Oxford Nanopore technology, on-site sequencing of 16S rRNA from environments is now available. Because of the error level and structure of such data, their analysis requires a database of reference sequences. However, many taxa from complex and diverse environments have poor representation in publicly available databases. In this paper, we propose the METASEED pipeline for the reconstruction of full-length 16S sequences from such environments, in order to improve the reference for the subsequent use of on-site sequencing. RESULTS: We show that combining high-precision short-read sequencing of both 16S and full metagenome from the same samples allows us to reconstruct high-quality 16S sequences from the more abundant taxa. A significant novelty is the carefully designed collection of metagenome reads that matches the 16S amplicons, based on a combination of uniqueness and abundance. Compared to alternative approaches, this produces superior results. CONCLUSION: Our pipeline will facilitate numerous studies associated with various unknown microorganisms, thus allowing the comprehension of diverse environments. The pipeline is a potential tool for generating a full-length 16S rRNA gene database for any environment.


Subjects
Metagenome , RNA, Ribosomal, 16S , RNA, Ribosomal, 16S/genetics , Metagenome/genetics , Sequence Analysis, DNA/methods , Databases, Genetic
3.
Sci Rep ; 14(1): 14720, 2024 Jun 26.
Article in English | MEDLINE | ID: mdl-38926415

ABSTRACT

Dental calculus is a microbial biofilm that contains biomolecules from oral commensals and pathogens, including those potentially related to cause of death (CoD). To assess the utility of calculus as a diagnostically informative substrate, in conjunction with paleopathological analysis, calculus samples from 39 individuals in the Smithsonian Institution's Robert J. Terry Collection with CoDs of either syphilis or tuberculosis were assessed via shotgun metagenomic sequencing for the presence of Treponema pallidum subsp. pallidum and Mycobacterium tuberculosis complex (MTBC) DNA. Paleopathological analysis revealed that frequencies of skeletal lesions associated with these diseases were partially inconsistent with diagnostic criteria. Although recovery of T. p. pallidum DNA from individuals with a syphilis CoD was elusive, MTBC DNA was identified in at least one individual with a tuberculosis CoD. The authenticity of MTBC DNA was confirmed using targeted quantitative PCR assays, MTBC genome enrichment, and in silico bioinformatic analyses; however, the lineage of the MTBC strain present could not be determined. Overall, our study highlights the utility of dental calculus for molecular detection of tuberculosis in the archaeological record and underscores the effect of museum preparation techniques and extensive handling on pathogen DNA preservation in skeletal collections.


Subjects
Dental Calculus , Metagenomics , Mycobacterium tuberculosis , Paleopathology , Tuberculosis , Dental Calculus/microbiology , Dental Calculus/history , Humans , Metagenomics/methods , Paleopathology/methods , Tuberculosis/diagnosis , Tuberculosis/microbiology , Mycobacterium tuberculosis/genetics , Mycobacterium tuberculosis/isolation & purification , DNA, Bacterial/genetics , Male , Treponema pallidum/genetics , Treponema pallidum/isolation & purification , Syphilis/diagnosis , Syphilis/microbiology , Syphilis/history , Female , Adult , Metagenome/genetics , Middle Aged
4.
Nat Commun ; 15(1): 5168, 2024 Jun 17.
Article in English | MEDLINE | ID: mdl-38886447

ABSTRACT

Antibiotic resistance genes (ARGs) and metal(loid) resistance genes (MRGs) coexist in organic fertilized agroecosystems based on their correlations in abundance, yet evidence for the genetic linkage of ARG-MRGs co-selected by organic fertilization remains elusive. Here, an analysis of 511 global agricultural soil metagenomes reveals that organic fertilization correlates with a threefold increase in the number of diverse types of ARG-MRG-carrying contigs (AMCCs) in the microbiome (63 types) compared to non-organic fertilized soils (22 types). Metatranscriptomic data indicates increased expression of AMCCs under higher arsenic stress, with co-regulation of the ARG-MRG pairs. Organic fertilization heightens the coexistence of ARG-MRG in genomic elements through impacting soil properties and ARG and MRG abundances. Accordingly, a comprehensive global map was constructed to delineate the distribution of coexistent ARG-MRGs with virulence factors and mobile genes in metagenome-assembled genomes from agricultural lands. The map unveils a heightened relative abundance and potential pathogenicity risks (range of 4-6) for the spread of coexistent ARG-MRGs in Central North America, Eastern Europe, Western Asia, and Northeast China compared to other regions, which acquire a risk range of 1-3. Our findings highlight that organic fertilization co-selects genetically linked ARGs and MRGs in the global soil microbiome, and underscore the need to mitigate the spread of these co-resistant genes to safeguard public health.


Subjects
Fertilizers , Microbiota , Soil Microbiology , Microbiota/genetics , Microbiota/drug effects , Metagenome/genetics , Drug Resistance, Microbial/genetics , Soil/chemistry , Genes, Bacterial , Metals , Anti-Bacterial Agents/pharmacology , Agriculture
5.
Microbiol Spectr ; 12(7): e0410823, 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38832899

ABSTRACT

The rapid spread of antimicrobial resistance (AMR) is a threat to global health, and the nature of co-occurring antimicrobial resistance genes (ARGs) may cause collateral AMR effects once antimicrobial agents are used. Therefore, it is essential to identify which pairs of ARGs co-occur. Given the wealth of next-generation sequencing data available in public repositories, we have investigated the correlation between ARG abundances in a collection of 214,095 metagenomic data sets. Using more than 6.76 × 10⁸ read fragments aligned to acquired ARGs to infer pairwise correlation coefficients, we found that more ARGs correlated with each other in human and animal sampling origins than in soil and water environments. Furthermore, we argued that the correlations could serve as risk profiles of resistance co-occurring to critically important antimicrobials (CIAs). Using these profiles, we found evidence of several ARGs conferring resistance to CIAs being co-abundant, such as tetracycline ARGs correlating with most other forms of resistance. In conclusion, this study highlights the important ARG players indirectly involved in shaping the resistomes of various environments that can serve as monitoring targets in AMR surveillance programs. IMPORTANCE: Understanding the collateral effects happening in a resistome can reveal previously unknown links between antimicrobial resistance genes (ARGs). Through the analysis of pairwise ARG abundances in 214K metagenomic samples, we observed that the co-abundance is highly dependent on the environmental context and argue that these correlations can be used to show the risk of co-selection occurring in different settings.


Subjects
Anti-Bacterial Agents , Bacteria , Drug Resistance, Bacterial , Metagenomics , Humans , Anti-Bacterial Agents/pharmacology , Bacteria/genetics , Bacteria/drug effects , Bacteria/classification , Drug Resistance, Bacterial/genetics , Animals , Genes, Bacterial/genetics , Soil Microbiology , High-Throughput Nucleotide Sequencing , Metagenome/genetics
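
Illustrative note on entry 5: the abstract describes inferring pairwise correlation coefficients between ARG abundances across metagenomic samples. The following is a minimal sketch of that general idea, assuming a plain Pearson coefficient on per-sample abundance vectors; the abstract does not say which coefficient or normalization the authors used. The function names, the dictionary layout, and the abundance values are hypothetical toy inputs, not data or code from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length abundance vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant vector
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(abundance):
    """Map each unordered gene pair (names sorted) to its correlation."""
    genes = sorted(abundance)
    return {
        (g1, g2): pearson(abundance[g1], abundance[g2])
        for g1, g2 in combinations(genes, 2)
    }


# Hypothetical toy abundances: one value per sample for each gene.
toy = {
    "tet(W)": [1.0, 2.0, 3.0, 4.0],
    "erm(B)": [2.0, 4.0, 6.0, 8.0],
    "blaTEM": [4.0, 3.0, 2.0, 1.0],
}
corr = pairwise_correlations(toy)
for pair, r in corr.items():
    print(pair, round(r, 3))
```

With real data, the same loop would be run separately per sampling origin (human, animal, soil, water) so that the correlation profiles can be compared between environments, which is the comparison the abstract reports.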
6.
Nat Commun ; 15(1): 4858, 2024 Jun 13.
Article in English | MEDLINE | ID: mdl-38871712

ABSTRACT

Serpentinization, a geochemical process found on modern and ancient Earth, provides an ultra-reducing environment that can support microbial methanogenesis and acetogenesis. Several groups of archaea, such as the order Methanocellales, are characterized by their ability to produce methane. Here, we generate metagenomic sequences from serpentinized springs in The Cedars, California, and construct a circularized metagenome-assembled genome of a Methanocellales archaeon, termed Met12, that lacks essential methanogenesis genes. The genome includes genes for an acetyl-CoA pathway, but lacks genes encoding methanogenesis enzymes such as methyl-coenzyme M reductase, heterodisulfide reductases and hydrogenases. In situ transcriptomic analyses reveal high expression of a multi-heme c-type cytochrome, and heterologous expression of this protein in a model bacterium demonstrates that it is capable of accepting electrons. Our results suggest that Met12, within the order Methanocellales, is not a methanogen but a CO2-reducing, electron-fueled acetogen without electron bifurcation.


Subjects
Methane , Methane/metabolism , Genome, Archaeal , Archaeal Proteins/metabolism , Archaeal Proteins/genetics , Oxidoreductases/genetics , Oxidoreductases/metabolism , Metagenome/genetics , Phylogeny , Acetyl Coenzyme A/metabolism , Carbon Dioxide/metabolism , Metagenomics
7.
Nat Commun ; 15(1): 4631, 2024 May 31.
Article in English | MEDLINE | ID: mdl-38821971

ABSTRACT

Although long-read sequencing enables the generation of complete genomes for unculturable microbes, its high cost limits the widespread adoption of long-read sequencing in large-scale metagenomic studies. An alternative method is to assemble short-reads with long-range connectivity, which can be a cost-effective way to generate high-quality microbial genomes. Here, we develop Pangaea, a bioinformatic approach designed to enhance metagenome assembly using short-reads with long-range connectivity. Pangaea leverages connectivity derived from physical barcodes of linked-reads or virtual barcodes by aligning short-reads to long-reads. Pangaea utilizes a deep learning-based read binning algorithm to assemble co-barcoded reads exhibiting similar sequence contexts and abundances, thereby improving the assembly of high- and medium-abundance microbial genomes. Pangaea also leverages a multi-thresholding algorithm strategy to refine assembly for low-abundance microbes. We benchmark Pangaea on linked-reads and a combination of short- and long-reads from simulation data, mock communities and human gut metagenomes. Pangaea achieves significantly higher contig continuity as well as more near-complete metagenome-assembled genomes (NCMAGs) than the existing assemblers. Pangaea also generates three complete and circular NCMAGs on the human gut microbiomes.


Subjects
Algorithms , Gastrointestinal Microbiome , Genome, Microbial , Metagenome , Metagenomics , Humans , Metagenome/genetics , Metagenomics/methods , Gastrointestinal Microbiome/genetics , High-Throughput Nucleotide Sequencing/methods , Deep Learning , Computational Biology/methods , Sequence Analysis, DNA/methods , Genome, Bacterial
8.
Genes Genomics ; 46(6): 701-712, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38700829

ABSTRACT

BACKGROUND: The importance of the human microbiome in the analysis of various diseases is emerging. The two main methods used to profile the human microbiome are 16S rRNA gene sequencing (16S sequencing) and whole-genome shotgun sequencing (WGS). Owing to the full coverage of the genome in sequencing, WGS has multiple advantages over 16S sequencing, including higher taxonomic profiling resolution at the species-level and functional profiling analysis. However, 16S sequencing remains widely used because of its relatively low cost. Although WGS is the standard method for obtaining accurate species-level data, we found that 16S sequencing data contained rich information to predict high-resolution species-level abundances with reasonable accuracy. OBJECTIVE: In this study, we proposed MicroPredict, a method for accurately predicting WGS-comparable species-level abundance data using 16S taxonomic profile data. METHODS: We employed a mixed model using two key strategies: (1) modeling both sample- and species-specific information for predicting WGS abundances, and (2) accounting for the possible correlations among different species. RESULTS: We found that MicroPredict outperformed the other machine learning methods. CONCLUSION: We expect that our approach will help researchers accurately approximate the species-level abundances of microbiome profiles in datasets for which only cost-effective 16S sequencing has been applied.


Subjects
Metagenomics , Microbiota , RNA, Ribosomal, 16S , RNA, Ribosomal, 16S/genetics , Metagenomics/methods , Humans , Microbiota/genetics , Machine Learning , Whole Genome Sequencing/methods , Metagenome/genetics , Bacteria/genetics , Bacteria/classification
9.
Nat Commun ; 15(1): 3699, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38698035

ABSTRACT

In silico identification of viral anti-CRISPR proteins (Acrs) has relied largely on the guilt-by-association method using known Acrs or anti-CRISPR associated proteins (Acas) as the bait. However, the low number and limited spread of the characterized archaeal Acrs and Aca hinders our ability to identify Acrs using guilt-by-association. Here, based on the observation that the few characterized archaeal Acrs and Aca are transcribed immediately post viral infection, we hypothesize that these genes, and many other unidentified anti-defense genes (ADG), are under the control of conserved regulatory sequences including a strong promoter, which can be used to predict anti-defense genes in archaeal viruses. Using this consensus sequence based method, we identify 354 potential ADGs in 57 archaeal viruses and 6 metagenome-assembled genomes. Experimental validation identified a CRISPR subtype I-A inhibitor and the first virally encoded inhibitor of an archaeal toxin-antitoxin based immune system. We also identify regulatory proteins potentially akin to Acas that can facilitate further identification of ADGs combined with the guilt-by-association approach. These results demonstrate the potential of regulatory sequence analysis for extensive identification of ADGs in viruses of archaea and bacteria.


Subjects
Archaea , Archaeal Viruses , Archaeal Viruses/genetics , Archaea/genetics , Archaea/virology , Archaea/immunology , Promoter Regions, Genetic/genetics , Clustered Regularly Interspaced Short Palindromic Repeats/genetics , Regulatory Sequences, Nucleic Acid/genetics , Viral Proteins/genetics , Archaeal Proteins/genetics , Archaeal Proteins/metabolism , Metagenome/genetics , CRISPR-Associated Proteins/genetics , CRISPR-Associated Proteins/metabolism , CRISPR-Cas Systems/genetics
10.
Nat Commun ; 15(1): 4089, 2024 May 14.
Article in English | MEDLINE | ID: mdl-38744831

ABSTRACT

Dominant microorganisms of the Sargasso Sea are key drivers of the global carbon cycle. However, associated viruses that shape microbial community structure and function are not well characterised. Here, we combined short and long read sequencing to survey Sargasso Sea phage communities in virus- and cellular fractions at viral maximum (80 m) and mesopelagic (200 m) depths. We identified 2,301 Sargasso Sea phage populations from 186 genera. Over half of the phage populations identified here lacked representation in global ocean viral metagenomes, whilst 177 of the 186 identified genera lacked representation in genomic databases of phage isolates. Viral fraction and cell-associated viral communities were decoupled, indicating viral turnover occurred across periods longer than the sampling period of three days. Inclusion of long-read data was critical for capturing the breadth of viral diversity. Phage isolates that infect the dominant bacterial taxa Prochlorococcus and Pelagibacter, usually regarded as cosmopolitan and abundant, were poorly represented.


Subjects
Bacteriophages , Metagenome , Metagenomics , Oceans and Seas , Seawater , Metagenomics/methods , Bacteriophages/genetics , Bacteriophages/isolation & purification , Bacteriophages/classification , Seawater/virology , Seawater/microbiology , Metagenome/genetics , Genome, Viral/genetics , Phylogeny , Prochlorococcus/virology , Prochlorococcus/genetics , Microbiota/genetics , Bacteria/genetics , Bacteria/virology , Bacteria/classification , Bacteria/isolation & purification
11.
BMC Bioinformatics ; 25(1): 191, 2024 May 15.
Article in English | MEDLINE | ID: mdl-38750423

ABSTRACT

BACKGROUND: The application of reduced metagenomic sequencing approaches holds promise as a middle ground between targeted amplicon sequencing and whole metagenome sequencing approaches but has not been widely adopted as a technique. A major barrier to adoption is the lack of read simulation software built to handle characteristic features of these novel approaches. Reduced metagenomic sequencing (RMS) produces unique patterns of fragmentation per genome that are sensitive to restriction enzyme choice, and the non-uniform size selection of these fragments may introduce novel challenges to taxonomic assignment as well as relative abundance estimates. RESULTS: Through the development and application of simulation software, readsynth, we compare simulated metagenomic sequencing libraries with existing RMS data to assess the influence of multiple library preparation and sequencing steps on downstream analytical results. Based on read depth per position, readsynth achieved 0.79 Pearson's correlation and 0.94 Spearman's correlation to these benchmarks. Application of a novel estimation approach, fixed length taxonomic ratios, improved quantification accuracy of simulated human gut microbial communities when compared to estimates of mean or median coverage. CONCLUSIONS: We investigate the possible strengths and weaknesses of applying the RMS technique to profiling microbial communities via simulations with readsynth. The choice of restriction enzymes and size selection steps in library prep are non-trivial decisions that bias downstream profiling and quantification. The simulations investigated in this study illustrate the possible limits of preparing metagenomic libraries with a reduced representation sequencing approach, but also allow for the development of strategies for producing and handling the sequence data produced by this promising application.


Subjects
Metagenome , Metagenomics , Software , Metagenome/genetics , Metagenomics/methods , Humans , Sequence Analysis, DNA/methods , Gastrointestinal Microbiome/genetics , High-Throughput Nucleotide Sequencing/methods
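
Illustrative note on entry 11: the abstract states that reduced metagenomic sequencing yields per-genome fragmentation patterns that depend on restriction enzyme choice and on size selection. The sketch below shows those two steps on a toy sequence. It is not readsynth's implementation: it assumes a single enzyme, a single strand, and a hard size window, and the function names and the toy genome are hypothetical. The only external fact used is that EcoRI recognizes GAATTC and cuts after the first G.

```python
def digest(seq, site, cut_offset):
    """Cut seq at every occurrence of site, cut_offset bases into the site."""
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)  # step by 1 so overlapping sites are found
    bounds = [0] + cuts + [len(seq)]
    # Consecutive cut positions delimit the fragments; skip empty ones.
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def size_select(fragments, min_len, max_len):
    """Keep only fragments whose length falls inside the selection window."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


# Hypothetical toy genome containing two EcoRI sites (G^AATTC).
genome = "AAAA" + "GAATTC" + "CCCCCCC" + "GAATTC" + "TT"
fragments = digest(genome, site="GAATTC", cut_offset=1)
selected = size_select(fragments, min_len=6, max_len=20)

print([len(f) for f in fragments])
print([len(f) for f in selected])
```

Changing `site` or the size window changes which parts of a genome survive into the library, which is the source of the profiling bias the abstract discusses.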
12.
Methods Mol Biol ; 2802: 135-163, 2024.
Article in English | MEDLINE | ID: mdl-38819559

ABSTRACT

Metagenome-assembled genomes, or MAGs, are genomes retrieved from metagenome datasets. In the vast majority of cases, MAGs are genomes from prokaryotic species that have not been isolated or cultivated in the lab. They, therefore, provide us with information on these species that are impossible to obtain otherwise, at least until new cultivation methods are devised. Thanks to improvements and cost reductions of DNA sequencing technologies and growing interest in microbial ecology, the rise in number of MAGs in genome repositories has been exponential. This chapter covers the basics of MAG retrieval and processing and provides a practical step-by-step guide using a real dataset and state-of-the-art tools for MAG analysis and comparison.


Subjects
Metagenome , Metagenomics , Metagenome/genetics , Metagenomics/methods , Software , Computational Biology/methods , Databases, Genetic , Sequence Analysis, DNA/methods , Genome, Bacterial
13.
Methods Mol Biol ; 2802: 587-609, 2024.
Article in English | MEDLINE | ID: mdl-38819573

ABSTRACT

Comparative analysis of (meta)genomes necessitates aggregation, integration, and synthesis of well-annotated data using standards. The Genomic Standards Consortium (GSC) collaborates with the research community to develop and maintain the Minimum Information about any (x) Sequence (MIxS) reporting standard for genomic data. To facilitate the use of the GSC's MIxS reporting standard, we provide a description of the structure and terminology, how to navigate ontologies for required terms in MIxS, and demonstrate practical usage through a soil metagenome example.


Subjects
Genomics , Metagenome , Metagenomics , Metagenomics/methods , Metagenomics/standards , Genomics/methods , Genomics/standards , Metagenome/genetics , Databases, Genetic , Soil Microbiology
14.
Int J Mol Sci ; 25(10), 2024 May 07.
Article in English | MEDLINE | ID: mdl-38791129

ABSTRACT

Next-generation sequencing has transformed the acquisition of vast amounts of genomic information, including the rapid identification of target gene sequences in metagenomic databases. However, dominant species can sometimes hinder the detection of rare bacterial species. Therefore, a highly sensitive amplification technique that can selectively amplify bacterial genomes containing target genes of interest was developed in this study. The rolling circle amplification (RCA) method can initiate amplification from a single locus using a specific single primer to amplify a specific whole genome. A mixed cell suspension was prepared using Pseudomonas fluorescens ATCC17400 (targeting nonribosomal peptide synthetase [NRPS]) and Escherichia coli (non-target), and a specific primer designed for the NRPS was used for the RCA reaction. The resulting RCA product (RCP) amplified only the Pseudomonas genome. The NRPS was successfully amplified using RCP as a template from even five cells, indicating that the single-priming RCA technique can specifically enrich the target genome using gene-specific primers. Ultimately, this specific genome RCA technique was applied to metagenomes extracted from sponge-associated bacteria, and NRPS sequences were successfully obtained from an unknown sponge-associated bacterium. Therefore, this method could be effective for accessing species-specific sequences of NRPS in unknown bacteria, including viable but non-culturable bacteria.


Subjects
Genome, Bacterial , Nucleic Acid Amplification Techniques , Peptide Synthases , Peptide Synthases/genetics , Nucleic Acid Amplification Techniques/methods , High-Throughput Nucleotide Sequencing/methods , Escherichia coli/genetics , Pseudomonas fluorescens/genetics , Sequence Analysis, DNA/methods , Metagenome/genetics
15.
Nat Commun ; 15(1): 3543, 2024 May 10.
Article in English | MEDLINE | ID: mdl-38730244

ABSTRACT

β-N-Acetylgalactosamine-containing glycans play essential roles in several biological processes, including cell adhesion, signal transduction, and immune responses. β-N-Acetylgalactosaminidases hydrolyze β-N-acetylgalactosamine linkages of various glycoconjugates. However, their biological significance remains ambiguous, primarily because only one type of enzyme, exo-β-N-acetylgalactosaminidases that specifically act on β-N-acetylgalactosamine residues, has been documented to date. In this study, we identify four groups distributed among all three domains of life and characterize eight β-N-acetylgalactosaminidases and β-N-acetylhexosaminidase through sequence-based screening of deep-sea metagenomes and subsequent searching of public protein databases. Despite low sequence similarity, the crystal structures of these enzymes demonstrate that all enzymes share a prototype structure and have diversified their substrate specificities (oligosaccharide-releasing, oligosaccharide/monosaccharide-releasing, and monosaccharide-releasing) through the accumulation of mutations and insertional amino acid sequences. The diverse β-N-acetylgalactosaminidases reported in this study could facilitate the comprehension of their structures and functions and present evolutionary pathways for expanding their substrate specificity.


Subjects
Acetylgalactosamine , Glycoside Hydrolases , Metagenome , Metagenome/genetics , Substrate Specificity , Acetylgalactosamine/metabolism , Acetylgalactosamine/chemistry , Glycoside Hydrolases/metabolism , Glycoside Hydrolases/genetics , Glycoside Hydrolases/chemistry , beta-N-Acetylhexosaminidases/metabolism , beta-N-Acetylhexosaminidases/genetics , beta-N-Acetylhexosaminidases/chemistry , Phylogeny , Crystallography, X-Ray , Amino Acid Sequence , Animals
16.
PLoS Comput Biol ; 20(5): e1011543, 2024 May.
Article in English | MEDLINE | ID: mdl-38768195

ABSTRACT

Random forests have emerged as a promising tool in comparative metagenomics because they can predict environmental characteristics based on microbial composition in datasets where β-diversity metrics fall short of revealing meaningful relationships between samples. Nevertheless, despite this efficacy, they lack biological insight in tandem with their predictions, potentially hindering scientific advancement. To overcome this limitation, we leverage a geometric characterization of random forests to introduce a data-driven phylogenetic β-diversity metric, the adaptive Haar-like distance. This new metric assigns a weight to each internal node (i.e., split or bifurcation) of a reference phylogeny, indicating the relative importance of that node in discerning environmental samples based on their microbial composition. Alongside this, a weighted nearest-neighbors classifier, constructed using the adaptive metric, can be used as a proxy for the random forest while maintaining accuracy on par with that of the original forest and another state-of-the-art classifier, CoDaCoRe. As shown in datasets from diverse microbial environments, however, the new metric and classifier significantly enhance the biological interpretability and visualization of high-dimensional metagenomic samples.


Subjects
Algorithms , Computational Biology , Metagenomics , Phylogeny , Metagenomics/methods , Computational Biology/methods , Microbiota/genetics , Machine Learning , Metagenome/genetics
17.
Nat Commun ; 15(1): 2827, 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38565528

ABSTRACT

Phosphorus (P) limitation of ecosystem processes is widespread in terrestrial habitats. While a few auxiliary metabolic genes (AMGs) in bacteriophages from aquatic habitats are reported to have the potential to enhance P-acquisition ability of their hosts, little is known about the diversity and potential ecological function of P-acquisition genes encoded by terrestrial bacteriophages. Here, we analyze 333 soil metagenomes from five terrestrial habitat types across China and identify 75 viral operational taxonomic units (vOTUs) that encode 105 P-acquisition AMGs. These AMGs span 17 distinct functional genes involved in four primary processes of microbial P-acquisition. Among them, over 60% (11/17) have not been reported previously. We experimentally verify in-vitro enzymatic activities of two pyrophosphatases and one alkaline phosphatase encoded by P-acquisition vOTUs. Thirty-six percent of the 75 P-acquisition vOTUs are detectable in a published global topsoil metagenome dataset. Further analyses reveal that, under certain circumstances, the identified P-acquisition AMGs have a greater influence on soil P availability and are more dominant in soil metatranscriptomes than their corresponding bacterial genes. Overall, our results reinforce the necessity of incorporating viral contributions into biogeochemical P cycling.


Subjects
Bacteriophages , Bacteriophages/genetics , Ecosystem , Phosphorus , Metagenome/genetics , Soil
18.
Nat Methods ; 21(6): 954-966, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38689099

ABSTRACT

Long-read sequencing has recently transformed metagenomics, enhancing strain-level pathogen characterization, enabling accurate and complete metagenome-assembled genomes, and improving microbiome taxonomic classification and profiling. These advancements are not only due to improvements in sequencing accuracy, but also happening across rapidly changing analysis methods. In this Review, we explore long-read sequencing's profound impact on metagenomics, focusing on computational pipelines for genome assembly, taxonomic characterization and variant detection, to summarize recent advancements in the field and provide an overview of available analytical methods to fully leverage long reads. We provide insights into the advantages and disadvantages of long reads over short reads and their evolution from the early days of long-read sequencing to their recent impact on metagenomics and clinical diagnostics. We further point out remaining challenges for the field such as the integration of methylation signals in sub-strain analysis and the lack of benchmarks.


Subjects
High-Throughput Nucleotide Sequencing , Metagenome , Metagenomics , Microbiota , Metagenomics/methods , Metagenome/genetics , High-Throughput Nucleotide Sequencing/methods , Microbiota/genetics , Humans , Sequence Analysis, DNA/methods , Computational Biology/methods
19.
Nat Commun ; 15(1): 3373, 2024 Apr 20.
Article in English | MEDLINE | ID: mdl-38643272

ABSTRACT

Metagenomic analysis typically includes read-based taxonomic profiling, assembly, and binning of metagenome-assembled genomes (MAGs). Here we integrate these steps in Read Annotation Tool (RAT), which uses robust taxonomic signals from MAGs and contigs to enhance read annotation. RAT reconstructs taxonomic profiles with high precision and sensitivity, outperforming other state-of-the-art tools. In high-diversity groundwater samples, RAT annotates a large fraction of the metagenomic reads, calling novel taxa at the appropriate, sometimes high taxonomic ranks. Thus, RAT integrative profiling provides an accurate and comprehensive view of the microbiome from shotgun metagenomics data. The package of Contig Annotation Tool (CAT), Bin Annotation Tool (BAT), and RAT is available at https://github.com/MGXlab/CAT_pack (from CAT pack v6.0). The CAT pack now also supports Genome Taxonomy Database (GTDB) annotations.


Subjects
Metagenome , Microbiota , Metagenome/genetics , Software , Algorithms , Microbiota/genetics , Metagenomics
20.
BMC Bioinformatics ; 25(1): 161, 2024 Apr 23.
Article in English | MEDLINE | ID: mdl-38649836

ABSTRACT

BACKGROUND: Taxonomic classification of reads obtained by metagenomic sequencing is often a first step for understanding a microbial community, but correctly assigning sequencing reads to the strain or sub-species level has remained a challenging computational problem. RESULTS: We introduce Mora, a MetagenOmic read Re-Assignment algorithm capable of assigning short and long metagenomic reads with high precision, even at the strain level. Mora is able to accurately re-assign reads by first estimating abundances through an expectation-maximization algorithm and then utilizing abundance information to re-assign query reads. The key idea behind Mora is to maximize read re-assignment qualities while simultaneously minimizing the difference from estimated abundance levels, allowing Mora to avoid over-assigning reads to the same genomes. On simulated diverse reads, this allows Mora to achieve F1 scores comparable to other algorithms with a shorter runtime. However, Mora significantly outshines other algorithms on very similar reads. We show that the high penalty of over-assigning reads to a common reference genome allows Mora to accurately infer correct strains for real data in the form of E. coli reads. CONCLUSIONS: Mora is a fast and accurate read re-assignment algorithm that is modularized, allowing it to be incorporated into general metagenomics and genomics workflows. It is freely available at https://github.com/AfZheng126/MORA.


Subjects
Algorithms , Metagenomics , Metagenomics/methods , Escherichia coli/genetics , Sequence Analysis, DNA/methods , Software , Metagenome/genetics , Genome, Bacterial
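
Illustrative note on entry 20: the abstract describes a two-stage scheme, first estimating genome abundances with an expectation-maximization (EM) algorithm, then using those abundances to re-assign ambiguous reads. The sketch below shows a textbook EM for mixture proportions followed by a simple abundance-weighted argmax. It is not Mora's code and omits the over-assignment penalty the abstract mentions. The function names, the input layout (one dictionary of positive alignment weights per read), and the strain labels are hypothetical.

```python
def em_abundances(read_weights, n_iter=100):
    """Estimate genome proportions from per-read alignment weights via EM."""
    genomes = sorted({g for weights in read_weights for g in weights})
    theta = {g: 1.0 / len(genomes) for g in genomes}  # uniform starting point
    for _ in range(n_iter):
        totals = {g: 0.0 for g in genomes}
        for weights in read_weights:
            # E-step: split the read across its candidate genomes
            # in proportion to abundance times alignment weight.
            norm = sum(theta[g] * w for g, w in weights.items())
            for g, w in weights.items():
                totals[g] += theta[g] * w / norm
        # M-step: new abundance is the expected share of reads per genome.
        theta = {g: totals[g] / len(read_weights) for g in genomes}
    return theta


def reassign(read_weights, theta):
    """Assign each read to the candidate with the highest abundance-weighted score."""
    return [max(w, key=lambda g: theta[g] * w[g]) for w in read_weights]


# Hypothetical toy input: three reads unique to strain_A, one unique to
# strain_B, and one read that aligns equally well to both.
reads = [
    {"strain_A": 1.0},
    {"strain_A": 1.0},
    {"strain_A": 1.0},
    {"strain_B": 1.0},
    {"strain_A": 1.0, "strain_B": 1.0},
]
theta = em_abundances(reads)
assignments = reassign(reads, theta)
print({g: round(p, 3) for g, p in theta.items()})
print(assignments)
```

On this toy input the estimates settle at 0.75 for strain_A and 0.25 for strain_B, so the ambiguous read goes to strain_A. A plain argmax like this tends to pile ambiguous reads onto the most abundant genome, which is the over-assignment problem the abstract says Mora is designed to penalize.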