1.
BMC Bioinformatics ; 24(1): 380, 2023 Oct 09.
Article in English | MEDLINE | ID: mdl-37807043

ABSTRACT

BACKGROUND: By creating networks of biochemical pathways, communities of micro-organisms are able to modulate the properties of their environment and even the metabolic processes within their hosts. Next-generation high-throughput sequencing has led to a new frontier in microbial ecology, promising the ability to leverage the microbiome to make crucial advancements in the environmental and biomedical sciences. However, this is challenging, as genomic data are high-dimensional, sparse, and noisy. Much of this noise reflects the exact conditions under which sequencing took place, and is so significant that it limits consensus-based validation of study results. RESULTS: We propose an ensemble approach for cross-study exploratory analyses of microbial abundance data in which we first estimate the variance-covariance matrix of the underlying abundances from each dataset on the log scale assuming Poisson sampling, and subsequently model these covariances jointly so as to find a shared low-dimensional subspace of the feature space. CONCLUSIONS: By viewing the projection of the latent true abundances onto this common structure, the variation is pared down to that which is shared among all datasets, and is likely to reflect more generalizable biological signal than can be inferred from individual datasets. We investigate several ways of achieving this, demonstrate that they work well on simulated and real metagenomic data in terms of signal retention and interpretability, and recommend a particular implementation.


Subject(s)
Metagenome , Microbiota , Microbiota/genetics , Metagenomics/methods , Genomics , Computational Biology/methods
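The first stage described in the abstract above (estimating a log-scale covariance under Poisson sampling) can be illustrated with a textbook moment estimator for a Poisson log-normal model. This is only a sketch under stated assumptions: the abstract does not say which estimator the authors use, the function name `poisson_lognormal_cov` is hypothetical, sequencing depth is ignored, and the second stage (jointly modelling the covariances to find a shared subspace) is not shown.

```python
import math


def poisson_lognormal_cov(counts):
    """Moment estimate of the log-scale covariance from a samples x taxa count table.

    Assumed model (not taken from the paper): counts[i][j] ~ Poisson(lam[i][j]),
    with log(lam[i]) multivariate normal. Then
      E[X_j X_k]     = E[lam_j] E[lam_k] exp(sigma_jk)   for j != k
      E[X_j (X_j-1)] = E[lam_j]^2 exp(sigma_jj)
    so each covariance entry is the log of a ratio of sample moments.
    """
    n = len(counts)
    p = len(counts[0])
    mean = [sum(row[j] for row in counts) / n for j in range(p)]
    if min(mean) <= 0:
        raise ValueError("every taxon needs at least one nonzero count")
    cov = [[0.0] * p for _ in range(p)]
    for j in range(p):
        for k in range(j, p):
            if j == k:
                # Using X(X-1) instead of X^2 removes the Poisson sampling variance.
                second = sum(row[j] * (row[j] - 1) for row in counts) / n
            else:
                second = sum(row[j] * row[k] for row in counts) / n
            if second <= 0:
                raise ValueError("moment estimate undefined for this pair of taxa")
            cov[j][k] = cov[k][j] = math.log(second / (mean[j] * mean[k]))
    return cov


# Toy table: 3 samples (rows) by 2 taxa (columns); the values are made up.
toy_cov = poisson_lognormal_cov([[1, 2], [3, 1], [20, 15]])
```

The result is symmetric but not guaranteed to be positive semi-definite, so a real analysis would likely need a regularization or projection step before any joint modelling across datasets.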
2.
Bioinformatics ; 38(22): 5055-5063, 2022 Nov 15.
Article in English | MEDLINE | ID: mdl-36179077

ABSTRACT

MOTIVATION: Microbiome functional data are frequently analyzed to identify associations between microbial functions (e.g. genes) and sample groups of interest. However, it is challenging to distinguish between different possible explanations for variation in community-wide functional profiles by considering functions alone. To help address this problem, we have developed POMS, a package that implements multiple phylogeny-aware frameworks to more robustly identify enriched functions. RESULTS: The key contribution is an extended balance-tree workflow that incorporates functional and taxonomic information to identify functions that are consistently enriched in sample groups across independent taxonomic lineages. Our package also includes a workflow for running phylogenetic regression. Based on simulated data we demonstrate that these approaches more accurately identify gene families that confer a selective advantage compared with commonly used tools. We also show that POMS in particular can identify enriched functions in real-world metagenomics datasets that are potential targets of strong selection on multiple members of the microbiome. AVAILABILITY AND IMPLEMENTATION: These workflows are freely available in the POMS R package at https://github.com/gavinmdouglas/POMS. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Microbiota , Phylogeny , Microbiota/genetics , Metagenomics , Software
4.
Nat Commun ; 13(1): 342, 2022 Jan 17.
Article in English | MEDLINE | ID: mdl-35039521

ABSTRACT

Identifying differentially abundant microbes is a common goal of microbiome studies. Multiple methods are used interchangeably for this purpose in the literature. Yet, there are few large-scale studies systematically exploring the appropriateness of using these tools interchangeably, and the scale and significance of the differences between them. Here, we compare the performance of 14 differential abundance testing methods on 38 16S rRNA gene datasets with two sample groups. We test for differences in amplicon sequence variants and operational taxonomic units (referred to collectively as ASVs) between these groups. Our findings confirm that these tools identified drastically different numbers and sets of significant ASVs, and that results depend on data pre-processing. For many tools the number of features identified correlates with aspects of the data, such as sample size, sequencing depth, and effect size of community differences. ALDEx2 and ANCOM-II produce the most consistent results across studies and agree best with the intersection of results from different approaches. Nevertheless, we recommend that researchers use a consensus approach based on multiple differential abundance methods to help ensure robust biological interpretations.


Subject(s)
Databases, Genetic , Microbiota/genetics , Cluster Analysis , Computer Simulation , Diarrhea/genetics , Genetic Variation , Humans , Phylogeny , Sequence Analysis, DNA
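The consensus recommendation in the abstract above can be sketched as a simple vote across tools. The abstract does not specify a consensus rule, so the threshold `min_fraction`, the function name `consensus_features`, and the tool and feature labels below are all hypothetical placeholders, not output from any real method.

```python
from collections import Counter


def consensus_features(calls, min_fraction=0.5):
    """Return the features flagged as significant by at least min_fraction of tools.

    calls maps a tool name to the collection of features that tool flagged.
    min_fraction=1.0 gives the strict intersection across all tools.
    """
    if not calls:
        return set()
    # Count each feature at most once per tool.
    votes = Counter(f for flagged in calls.values() for f in set(flagged))
    needed = min_fraction * len(calls)
    return {f for f, v in votes.items() if v >= needed}


# Hypothetical calls from three unnamed tools.
calls = {
    "tool_a": {"ASV1", "ASV2", "ASV3"},
    "tool_b": {"ASV2", "ASV3"},
    "tool_c": {"ASV3", "ASV4"},
}
majority = consensus_features(calls)          # flagged by at least half of the tools
strict = consensus_features(calls, 1.0)       # flagged by every tool
```

A majority vote is more permissive than the strict intersection; which threshold is appropriate depends on how conservative the analysis needs to be.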