Search | VHL Regional Portal (test)

Revealing proteome-level functional redundancy in the human gut microbiome using ultra-deep metaproteomics.

Li, Leyuan; Wang, Tong; Ning, Zhibin; Zhang, Xu; Butcher, James; Serrana, Joeselle M; Simopoulos, Caitlin M A; Mayne, Janice; Stintzi, Alain; Mack, David R; Liu, Yang-Yu; Figeys, Daniel.

Nat Commun; 14(1): 3428, 2023 Jun 10.

Article in English | MEDLINE | ID: mdl-37301875

ABSTRACT

Functional redundancy is a key ecosystem property representing the fact that different taxa contribute to an ecosystem in similar ways through the expression of redundant functions. The redundancy of potential functions (or genome-level functional redundancy $FR_g$) of human microbiomes has been recently quantified using metagenomics data. Yet, the redundancy of expressed functions in the human microbiome has never been quantitatively explored. Here, we present an approach to quantify the proteome-level functional redundancy $FR_p$ in the human gut microbiome using metaproteomics. Ultra-deep metaproteomics reveals high proteome-level functional redundancy and high nestedness in the human gut proteomic content networks (i.e., the bipartite graphs connecting taxa to functions). We find that the nested topology of proteomic content networks and relatively small functional distances between proteomes of certain pairs of taxa together contribute to high $FR_p$ in the human gut microbiome. As a metric comprehensively incorporating the factors of presence/absence of each function, protein abundances of each function and biomass of each taxon, $FR_p$ outcompetes diversity indices in detecting significant microbiome responses to environmental factors, including individuality, biogeography, xenobiotics, and disease. We show that gut inflammation and exposure to specific xenobiotics can significantly diminish the $FR_p$ with no significant change in taxonomic diversity.
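The abstract describes a redundancy metric built from per-taxon biomass, per-function protein abundances, and functional distances between pairs of taxa. The following is only a minimal sketch of one way such a quantity could be computed, not the authors' implementation. It assumes that redundancy is the biomass-weighted sum of functional similarity (one minus distance) over all pairs of distinct taxa, and that the distance is a weighted Jaccard distance between function-abundance profiles. The function names `weighted_jaccard_distance` and `functional_redundancy` are hypothetical.

```python
def weighted_jaccard_distance(u, v):
    """Weighted Jaccard distance between two non-negative abundance profiles.

    Assumption: this distance stands in for the 'functional distance'
    between the proteomes of two taxa mentioned in the abstract.
    """
    shared = sum(min(a, b) for a, b in zip(u, v))
    union = sum(max(a, b) for a, b in zip(u, v))
    # Two all-zero profiles are treated as identical (distance 0).
    return 0.0 if union == 0 else 1.0 - shared / union


def functional_redundancy(biomass, profiles):
    """Sum of (1 - d_ij) * p_i * p_j over all ordered pairs of distinct taxa.

    biomass:  one non-negative biomass value per taxon
    profiles: one function-abundance profile per taxon, same order as biomass
    """
    total = sum(biomass)
    p = [b / total for b in biomass]  # relative biomass of each taxon
    redundancy = 0.0
    for i in range(len(p)):
        for j in range(len(p)):
            if i != j:
                d = weighted_jaccard_distance(profiles[i], profiles[j])
                redundancy += (1.0 - d) * p[i] * p[j]
    return redundancy
```

Under these assumptions, two equally abundant taxa with identical profiles give a value of 0.5, and two taxa with no shared functions give 0, so the value drops as proteomes diverge even when the taxonomic composition is unchanged.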

Subjects

Gastrointestinal Microbiome, Microbiota, Humans, Gastrointestinal Microbiome/physiology, Proteome, Proteomics, Xenobiotics, Feces

MetaProClust-MS1: an MS1 Profiling Approach for Large-Scale Microbiome Screening.

Simopoulos, Caitlin M A; Ning, Zhibin; Li, Leyuan; Khamis, Mona M; Zhang, Xu; Lavallée-Adam, Mathieu; Figeys, Daniel.

mSystems; 7(4): e0038122, 2022 Aug 30.

Article in English | MEDLINE | ID: mdl-35950762

ABSTRACT

Metaproteomics is used to explore the functional dynamics of microbial communities. However, acquiring metaproteomic data by tandem mass spectrometry (MS/MS) is time-consuming and resource-intensive, and there is a demand for computational methods that can be used to reduce these resource requirements. We present MetaProClust-MS1, a computational framework for microbiome feature screening developed to prioritize samples for follow-up MS/MS. In this proof-of-concept study, we tested and compared MetaProClust-MS1 results on gut microbiome data, from fecal samples, acquired using short 15-min MS1-only chromatographic gradients and MS1 spectra from longer 60-min gradients to MS/MS-acquired data. We found that MetaProClust-MS1 identified robust gut microbiome responses caused by xenobiotics with significantly correlated cluster topologies of comparable data sets. We also used MetaProClust-MS1 to reanalyze data from both a clinical MS/MS diagnostic study of pediatric patients with inflammatory bowel disease and an experiment evaluating the therapeutic effects of a small molecule on the brain tissue of Alzheimer's disease mouse models. MetaProClust-MS1 clusters could distinguish between inflammatory bowel disease diagnoses (ulcerative colitis and Crohn's disease) using mucosal-luminal interface samples and identified hippocampal proteome shifts of Alzheimer's disease mouse models after small-molecule treatment. Therefore, we demonstrate that MetaProClust-MS1 can screen both microbiomes and single-species proteomes using only MS1 profiles, and our results suggest that this approach may be generalizable to any proteomics experiment. MetaProClust-MS1 may be especially useful for large-scale metaproteomic screening for the prioritization of samples for further metaproteomic characterization, using MS/MS, for instance, in addition to being a promising novel approach for clinical diagnostic screening.
IMPORTANCE Growing evidence suggests that human gut microbiome composition and function are highly associated with health and disease. As such, high-throughput metaproteomic studies are becoming more common in gut microbiome research. However, using a conventional long liquid chromatography (LC)-MS/MS gradient metaproteomics approach as an initial screen in large-scale microbiome experiments can be slow and expensive. To combat this challenge, we introduce MetaProClust-MS1, a computational framework for microbiome screening using MS1-only profiles. In this proof-of-concept study, we show that MetaProClust-MS1 identifies clusters of gut microbiome treatments using MS1-only profiles similar to those identified using MS/MS. Our approach allows researchers to prioritize samples and treatments of interest for further metaproteomic analyses and may be generally applicable to any proteomic analysis. In particular, this approach may be especially useful for large-scale metaproteomic screening or in clinical settings where rapid diagnostic evidence is required.

Subjects

Alzheimer Disease, Inflammatory Bowel Diseases, Microbiota, Animals, Mice, Humans, Child, Proteomics/methods, Tandem Mass Spectrometry, Proteome

Novel Bioinformatics Strategies Driving Dynamic Metaproteomic Studies.

Simopoulos, Caitlin M A; Figeys, Daniel; Lavallée-Adam, Mathieu.

Methods Mol Biol; 2456: 319-338, 2022.

Article in English | MEDLINE | ID: mdl-35612752

ABSTRACT

Constant improvements in mass spectrometry technologies and laboratory workflows have enabled the proteomics investigation of biological samples of growing complexity. Microbiomes represent such complex samples for which metaproteomics analyses are becoming increasingly popular. Metaproteomics experimental procedures create large amounts of data from which biologically relevant signal must be efficiently extracted to draw meaningful conclusions. Such data processing requires appropriate bioinformatics tools specifically developed for, or capable of handling, metaproteomics data. In this chapter, we outline current and novel tools that can perform the most commonly used steps in the analysis of cutting-edge metaproteomics data, such as peptide and protein identification and quantification, as well as data normalization, imputation, mining, and visualization. We also provide details about the experimental setups in which these tools should be used.

Subjects

Gastrointestinal Microbiome, Microbiota, Computational Biology/methods, Proteomics/methods, Software

iMetaLab Suite: A one-stop toolset for metaproteomics.

Li, Leyuan; Ning, Zhibin; Cheng, Kai; Zhang, Xu; Simopoulos, Caitlin M A; Figeys, Daniel.

Imeta; 1(2): e25, 2022 Jun.

Article in English | MEDLINE | ID: mdl-38868572

ABSTRACT

Metaproteomics is a recently thriving technique that studies the collection of proteins in complex microbiomes of humans, animals, plants, and the environment. The bioinformatics workflow required for metaproteomics research, from the database search and protein quantification to downstream functional and taxonomic analysis, has been challenging and has thus limited the accessibility of metaproteomics to microbiome researchers. To overcome these challenges, we have developed a set of tools named iMetaLab Suite. iMetaLab Suite includes the following components: (1) MetaLab Desktop, an automated database search software that facilitates protein identification and quantitation from microbiomes; (2) the automated iMetaReport that allows users to quickly access database search results and data set profiles; and (3) an interactive online toolset, iMetaShiny, covering the most frequently used functional, taxonomic, and statistical analyses in metaproteomics. iMetaLab Suite is a free, easily accessible, and actively updated toolset available to assist researchers in exploring metaproteomic data.

Coding and long non-coding RNAs provide evidence of distinct transcriptional reprogramming for two ecotypes of the extremophile plant Eutrema salsugineum undergoing water deficit stress.

Simopoulos, Caitlin M A; MacLeod, Mitchell J R; Irani, Solmaz; Sung, Wilson W L; Champigny, Marc J; Summers, Peter S; Golding, G Brian; Weretilnyk, Elizabeth A.

BMC Genomics; 21(1): 396, 2020 Jun 08.

Article in English | MEDLINE | ID: mdl-32513102

ABSTRACT

BACKGROUND: The severity and frequency of drought have increased around the globe, creating challenges in ensuring food security for a growing world population. As a consequence, improving water use efficiency by crops has become an important objective for crop improvement. Some wild crop relatives have adapted to extreme osmotic stresses and can provide valuable insights into traits and genetic signatures that can guide efforts to improve crop tolerance to water deficits. Eutrema salsugineum, a close relative of many cruciferous crops, is a halophytic plant and extremophyte model for abiotic stress research. RESULTS: Using comparative transcriptomics, we show that two E. salsugineum ecotypes display significantly different transcriptional responses towards a two-stage drought treatment. Even before visibly wilting, water deficit led to the differential expression of almost 1,100 genes for an ecotype from the semi-arid, sub-arctic Yukon, Canada, but only 63 genes for an ecotype from the semi-tropical, monsoonal, Shandong, China. After recovery and a second drought treatment, about 5,000 differentially expressed genes were detected in Shandong plants versus 1,900 genes in Yukon plants. Only 13 genes displayed similar drought-responsive patterns for both ecotypes. We detected 1,007 long non-protein coding RNAs (lncRNAs), of which 8% were expressed only in stress-treated plants, a surprising outcome given the documented association between lncRNA expression and stress. Co-expression network analysis of the transcriptomes identified eight gene clusters where at least half of the genes in each cluster were differentially expressed. While many gene clusters were correlated to drought treatments, only a single cluster significantly correlated to drought exposure in both ecotypes. CONCLUSION: Extensive, ecotype-specific transcriptional reprogramming with drought was unexpected given that both ecotypes are adapted to saline habitats providing persistent exposure to osmotic stress. This ecotype-specific response would have escaped notice had we used a single exposure to water deficit. Finally, the apparent capacity to improve tolerance and growth after a drought episode represents an important adaptive trait for a plant that thrives under semi-arid Yukon conditions, and may be similarly advantageous for crop species experiencing stresses attributed to climate change.

Subjects

Brassicaceae/growth & development, Gene Expression Profiling/methods, Long Noncoding RNA/genetics, Messenger RNA/genetics, Brassicaceae/genetics, Canada, Dehydration, Ecotype, Plant Gene Expression Regulation, Gene Regulatory Networks, Plant Leaves/genetics, Plant Leaves/growth & development, Plant RNA/genetics, Salt-Tolerant Plants/genetics, Salt-Tolerant Plants/growth & development, RNA Sequence Analysis, Physiological Stress

pepFunk: a tool for peptide-centric functional analysis of metaproteomic human gut microbiome studies.

Simopoulos, Caitlin M A; Ning, Zhibin; Zhang, Xu; Li, Leyuan; Walker, Krystal; Lavallée-Adam, Mathieu; Figeys, Daniel.

Bioinformatics; 36(14): 4171-4179, 2020 Aug 15.

Article in English | MEDLINE | ID: mdl-32369596

ABSTRACT

MOTIVATION: Enzymatic digestion of proteins before mass spectrometry analysis is a key process in metaproteomic workflows. Canonical metaproteomic data processing pipelines typically involve matching spectra produced by the mass spectrometer to a theoretical spectra database, followed by matching the identified peptides back to parent-proteins. However, the nature of enzymatic digestion produces peptides that can be found in multiple proteins due to conservation or chance, presenting difficulties with protein and functional assignment. RESULTS: To combat this challenge, we developed pepFunk, a peptide-centric metaproteomic workflow focused on the analysis of human gut microbiome samples. Our workflow includes a curated peptide database annotated with Kyoto Encyclopedia of Genes and Genomes (KEGG) terms and a gene set variation analysis-inspired pathway enrichment adapted for peptide-level data. Analysis using our peptide-centric workflow is fast and highly correlated to a protein-centric analysis, and can identify more enriched KEGG pathways than analysis using protein-level data. Our workflow is open source and available as a web application or source code to be run locally. AVAILABILITY AND IMPLEMENTATION: pepFunk is available online as a web application at https://shiny.imetalab.ca/pepFunk/ with open-source code available from https://github.com/northomics/pepFunk. CONTACT: dfigeys@uottawa.ca. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
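The peptide-centric idea in this abstract, annotating peptides directly with KEGG terms instead of first assigning them to parent proteins, can be illustrated with a short sketch. This is not pepFunk's code and does not reproduce its pathway enrichment step. It only shows, under stated assumptions, how peptide intensities might be rolled up to KEGG terms. The function name `aggregate_peptides_to_kegg`, the input dictionaries, and the rule of splitting a peptide's intensity evenly across its annotated terms are all hypothetical.

```python
from collections import defaultdict


def aggregate_peptides_to_kegg(peptide_intensity, peptide_to_kegg):
    """Sum peptide intensities per KEGG term without a protein inference step.

    peptide_intensity: dict mapping peptide sequence -> measured intensity
    peptide_to_kegg:   dict mapping peptide sequence -> list of KEGG terms

    Assumption: a peptide annotated with several terms contributes an equal
    share of its intensity to each; unannotated peptides are skipped.
    """
    totals = defaultdict(float)
    for peptide, intensity in peptide_intensity.items():
        terms = peptide_to_kegg.get(peptide, [])
        if not terms:
            continue  # no annotation available for this peptide
        share = intensity / len(terms)
        for term in terms:
            totals[term] += share
    return dict(totals)


# Usage with made-up peptides and illustrative KEGG identifiers.
example = aggregate_peptides_to_kegg(
    {"PEPA": 10.0, "PEPB": 4.0, "PEPC": 1.0},
    {"PEPA": ["K00001"], "PEPB": ["K00001", "K00002"]},
)
```

Working at the peptide level in this way sidesteps the ambiguity the abstract points out, where one peptide can map back to several parent proteins.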

Subjects

Gastrointestinal Microbiome, Computational Biology, Humans, Peptides, Proteins, Software

Molecular Traits of Long Non-protein Coding RNAs from Diverse Plant Species Show Little Evidence of Phylogenetic Relationships.

Simopoulos, Caitlin M A; Weretilnyk, Elizabeth A; Golding, G Brian.

G3 (Bethesda); 9(8): 2511-2520, 2019 Aug 08.

Article in English | MEDLINE | ID: mdl-31235560

ABSTRACT

Long non-coding RNAs (lncRNAs) represent a diverse class of regulatory loci with roles in development and stress responses throughout all kingdoms of life. LncRNAs, however, remain under-studied in plants compared to animal systems. To address this deficiency, we applied a machine learning prediction tool, Classifying RNA by Ensemble Machine learning Algorithm (CREMA), to analyze RNAseq data from 11 plant species chosen to represent a wide range of evolutionary histories. Transcript sequences of all expressed and/or annotated loci from plants grown in unstressed (control) conditions were assembled and input into CREMA for comparative analyses. On average, 6.4% of the plant transcripts were identified by CREMA as encoding lncRNAs. Gene annotation associated with the transcripts showed that up to 99% of all predicted lncRNAs for Solanum tuberosum and Amborella trichopoda were missing from their reference annotations whereas the reference annotation for the genetic model plant Arabidopsis thaliana contains 96% of all predicted lncRNAs for this species. Thus a reliance on reference annotations for use in lncRNA research in less well-studied plants can be impeded by the near absence of annotations associated with these regulatory transcripts. Moreover, our work using phylogenetic signal analyses suggests that molecular traits of plant lncRNAs display different evolutionary patterns than all other transcripts in plants and have molecular traits that do not follow a classic evolutionary pattern. Specifically, GC content was the only tested trait of lncRNAs with consistently significant and high phylogenetic signal, contrary to high signal in all tested molecular traits for the other transcripts in our tested plant species.

Subjects

Phylogeny, Plants/classification, Plants/genetics, Heritable Quantitative Trait, Long Noncoding RNA/genetics, Plant RNA, Biological Evolution, High-Throughput Nucleotide Sequencing

Prediction of plant lncRNA by ensemble machine learning classifiers.

Simopoulos, Caitlin M A; Weretilnyk, Elizabeth A; Golding, G Brian.

BMC Genomics; 19(1): 316, 2018 May 02.

Article in English | MEDLINE | ID: mdl-29720103

ABSTRACT

BACKGROUND: In plants, long non-protein coding RNAs are believed to have essential roles in development and stress responses. However, relative to advances on discerning biological roles for long non-protein coding RNAs in animal systems, this RNA class in plants is largely understudied. With comparatively few validated plant long non-coding RNAs, research on this potentially critical class of RNA is hindered by a lack of appropriate prediction tools and databases. Supervised learning models trained on data sets of mostly non-validated, non-coding transcripts have been previously used to identify this enigmatic RNA class with applications largely focused on animal systems. Our approach uses a training set composed only of empirically validated long non-protein coding RNAs from plant, animal, and viral sources to predict and rank candidate long non-protein coding gene products for future functional validation. RESULTS: Individual stochastic gradient boosting and random forest classifiers trained on only empirically validated long non-protein coding RNAs were constructed. In order to use the strengths of multiple classifiers, we combined multiple models into a single stacking meta-learner. This ensemble approach benefits from the diversity of several learners to effectively identify putative plant long non-coding RNAs from transcript sequence features. When the predicted genes identified by the ensemble classifier were compared to those listed in GreeNC, an established plant long non-coding RNA database, overlap for predicted genes from Arabidopsis thaliana, Oryza sativa and Eutrema salsugineum ranged from 51 to 83% with the highest agreement in Eutrema salsugineum. Most of the highest ranking predictions from Arabidopsis thaliana were annotated as potential natural antisense genes, pseudogenes, transposable elements, or simply computationally predicted hypothetical proteins. Due to the nature of this tool, the model can be updated as new long non-protein coding transcripts are identified and functionally verified. CONCLUSIONS: This ensemble classifier is an accurate tool that can be used to rank long non-protein coding RNA predictions for use in conjunction with gene expression studies. Selection of plant transcripts with a high potential for regulatory roles as long non-protein coding RNAs will advance research in the elucidation of long non-protein coding RNA function.

Subjects

Computational Biology/methods, Machine Learning, Long Noncoding RNA/genetics, Open Reading Frames/genetics, Plant RNA/genetics, Stochastic Processes
