1.
Bioinformatics ; 36(10): 3043-3048, 2020 05 01.
Article in English | MEDLINE | ID: mdl-32108861

ABSTRACT

MOTIVATION: Many methods for microbial protein subcellular localization (SCL) prediction exist; however, none is readily available for analysis of metagenomic sequence data, despite growing interest from researchers studying microbial communities in humans, agri-food relevant organisms and in other environments (e.g. for identification of cell-surface biomarkers for rapid protein-based diagnostic tests). We wished to also identify new markers of water quality from freshwater samples collected from pristine versus pollution-impacted watersheds. RESULTS: We report PSORTm, the first bioinformatics tool designed for prediction of diverse bacterial and archaeal protein SCL from metagenomics data. PSORTm incorporates components of PSORTb, one of the most precise and widely used protein SCL predictors, with an automated classification by cell envelope. An evaluation using 5-fold cross-validation with in silico-fragmented sequences with known localization showed that PSORTm maintains PSORTb's high precision, while sensitivity increases proportionately with metagenomic sequence fragment length. PSORTm's read-based analysis was similar to PSORTb-based analysis of metagenome-assembled genomes (MAGs); however, the latter requires non-trivial manual classification of each MAG by cell envelope, and cannot make use of unassembled sequences. Analysis of the watershed samples revealed the importance of normalization and identified potential biomarkers of water quality. This method should be useful for examining a wide range of microbial communities, including human microbiomes, and other microbiomes of medical, environmental or industrial importance. AVAILABILITY AND IMPLEMENTATION: Documentation, source code and docker containers are available for running PSORTm locally at https://www.psort.org/psortm/ (freely available, open-source software under GNU General Public License Version 3). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
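The evaluation on in silico-fragmented sequences can be illustrated with a minimal sketch. The abstract does not say how fragments were generated, so the fixed-length sliding window below, the function name `fragment_sequence`, and its `length` and `step` parameters are all assumptions made for illustration and are not part of PSORTm.

```python
def fragment_sequence(seq, length, step=None):
    """Cut `seq` into fragments of exactly `length` residues.

    Hypothetical helper: windows start every `step` residues
    (default: non-overlapping), and a trailing piece shorter
    than `length` is dropped.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if step is None:
        step = length
    if step <= 0:
        raise ValueError("step must be positive")
    if len(seq) < length:
        return []
    last_start = len(seq) - length
    return [seq[i:i + length] for i in range(0, last_start + 1, step)]
```

Each fragment would inherit the known localization of its parent protein, so that precision and sensitivity can be tabulated per fragment length.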


Subject(s)
Archaea , Metagenomics , Archaea/genetics , Bacteria/genetics , Humans , Metagenome , Software
2.
mSphere ; 2(4)2017.
Article in English | MEDLINE | ID: mdl-28776042

ABSTRACT

Legionella spp. present in some human-made water systems can cause Legionnaires' disease in susceptible individuals. Although legionellae have been isolated from the natural environment, variations in the organism's abundance over time and its relationship to aquatic microbiota are poorly understood. Here, we investigated the presence and diversity of legionellae through 16S rRNA gene amplicon and metagenomic sequencing of DNA from isolates collected from seven sites in three watersheds with varied land uses over a period of 1 year. Legionella spp. were found in all watersheds and sampling sites, comprising up to 2.1% of the bacterial community composition. The relative abundance of Legionella tended to be higher in pristine sites than in sites affected by agricultural activity. The relative abundance levels of Amoebozoa, some of which are natural hosts of legionellae, were similarly higher in pristine sites. Compared to other bacterial genera detected, Legionella had both the highest richness and highest alpha diversity. Our findings indicate that a highly diverse population of legionellae may be found in a variety of natural aquatic sources. Further characterization of these diverse natural populations of Legionella will help inform prevention and control efforts aimed at reducing the risk of Legionella colonization of built environments, which could ultimately decrease the risk of human disease. IMPORTANCE Many species of Legionella can cause Legionnaires' disease, a significant cause of bacterial pneumonia. Legionella in human-made water systems such as cooling towers and building plumbing systems are the primary sources of Legionnaires' disease outbreaks. In this temporal study of natural aquatic environments, Legionella relative abundance was shown to vary in watersheds associated with different land uses. Analysis of the Legionella sequences detected at these sites revealed highly diverse populations that included potentially novel Legionella species. 
These findings have important implications for understanding the ecology of Legionella and control measures for this pathogen that are aimed at reducing human disease.

3.
Microbiome ; 4(1): 20, 2016 07 19.
Article in English | MEDLINE | ID: mdl-27391119

ABSTRACT

BACKGROUND: Studies of environmental microbiota typically target only specific groups of microorganisms, with most focusing on bacteria through taxonomic classification of 16S rRNA gene sequences. For a more holistic understanding of a microbiome, a strategy to characterize the viral, bacterial, and eukaryotic components is necessary. RESULTS: We developed a method for metagenomic and amplicon-based analysis of freshwater samples involving the concentration and size-based separation of eukaryotic, bacterial, and viral fractions. Next-generation sequencing and culture-independent approaches were used to describe and quantify microbial communities in watersheds with different land use in British Columbia. Deep amplicon sequencing was used to investigate the distribution of certain viruses (g23 and RdRp), bacteria (16S rRNA and cpn60), and eukaryotes (18S rRNA and ITS). Metagenomic sequencing was used to further characterize the gene content of the bacterial and viral fractions at both taxonomic and functional levels. CONCLUSION: This study provides a systematic approach to separate and characterize eukaryotic-, bacterial-, and viral-sized particles. Methodologies described in this research have been applied in temporal and spatial studies to study the impact of land use on watershed microbiomes in British Columbia.


Subject(s)
Bacteria/classification , Eukaryota/classification , Fresh Water/microbiology , Microbiota/genetics , Viruses/classification , Water Pollution/analysis , Bacteria/genetics , Base Sequence/genetics , British Columbia , DNA, Intergenic/genetics , Eukaryota/genetics , High-Throughput Nucleotide Sequencing/methods , Metagenome/genetics , RNA, Ribosomal, 16S/genetics , RNA, Ribosomal, 18S/genetics , Sequence Analysis, DNA/methods , Viruses/genetics , Water Microbiology
4.
Nucleic Acids Res ; 44(D1): D663-8, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26602691

ABSTRACT

Protein subcellular localization (SCL) is important for understanding protein function and for genome annotation, and has practical applications such as identification of potential vaccine components or diagnostic/drug targets. PSORTdb (http://db.psort.org) comprises manually curated SCLs for proteins which have been experimentally verified (ePSORTdb), as well as pre-computed SCL predictions for deduced proteomes from bacterial and archaeal complete genomes available from NCBI (cPSORTdb). We now report PSORTdb 3.0. It features improvements increasing user-friendliness, and further expands both ePSORTdb and cPSORTdb with a focus on improving protein SCL data in cases where it is most difficult: proteins associated with non-classical Gram-positive/Gram-negative/Gram-variable cell envelopes. ePSORTdb data curation was expanded, including adding in additional cell envelope localizations, and incorporating markers for cPSORTdb to automatically computationally identify if new genomes to be analysed fall into certain atypical cell envelope categories (i.e. Deinococcus-Thermus, Thermotogae, Corynebacteriales/Corynebacterineae, including Mycobacteria). The number of predicted proteins in cPSORTdb has increased from 3,700,000 when PSORTdb 2.0 was released to over 13,000,000 currently. PSORTdb 3.0 will be of wider use to researchers studying a greater diversity of monoderm or diderm microbes, including medically, agriculturally and industrially important species that have non-classical outer membranes or other cell envelope features.


Subject(s)
Archaeal Proteins/genetics , Bacterial Proteins/genetics , Databases, Protein , Membrane Proteins/genetics , Archaeal Proteins/analysis , Bacterial Proteins/analysis , Cell Membrane/chemistry , Cell Wall/chemistry , Genome, Archaeal , Genome, Bacterial , Membrane Proteins/analysis
5.
BMC Bioinformatics ; 16: 363, 2015 Nov 04.
Article in English | MEDLINE | ID: mdl-26537885

ABSTRACT

BACKGROUND: The field of metagenomics (study of genetic material recovered directly from an environment) has grown rapidly, with many bioinformatics analysis methods being developed. To ensure appropriate use of such methods, robust comparative evaluation of their accuracy and features is needed. For taxonomic classification of sequence reads, such evaluation should include use of clade exclusion, which better evaluates a method's accuracy when identical sequences are not present in any reference database, as is common in metagenomic analysis. To date, relatively small evaluations have been performed, with evaluation approaches like clade exclusion limited to assessment of new methods by the authors of the given method. What is needed is a rigorous, independent comparison between multiple major methods, using the same in silico and in vitro test datasets, with and without approaches like clade exclusion, to better characterize accuracy under different conditions. RESULTS: An overview of the features of 38 bioinformatics methods is provided, evaluating accuracy with a focus on 11 programs that have reference databases that can be modified and therefore most robustly evaluated with clade exclusion. Taxonomic classification of sequence reads was evaluated using both in silico and in vitro mock bacterial communities. Clade exclusion was used at taxonomic levels from species to class, identifying how well methods perform in progressively more difficult scenarios. A wide range of variability was found in the sensitivity, precision, overall accuracy, and computational demand for the programs evaluated. In experiments where distilled water was spiked with only 11 bacterial species, frequently dozens to hundreds of species were falsely predicted by the most popular programs. The different features of each method (forces predictions or not, etc.) are summarized, and additional analysis considerations discussed.
CONCLUSIONS: The accuracy of shotgun metagenomics classification methods varies widely. No one program clearly outperformed others in all evaluation scenarios; rather, the results illustrate the strengths of different methods for different purposes. Researchers must appreciate method differences, choosing the program best suited for their particular analysis to avoid very misleading results. Use of standardized datasets for method comparisons is encouraged, as is use of mock microbial community controls suitable for a particular metagenomic analysis.
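Clade exclusion and the two headline metrics can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the study's exact metric definitions or data layout, so the helper names (`exclude_clade`, `precision_sensitivity`), the dictionary-based reference format, and the definitions used here (precision = correct / reads given a prediction; sensitivity = correct / all reads) are hypothetical.

```python
def exclude_clade(reference, rank, name):
    """Drop reference entries belonging to clade `name` at `rank`.

    `reference` maps a sequence ID to a lineage dict such as
    {"species": "...", "genus": "...", "class": "..."} (assumed layout).
    """
    return {
        seq_id: lineage
        for seq_id, lineage in reference.items()
        if lineage.get(rank) != name
    }


def precision_sensitivity(predicted, truth):
    """Score read-level predictions against known labels.

    `predicted` maps read ID -> taxon, or None when the method
    declines to classify; `truth` maps read ID -> true taxon.
    """
    correct = sum(1 for read, taxon in truth.items()
                  if predicted.get(read) == taxon)
    called = sum(1 for read in truth
                 if predicted.get(read) is not None)
    precision = correct / called if called else 0.0
    sensitivity = correct / len(truth) if truth else 0.0
    return precision, sensitivity
```

Under these definitions, a method that leaves uncertain reads unclassified can score high precision with low sensitivity, while one that forces a prediction for every read trades precision for sensitivity, which is one way to read the "forces predictions or not" distinction in the abstract.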


Subject(s)
Bacteria/genetics , Computational Biology/methods , Computer Simulation , Metagenomics/methods , Base Sequence , Databases, Genetic , Phylogeny , Species Specificity
6.
Front Microbiol ; 6: 1405, 2015.
Article in English | MEDLINE | ID: mdl-26733955

ABSTRACT

Select bacteria, such as Escherichia coli or coliforms, have been widely used as sentinels of low water quality; however, there are concerns regarding their predictive accuracy for the protection of human and environmental health. To develop improved monitoring systems, a greater understanding of bacterial community structure, function, and variability across time is required in the context of different pollution types, such as agricultural and urban contamination. Here, we present a year-long survey of free-living bacterial DNA collected from seven sites along rivers in three watersheds with varying land use in Southwestern Canada. This is the first study to examine the bacterial metagenome in flowing freshwater (lotic) environments over such a time span, providing an opportunity to describe bacterial community variability as a function of land use and environmental conditions. Characteristics of the metagenomic data, such as sequence composition and average genome size (AGS), vary with sampling site, environmental conditions, and water chemistry. For example, AGS was correlated with hours of daylight in the agricultural watershed and, across the agriculturally and urban-affected sites, k-mer composition clustering corresponded to nutrient concentrations. In addition to indicating a community shift, this change in AGS has implications in terms of the normalization strategies required, and considerations surrounding such strategies in general are discussed. When comparing abundances of gene functional groups between high- and low-quality water samples collected from an agricultural area, the latter had a higher abundance of nutrient metabolism and bacteriophage groups, possibly reflecting an increase in agricultural runoff. This work presents a valuable dataset representing a year of monthly sampling across watersheds and an analysis targeted at establishing a foundational understanding of how bacterial lotic communities vary across time and land use. 
The results provide important context for future studies, including further analyses of watershed ecosystem health, and the identification and development of biomarkers for improved water quality monitoring systems.
