Results 1 - 20 of 58
1.
Nucleic Acids Res ; 50(W1): W541-W550, 2022 07 05.
Article in English | MEDLINE | ID: mdl-35639517

ABSTRACT

Most bacteria and archaea possess multiple antiviral defence systems that protect against infection by phages, archaeal viruses and mobile genetic elements. Our understanding of the diversity of defence systems has increased greatly in the last few years, and many more systems likely await discovery. To identify defence-related genes, we recently developed the Prokaryotic Antiviral Defence LOCator (PADLOC) bioinformatics tool. To increase the accessibility of PADLOC, we describe here the PADLOC web server (freely available at https://padloc.otago.ac.nz), allowing users to analyse whole genomes, metagenomic contigs, plasmids, phages and archaeal viruses. The web server includes a more than 5-fold increase in defence system types detected (since the first release) and expanded functionality enabling detection of CRISPR arrays and retron ncRNAs. Here, we provide user information such as input options, description of the multiple outputs, limitations and considerations for interpretation of the results, and guidance for subsequent analyses. The PADLOC web server also houses a precomputed database of the defence systems in >230,000 RefSeq genomes. These data reveal two taxa, Campylobacterota and Spirochaetota, with unusual defence system diversity and abundance. Overall, the PADLOC web server provides a convenient and accessible resource for the detection of antiviral defence systems.


Subject(s)
Archaea , Bacteria , Genome, Microbial , Genomics , Internet , Software , Archaea/genetics , Archaea/virology , Bacteria/genetics , Bacteria/virology , Bacteriophages/immunology , Genome, Microbial/genetics , Plasmids/genetics , Prokaryotic Cells/metabolism , Prokaryotic Cells/virology , Computers , Genomics/methods
2.
Sci Rep ; 12(1): 824, 2022 01 17.
Article in English | MEDLINE | ID: mdl-35039534

ABSTRACT

Metagenomic sequencing methods provide considerable genomic information regarding human microbiomes, enabling us to discover and understand microbial diseases. Compositional differences have been reported between patients and healthy people, which could be used in the diagnosis of patients. Despite significant progress in this regard, the accuracy of these tools needs to be improved for applications in diagnostics and therapeutics. MDL4Microbiome, the method developed herein, demonstrated high accuracy in predicting disease status by using various features from metagenome sequences and a multimodal deep learning model. We propose combining three different features, i.e., conventional taxonomic profiles, genome-level relative abundance, and metabolic functional characteristics, to enhance classification accuracy. This deep learning model enabled the construction of a classifier that combines these various modalities encoded in the human microbiome. We achieved accuracies of 0.98, 0.76, 0.84, and 0.97 for predicting patients with inflammatory bowel disease, type 2 diabetes, liver cirrhosis, and colorectal cancer, respectively; these are comparable or higher than classical machine learning methods. A deeper analysis was also performed on the resulting sets of selected features to understand the contribution of their different characteristics. MDL4Microbiome is a classifier with higher or comparable accuracy compared with other machine learning methods, which offers perspectives on feature generation with metagenome sequences in deep learning models and their advantages in the classification of host disease status.


Subject(s)
Colorectal Neoplasms/microbiology , Deep Learning , Diabetes Mellitus, Type 2/microbiology , Genome, Microbial/genetics , Healthy Volunteers , Inflammatory Bowel Diseases/microbiology , Liver Cirrhosis/microbiology , Metagenome/genetics , Metagenomics/methods , Microbiota/genetics , Colorectal Neoplasms/diagnosis , Diabetes Mellitus, Type 2/diagnosis , Humans , Inflammatory Bowel Diseases/diagnosis , Liver Cirrhosis/diagnosis
3.
Sci Rep ; 12(1): 938, 2022 01 18.
Article in English | MEDLINE | ID: mdl-35042879

ABSTRACT

Molecular epidemiology using genomic data can help identify relationships between malaria parasite population structure and malaria transmission intensity, and ultimately help generate actionable data to assess the effectiveness of malaria control strategies. Genomic data, coupled with geographic information systems data, can further identify clusters or hotspots of malaria transmission, parasite genetic and spatial connectivity, and parasite movement by human or mosquito mobility over time and space. In this study, we performed longitudinal genomic surveillance in a cohort of 70 participants over four years from different neighborhoods and households in Thiès, Senegal, a region of exceptionally low malaria transmission (entomological inoculation rate less than 1). Genetic identity (identity by state, IBS) was established using a 24-single nucleotide polymorphism molecular barcode, identity by descent was calculated from whole genome sequence data, and a hierarchical Bayesian regression model was used to establish genetic and spatial relationships. Our results show clustering of genetically similar parasites within households and a decline in genetic similarity of parasites with increasing distance. One household showed extremely high diversity and warrants further investigation as to the source of these diverse genetic types. This study illustrates the utility of genomic data with traditional epidemiological approaches for surveillance and detection of trends and patterns in malaria transmission not only by neighborhood but also by household. This approach can be implemented regionally and countrywide to strengthen and support malaria control and elimination efforts.
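To make the identity-by-state comparison concrete, the following is a minimal sketch of how the similarity of two SNP barcodes could be scored. It is not the study's code: the function name, the use of "X" and "N" as codes for uncallable positions, and both example barcodes are assumptions made for illustration.

```python
def ibs_fraction(barcode_a: str, barcode_b: str, missing: str = "XN") -> float:
    """Fraction of comparable SNP positions at which two barcodes match.

    Positions where either barcode carries a character listed in `missing`
    are skipped (an assumed convention, not taken from the abstract).
    """
    if len(barcode_a) != len(barcode_b):
        raise ValueError("barcodes must have the same number of SNP positions")
    comparable = [
        (a, b)
        for a, b in zip(barcode_a.upper(), barcode_b.upper())
        if a not in missing and b not in missing
    ]
    if not comparable:
        raise ValueError("no comparable positions between the two barcodes")
    matches = sum(a == b for a, b in comparable)
    return matches / len(comparable)


# Two made-up 24-position barcodes that differ at 3 positions.
sample_1 = "ACGTACGTACGTACGTACGTACGT"
sample_2 = "ACGTACGAACGTACCTACGTACGA"
print(ibs_fraction(sample_1, sample_2))  # 21 of 24 positions match
```

A pairwise matrix of such fractions, computed over all samples, is the kind of input that could then be related to household membership or geographic distance.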


Subject(s)
Genomics/methods , Malaria/transmission , Plasmodium falciparum/genetics , Adolescent , Animals , Child , Child, Preschool , Cluster Analysis , Cohort Studies , Female , Genome, Microbial/genetics , Genotype , Humans , Malaria/epidemiology , Malaria/parasitology , Malaria, Falciparum/parasitology , Male , Molecular Epidemiology/methods , Physical Distancing , Polymorphism, Single Nucleotide/genetics , Senegal/epidemiology
4.
Sci Rep ; 11(1): 20740, 2021 10 20.
Article in English | MEDLINE | ID: mdl-34671046

ABSTRACT

Assembling high-quality microbial genomes using only cost-effective Nanopore long-read systems such as Flongle is important for accelerating microbial genome research, and the most critical step in doing so is polishing. In this study, we evaluated eight state-of-the-art Nanopore polishing tools, and the available combinations of them, for microbial genome assembly, based on BUSCO and Prokka gene prediction. In the evaluation of individual tools, Homopolish, PEPPER, and Medaka demonstrated better results than others. In combination polishing, a second round of Homopolish and the PEPPER × Medaka combination also showed better results than others. However, individual tools and combinations have specific limitations on usage and results. Depending on the target organism and the purpose of the downstream research, it is confirmed that there remain some difficulties in perfectly replacing the hybrid polishing carried out by the addition of short reads. Nevertheless, through continuous improvement of the protein pores, related base-calling algorithms, and polishing tools based on improved error models, a high-quality microbial genome can be achieved using only Nanopore reads without the production of additional short-read data. The polishing strategy proposed in this study is expected to provide useful information for assembling the microbial genome using only Nanopore reads depending on the target microorganism and the purpose of the research.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Nanopore Sequencing/methods , Sequence Analysis, DNA/methods , Algorithms , Genome, Microbial/genetics , Genomics/methods , Nanopores
5.
J Microbiol Biotechnol ; 31(7): 903-911, 2021 Jul 28.
Article in English | MEDLINE | ID: mdl-34261850

ABSTRACT

Previous studies have modified microbial genomes by introducing gene cassettes containing selectable markers and homologous DNA fragments. However, this requires several steps including homologous recombination and excision of unnecessary DNA regions, such as selectable markers from the modified genome. Further, genomic manipulation often leaves scars and traces that interfere with downstream iterative genome engineering. A decade ago, the CRISPR/Cas system (also known as the bacterial adaptive immune system) revolutionized genome editing technology. Among the various CRISPR nucleases of numerous bacteria and archaea, the Cas9 and Cas12a (Cpf1) systems have been largely adopted for genome editing in all living organisms due to their simplicity, as they consist of a single polypeptide nuclease with a target-recognizing RNA. However, accurate and fine-tuned genome editing remains challenging due to mismatch tolerance and protospacer adjacent motif (PAM)-dependent target recognition. Therefore, this review describes how to overcome the aforementioned hurdles, which especially affect genome editing in higher organisms. Additionally, the biological significance of CRISPR-mediated microbial genome editing is discussed, and future research and development directions are also proposed.
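As an illustration of PAM-dependent target recognition, the sketch below scans the forward strand of a DNA string for 20-nt protospacers immediately followed by an NGG PAM, the motif recognized by Streptococcus pyogenes Cas9. The function name, the 20-nt default, and the example sequence are assumptions for illustration and are not taken from the review; the reverse strand and the 5' T-rich PAM of Cas12a are not handled.

```python
import re


def find_ngg_targets(sequence: str, protospacer_length: int = 20):
    """Return (start, protospacer, pam) tuples for forward-strand NGG sites.

    `start` is the 0-based index of the first protospacer base.
    """
    sequence = sequence.upper()
    targets = []
    # A lookahead is used so that overlapping PAM candidates are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", sequence):
        pam_start = match.start()
        if pam_start >= protospacer_length:
            start = pam_start - protospacer_length
            targets.append((start, sequence[start:pam_start], match.group(1)))
    return targets


# Made-up sequence: a 20-nt stretch followed by the PAM "AGG".
example = "ACGT" * 5 + "AGG" + "TT"
for start, protospacer, pam in find_ngg_targets(example):
    print(start, protospacer, pam)
```

A scan like this only lists candidate sites; it says nothing about mismatch tolerance, which the abstract names as the other obstacle to accurate editing.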


Subject(s)
CRISPR-Cas Systems , Gene Editing , Genome, Microbial/genetics , Base Pair Mismatch , CRISPR-Associated Protein 9/chemistry , CRISPR-Associated Protein 9/genetics , CRISPR-Associated Protein 9/metabolism , Clustered Regularly Interspaced Short Palindromic Repeats/genetics , Nucleotide Motifs , RNA, Guide, Kinetoplastida/chemistry , RNA, Guide, Kinetoplastida/metabolism
6.
Methods Mol Biol ; 2328: 277-285, 2021.
Article in English | MEDLINE | ID: mdl-34251633

ABSTRACT

The identification and characterization of non-coding RNAs (ncRNAs) in prokaryotes is an important step in the study of the interaction of these molecules with mRNAs or target proteins in the post-transcriptional regulation process. Here, we describe one of the main in silico prediction methods in prokaryotes, using the TargetRNA2 tool to predict target mRNAs.


Subject(s)
Computational Biology/methods , Gene Expression Profiling/methods , Genome, Microbial/genetics , Prokaryotic Cells/metabolism , RNA, Untranslated/metabolism , Computer Simulation , Databases, Genetic , Molecular Sequence Annotation , RNA, Untranslated/genetics , RNA-Seq , Software
7.
Mol Biotechnol ; 63(6): 459-476, 2021 Jun.
Article in English | MEDLINE | ID: mdl-33774733

ABSTRACT

Clustered regularly interspaced short palindromic repeats (CRISPR) and their associated Cas proteins form the basis of a rapidly growing technique for genome editing and modulation of transcription in several microbes. Successful engineering in microbes requires an emphasis on efficiency and precise targeting, which can be achieved using the CRISPR/Cas system. Hence, this type of system is used to modify the genomes of several microbes such as yeast and bacteria. In recent years, CRISPR/Cas systems have been chosen for metabolic engineering in microbes due to their specificity, orthogonality, and efficacy. Therefore, we review the schemes that have been adopted for implementing the CRISPR/Cas system for genome modification and metabolic engineering in yeast and bacteria. In this review, we highlight applications of the CRISPR/Cas system that have been used for the production of chemically and biologically important small molecules in microbial systems.


Subject(s)
CRISPR-Cas Systems/genetics , Gene Editing/methods , Metabolic Engineering/methods , Small Molecule Libraries/metabolism , Bacteria/genetics , Genome, Microbial/genetics , Yeasts/genetics
8.
PLoS Comput Biol ; 17(2): e1008727, 2021 02.
Article in English | MEDLINE | ID: mdl-33635857

ABSTRACT

Low-cost, high-throughput sequencing has led to an enormous increase in the number of sequenced microbial genomes, with well over 100,000 genomes in public archives today. Automatic genome annotation tools are integral to understanding these organisms, yet older gene finding methods must be retrained on each new genome. We have developed a universal model of prokaryotic genes by fitting a temporal convolutional network to amino-acid sequences from a large, diverse set of microbial genomes. We incorporated the new model into a gene finding system, Balrog (Bacterial Annotation by Learned Representation Of Genes), which does not require genome-specific training and which matches or outperforms other state-of-the-art gene finding tools. Balrog is freely available under the MIT license at https://github.com/salzberg-lab/Balrog.


Subject(s)
Gene Expression Profiling , Genome, Bacterial , Genome, Microbial/genetics , Genomics/methods , Prokaryotic Cells , Algorithms , Computational Biology , Computer Simulation , Genome , Genome, Archaeal , Molecular Sequence Annotation , Open Reading Frames , Programming Languages , Protein Biosynthesis , Software
9.
BMC Bioinformatics ; 22(1): 11, 2021 Jan 06.
Article in English | MEDLINE | ID: mdl-33407081

ABSTRACT

BACKGROUND: High-throughput sequencing has increased the number of available microbial genomes recovered from isolates, single cells, and metagenomes. Accordingly, fast and comprehensive functional gene annotation pipelines are needed to analyze and compare these genomes. Although several approaches exist for genome annotation, these are typically not designed for easy incorporation into analysis pipelines, do not combine results from different annotation databases or offer easy-to-use summaries of metabolic reconstructions, and typically require large amounts of computing power for high-throughput analysis not available to the average user. RESULTS: Here, we introduce MicrobeAnnotator, a fully automated, easy-to-use pipeline for the comprehensive functional annotation of microbial genomes that combines results from several reference protein databases and returns the matching annotations together with key metadata such as the interlinked identifiers of matching reference proteins from multiple databases [KEGG Orthology (KO), Enzyme Commission (E.C.), Gene Ontology (GO), Pfam, and InterPro]. Further, the functional annotations are summarized into Kyoto Encyclopedia of Genes and Genomes (KEGG) modules as part of a graphical output (heatmap) that allows the user to quickly detect differences among (multiple) query genomes and cluster the genomes based on their metabolic similarity. MicrobeAnnotator is implemented in Python 3 and is freely available under an open-source Artistic License 2.0 from https://github.com/cruizperez/MicrobeAnnotator . CONCLUSIONS: We demonstrated the capabilities of MicrobeAnnotator by annotating 100 Escherichia coli and 78 environmental Candidate Phyla Radiation (CPR) bacterial genomes and comparing the results to those of other popular tools. 
We showed that the use of multiple annotation databases allows MicrobeAnnotator to recover more annotations per genome compared to faster tools that use reduced databases and is computationally efficient for use in personal computers. The output of MicrobeAnnotator can be easily incorporated into other analysis pipelines while the results of other annotation tools can be seamlessly incorporated into MicrobeAnnotator to generate summary plots.


Subject(s)
Genome, Microbial/genetics , Genomics/methods , Molecular Sequence Annotation/methods , Software , Escherichia coli/genetics
10.
Curr Opin Chem Biol ; 60: 47-54, 2021 02.
Article in English | MEDLINE | ID: mdl-32853968

ABSTRACT

The advent of the genomic era has opened up enormous possibilities for the discovery of new natural products. Also known as specialized metabolites, these compounds produced by bacteria, fungi, and plants have long been sought for their bioactive properties. Innovations in both DNA sequencing technologies and bioinformatics now allow the wealth of sequence data to be mined at both the genome and metagenome levels for new specialized metabolites. However, a key problem that remains is rapidly and efficiently linking these identified genes to their corresponding compounds. Within this review, we provide specific examples of studies that have used the power of genomic or metagenomic data to overcome these problems and identify new small molecules and their biosynthetic pathways.


Subject(s)
Biological Products , Data Mining/methods , Drug Discovery/methods , Genome, Microbial/genetics , Biological Products/metabolism
11.
Trends Biotechnol ; 39(2): 165-180, 2021 02.
Article in English | MEDLINE | ID: mdl-32680590

ABSTRACT

Genome engineering is crucial for answering fundamental questions about, and exploring practical applications of, microorganisms. Various microbial genome-engineering tools, including CRISPR/Cas-enhanced homologous recombination (HR), have been developed, with ever-improving simplicity, efficiency, and applicability. Recently, a powerful emerging technology based on CRISPR/Cas-nucleobase deaminase fusions, known as base editing, opened new avenues for microbial genome engineering. Base editing enables nucleotide transition without inducing lethal double-stranded (ds)DNA cleavage, adding foreign donor DNA, or depending on inefficient HR. Here, we review ongoing efforts to develop and apply base editing to engineer industrially and clinically relevant microorganisms. We also summarize bioinformatics tools that would greatly facilitate guide (g)RNA design and sequencing data analysis and discuss the future challenges and prospects associated with this technology.


Subject(s)
CRISPR-Cas Systems , Gene Editing , Genome, Microbial , Gene Editing/trends , Genome, Microbial/genetics , Homologous Recombination , Technology/trends
12.
Environ Microbiol ; 22(9): 4000-4013, 2020 09.
Article in English | MEDLINE | ID: mdl-32761733

ABSTRACT

Assembling microbial and viral genomes from metagenomes is a powerful and appealing method to understand structure-function relationships in complex environments. To compare the recovery of genomes from microorganisms and their viruses from groundwater, we generated shotgun metagenomes with Illumina sequencing accompanied by long reads derived from the Oxford Nanopore Technologies (ONT) sequencing platform. Assembly and metagenome-assembled genome (MAG) metrics for both microbes and viruses were determined from an Illumina-only assembly, ONT-only assembly, and a hybrid assembly approach. The hybrid approach recovered 2× more mid to high-quality MAGs compared to the Illumina-only approach and 4× more than the ONT-only approach. A similar number of viral genomes were reconstructed using the hybrid and ONT methods, and both recovered nearly fourfold more viral genomes than the Illumina-only approach. While yielding fewer MAGs, the ONT-only approach generated MAGs with a high probability of containing rRNA genes, 3× higher than either of the other methods. Of the shared MAGs recovered from each method, the ONT-only approach generated the longest and least fragmented MAGs, while the hybrid approach yielded the most complete. This work provides quantitative data to inform a cost-benefit analysis of the decision to supplement shotgun metagenomic projects with long reads towards the goal of recovering genomes from environmentally abundant groups.


Subject(s)
Genome, Microbial/genetics , Groundwater/microbiology , Metagenome/genetics , Nanopore Sequencing , Groundwater/virology , High-Throughput Nucleotide Sequencing , Metagenomics , Water Microbiology , Whole Genome Sequencing
13.
ILAR J ; 60(2): 289-297, 2020 10 19.
Article in English | MEDLINE | ID: mdl-32706377

ABSTRACT

Our bodies and those of our animal research subjects are colonized by bacterial communities that occupy virtually every organ system, including many previously considered sterile. These bacteria reside as complex communities that are collectively referred to as microbiota. Prior to the turn of the century, characterization of these communities was limited by a reliance on culture of organisms on a battery of selective media. It was recognized that the vast majority of microbes, especially those occupying unique niches of the body such as the anaerobic environment of the intestinal tract, were uncultivatable. However, with the onset and advancement of next-generation sequencing technology, we are now capable of characterizing these complex communities without the need to cultivate, and this has resulted in an explosion of information and new challenges in interpreting data generated about, and in the context of, these complex communities. We have long known that these microbial communities often exist in an intricate balance that, if disrupted (ie, dysbiosis), can lead to disease or increased susceptibility to disease. Because of many functional redundancies, the makeup of these colonies can vary dramatically within healthy individuals [1]. However, there is growing evidence that subtle differences can alter the phenotype of various animal models, which may translate to the varying susceptibility to disease seen in the human population. In this manuscript, we discuss how to include complex microbiota as a consideration in experimental design and model reproducibility and how to exploit the extensive variation that exists in contemporary rodent research colonies. Our focus will be the intestinal or gut microbiota (GM), but it should be recognized that microbial communities exist in many other body compartments and these too likely influence health and disease [2, 3]. 
Much like host genetics, can we one day harness the vast genetic capacity of the microbes we live with in ways that will benefit human and animal health?


Subject(s)
Gastrointestinal Microbiome/genetics , Genome, Microbial/genetics , Animals , Humans , Models, Animal
14.
Sci Rep ; 10(1): 7712, 2020 05 07.
Article in English | MEDLINE | ID: mdl-32382098

ABSTRACT

The annotation of short-read metagenomes is an essential process to understand the functional potential of sequenced microbial communities. Annotation techniques based solely on the identification of local matches tend to confound local sequence similarity and overall protein homology and thus do not mirror the complex multidomain architecture and the shuffling of functional domains in many protein families. Here, we present MetaGeneHunt to identify specific protein domains and to normalize the hit-counts based on the domain length. We used MetaGeneHunt to investigate the potential for carbohydrate processing in the mouse gastrointestinal tract. We sampled, sequenced, and analyzed the microbial communities associated with the bolus in the stomach, intestine, cecum, and colon of five captive mice. Focusing on Glycoside Hydrolases (GHs), we found that, across samples, 58.3% of the 4,726,023 short-read sequences matching a GH domain-containing protein were located outside the domain of interest. Next, before comparing the samples, the counts of localized hits matching the domains of interest were normalized to account for the corresponding domain length. Microbial communities in the intestine and cecum displayed characteristic GH profiles matching distinct microbial assemblages. Conversely, the stomach and colon were associated with structurally and functionally more diverse and variable microbial communities. Across samples, despite fluctuations, changes in the functional potential for carbohydrate processing correlated with changes in community composition. Overall, MetaGeneHunt is a new way to quickly and precisely identify discrete protein domains in sequenced metagenomes processed with MG-RAST. In addition, using the sister program "GeneHunt" to create a custom Reference Annotation Table, MetaGeneHunt provides an unprecedented way to (re)investigate the precise distribution of any protein domain in short-read metagenomes.
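The length normalization described above can be sketched as follows. This is not MetaGeneHunt's implementation: the function name, the "hits per 1,000 residues" scaling, and the hit counts and domain lengths in the example are all hypothetical, chosen only to show why raw counts of domains of different lengths are not directly comparable.

```python
def normalize_by_domain_length(hit_counts, domain_lengths, per=1000):
    """Scale each domain's hit count to hits per `per` residues of domain."""
    normalized = {}
    for domain, hits in hit_counts.items():
        length = domain_lengths[domain]
        if length <= 0:
            raise ValueError(f"domain length for {domain} must be positive")
        normalized[domain] = hits * per / length
    return normalized


# Hypothetical numbers: the longer domain collects twice as many raw hits.
hit_counts = {"GH13": 600, "GH5": 300}
domain_lengths = {"GH13": 300, "GH5": 150}  # residues, made up for the example
print(normalize_by_domain_length(hit_counts, domain_lengths))
```

In this toy case the twofold difference in raw hits disappears after normalization, because the longer domain simply offers twice as many positions for a short read to match.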


Subject(s)
Bacteria/genetics , Metagenome/genetics , Protein Domains/genetics , Software , Animals , Computational Biology , Databases, Genetic , Genome, Microbial/genetics , Metagenomics/methods , Mice , Microbiota/genetics , Molecular Sequence Annotation , Sequence Analysis
15.
BMC Genomics ; 21(1): 334, 2020 Apr 29.
Article in English | MEDLINE | ID: mdl-32349659

ABSTRACT

BACKGROUND: The rnpB gene encodes an essential catalytic RNA (RNase P). Like other essential RNAs, RNase P's sequence is highly variable. However, unlike other essential RNAs (e.g., tRNA, 16S, 6S), its structure is also variable, with at least 5 distinct structure types observed in prokaryotes. This structural variability makes it labor-intensive and challenging to create and maintain covariance models for the detection of RNase P RNA in genomic and metagenomic sequences. The lack of a facile and rapid annotation algorithm has led to the rnpB gene being the most grossly under-annotated essential gene in completed prokaryotic genomes, with only a 24% annotation rate. Here we describe the coupling of the largest RNase P RNA database with the local alignment scoring algorithm to create the most sensitive and rapid prokaryote rnpB gene identification and annotation algorithm to date. RESULTS: Of the 2772 completed microbial genomes downloaded from GenBank, only 665 genomes had an annotated rnpB gene. We applied P Finder to these genomes and were able to identify an rnpB gene in 2733, or nearly 99%, of the 2772 microbial genomes examined. From these results, four new rnpB genes that encode the minimal T-type RNase P RNAs were identified computationally for the first time. In addition, only the second C-type RNase P RNA was identified in Sphaerobacter thermophilus. Of special note, no RNase P RNAs were detected in several obligate endosymbionts of sap-sucking insects, suggesting a novel evolutionary adaptation. CONCLUSIONS: The coupling of the largest RNase P RNA database and associated structure class identification with the P Finder algorithm is both sensitive and rapid, yielding high-quality results to aid researchers annotating either genomic or metagenomic data. It is the only algorithm to date that can identify challenging RNase P classes such as C-type and the minimal T-type RNase P RNAs. 
P Finder is written in C# and has a user-friendly GUI that can run on multiple 64-bit Windows platforms (Windows Vista/7/8/10). P Finder is freely available for download at https://github.com/JChristopherEllis/P-Finder, together with a small sample RNase P RNA file for testing.
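For readers unfamiliar with local alignment scoring, the sketch below computes a Smith-Waterman style best local score between a query and a target sequence. It is a generic textbook formulation written in Python, not P Finder's C# code; the scoring values (match 2, mismatch -1, linear gap -2) and the example sequences are assumptions made for illustration.

```python
def local_alignment_score(query: str, target: str,
                          match: int = 2, mismatch: int = -1,
                          gap: int = -2) -> int:
    """Best local alignment score, keeping only two rows of the score matrix."""
    previous = [0] * (len(target) + 1)
    best = 0
    for q in query:
        current = [0]
        for j, t in enumerate(target, start=1):
            diagonal = previous[j - 1] + (match if q == t else mismatch)
            # Scores never drop below zero, which is what makes the
            # alignment local rather than global.
            score = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            current.append(score)
            best = max(best, score)
        previous = current
    return best


# Made-up example: the query occurs exactly inside the target.
print(local_alignment_score("GGCUAGC", "AAGGCUAGCAA"))
```

Scoring a genome against every entry of a reference database with a function of this kind, and keeping the highest-scoring hit, is one plausible reading of how a database can be coupled with local alignment scoring; the abstract does not give the details.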


Subject(s)
Genes, Microbial , Genomics/methods , Ribonuclease P/genetics , Algorithms , Chloroflexi/enzymology , Chloroflexi/genetics , Databases, Genetic , Genome, Microbial/genetics , Metagenomics/methods , Nucleic Acid Conformation , Prokaryotic Cells/enzymology , RNA, Catalytic/chemistry , RNA, Catalytic/classification , RNA, Catalytic/genetics , Ribonuclease P/chemistry , Ribonuclease P/classification , Software
16.
Arch Microbiol ; 202(1): 31-41, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31456050

ABSTRACT

Anaerobic digestion, a recently popular technology for producing biogas from wastewater, especially methane for biofuel, is considered an effective solution to the energy crisis and the global pollution threat. A complex microbial population is present in sludge and plays an important role in the digestion of complex polymers into simple monomers. 16S rRNA amplification-based approaches alone are not sufficient because the population is extremely complex. However, Illumina sequencing is a recent and powerful technology for revealing the entire microbiome structure and the methane generation pathways in anaerobic digestion. Metagenomic sequencing was used to reveal the microbial structure of digested sludge from a local wastewater treatment plant in Beijing. The Illumina HiSeq platform was used to generate about 5 GB of data for metagenomic analysis. Classification using the MG-RAST server showed that bacteria were dominant at about 97.64%, while 1.78% of sequences were assigned to archaea. The most abundant bacterial communities were Actinobacteria, Bacteroidetes, Firmicutes and Proteobacteria. Furthermore, the key microbiome members involved in methane generation were revealed. The dominant methanogens detected were Methanosaeta and Methanosarcina, along with the dominant genes involved in acetoclastic methanogenesis in digesting sludge. The metagenomic analysis showed that the microbial structure and methane generation pathways in an anaerobic digester were successfully dissected.


Subject(s)
Biofuels/microbiology , Genome, Microbial/genetics , Metagenome/genetics , Methane/metabolism , Sewage/microbiology , Anaerobiosis , Archaea/genetics , Bacteria/genetics , High-Throughput Nucleotide Sequencing , RNA, Ribosomal, 16S/genetics , Sequence Analysis, DNA
17.
Nucleic Acids Res ; 47(10): e57, 2019 06 04.
Article in English | MEDLINE | ID: mdl-30838416

ABSTRACT

Shotgun metagenomics is a powerful, high-resolution technique enabling the study of microbial communities in situ. However, species-level resolution is only achieved after a process of 'binning' where contigs predicted to originate from the same genome are clustered. Such culture-independent sequencing frequently unearths novel microbes, and so various methods have been devised for reference-free binning. As novel microbiomes of increasing complexity are explored, sometimes associated with non-model hosts, robust automated binning methods are required. Existing methods struggle with eukaryotic contamination and cannot handle highly complex single metagenomes. We therefore developed an automated binning pipeline, termed 'Autometa', to address these issues. This command-line application integrates sequence homology, nucleotide composition, coverage and the presence of single-copy marker genes to separate microbial genomes from non-model host genomes and other eukaryotic contaminants, before deconvoluting individual genomes from single metagenomes. The method is able to effectively separate over 1000 genomes from a metagenome, allowing the study of previously intractably complex environments at the level of single species. Autometa is freely available at https://bitbucket.org/jason_c_kwan/autometa and as a docker image at https://hub.docker.com/r/jasonkwan/autometa under the GNU Affero General Public License 3 (AGPL 3).
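Nucleotide composition, one of the signals the abstract lists for binning, is commonly summarized as a k-mer frequency vector per contig. The sketch below shows that idea only; it is not Autometa's code, and the function name, the default k = 4, and the example contig are assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence: str, k: int = 4) -> dict:
    """Relative frequency of every A/C/G/T k-mer in a contig.

    Windows containing other characters (for example N) are ignored.
    """
    sequence = sequence.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts[kmer] for kmer in kmers)
    if total == 0:
        raise ValueError("sequence contains no valid k-mers")
    return {kmer: counts[kmer] / total for kmer in kmers}


# Made-up contig; with k=2 the vector has 4**2 = 16 entries.
freqs = kmer_frequencies("ACGTACGT", k=2)
print(freqs["AC"], len(freqs))
```

Contigs from the same genome tend to have similar composition vectors, so vectors like these, combined with coverage, can be clustered to group contigs into candidate genome bins.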


Subject(s)
Algorithms , Computational Biology/methods , Genome, Microbial/genetics , Metagenome/genetics , Metagenomics/methods , Animals , Bacteria/classification , Bacteria/genetics , Cluster Analysis , Genome, Bacterial/genetics , Humans , Internet , Reproducibility of Results
18.
Appl Microbiol Biotechnol ; 103(8): 3277-3287, 2019 Apr.
Article in English | MEDLINE | ID: mdl-30859257

ABSTRACT

Secondary metabolites (SM) produced by fungi and bacteria have long been of exceptional interest owing to their unique biomedical ramifications. The traditional discovery of new natural products, which was mainly driven by bioactivity screening, has now been given a fresh approach in the form of genome mining. Several bioinformatics tools have been continuously developed to detect potential biosynthetic gene clusters (BGCs) that are responsible for the production of SM. Although the principles underlying the computation of these tools have been discussed, the biological background has remained underrated and ambiguous. In this review, we emphasize the biological hypotheses of BGC formation derived from observations across bacterial and fungal genomes, and provide a comprehensive list of updated algorithms/tools exclusively for BGC detection. Our review points to a direction in which the biological hypotheses should be systematically incorporated into BGC prediction and assist the prioritization of candidate BGCs.


Subject(s)
Bacteria/genetics , Computational Biology , Fungi/genetics , Multigene Family/genetics , Secondary Metabolism/genetics , Bacteria/chemistry , Bacteria/metabolism , Biological Products/metabolism , Drug Resistance, Microbial/genetics , Fungi/chemistry , Fungi/metabolism , Gene Duplication , Gene Transfer, Horizontal , Genome, Microbial/genetics
19.
J Microbiol Methods ; 155: 65-69, 2018 12.
Article in English | MEDLINE | ID: mdl-30452938

ABSTRACT

Although second-generation biofuel technology is a sustainable route for bioethanol production, it is not currently a robust technology because of certain hindrances, viz., unavailability of potential enzyme resources, low efficiency of enzymes and restricted availability of potent enzymes that work under harsh conditions in industrial processes. Therefore, bioprospecting of extremophilic microorganisms using metagenomics is a promising alternative to discover novel microbes and enzymes with efficient tolerance to unfavourable conditions and thus could revolutionize the energy sector. Metagenomics, a recent field in "omics" technology, enables the genomic study of uncultured microorganisms with the goal of better understanding microbial dynamics. Metagenomics in conjunction with NextGen Sequencing technology facilitates the sequencing of microbial DNA directly from environmental samples and has expanded and transformed our knowledge of the microbial world. However, filtering the meaningful information from the millions of genomic sequences poses a serious challenge to bioinformaticians. The current review presents the 'know-how' needed to unravel the secrets of nature while expediting the bio-industrial world. We also discuss the novel biocatalytic agents discovered through metagenomics and how bioengineering plays a pivotal role in enhancing their efficiency.


Subject(s)
Enzymes , Extremophiles/enzymology , Extremophiles/genetics , Genome, Microbial/genetics , Metagenomics/methods , Bioengineering/methods , Biofuels , Biotechnology , Enzyme Activation , Enzymes/genetics , Enzymes/metabolism , Genomics , Microbiological Techniques/methods , Microbiota/genetics
20.
J Clin Microbiol ; 56(11)2018 11.
Article in English | MEDLINE | ID: mdl-30135232

ABSTRACT

The rapid development of sequencing technologies has led to an explosion of pathogen sequence data, which are increasingly collected as part of routine surveillance or clinical diagnostics. In public health, sequence data are used to reconstruct the evolution of pathogens, to anticipate future spread, and to target interventions. In clinical settings, whole-genome sequencing can identify pathogens at the strain level, can be used to predict phenotypes such as drug resistance and virulence, and can inform treatment by linking closely related cases. While sequencing has become cheaper, the analysis of sequence data has become an important bottleneck. Deriving interpretable and actionable results for a large variety of pathogens, each with its own complexity, from continuously updated data is a daunting task that requires flexible bioinformatic workflows and dissemination platforms. Here, we review recent developments in real-time analyses of pathogen sequence data, with a particular focus on the visualization and integration of sequence and phenotype data.


Subject(s)
Data Visualization , Genome, Microbial/genetics , Sequence Analysis, DNA , Computational Biology , Databases, Genetic , Humans , Infections/diagnosis , Infections/epidemiology , Infections/microbiology , Infections/virology , Molecular Epidemiology , Phylogeny , Software