Results 1 - 14 of 14
1.
BMC Bioinformatics ; 20(1): 561, 2019 Nov 08.
Article in English | MEDLINE | ID: mdl-31703549

ABSTRACT

BACKGROUND: The MG-RAST API provides search capabilities and delivers organism and function data as well as raw or annotated sequence data via the web interface and its RESTful API. For casual users, however, RESTful APIs are hard to learn and work with. RESULTS: We created the graphical MG-RAST API explorer to help researchers more easily build and export API queries; understand the data abstractions and indices available in MG-RAST; and use the results presented in-browser for exploration, development, and debugging. CONCLUSIONS: The API explorer lowers the barrier to entry for occasional or first-time MG-RAST API users.
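
The idea of building and exporting an API query can be sketched in a few lines of Python. This is a minimal illustration, not the API explorer's own code: the base URL, the `metagenome` resource name, and the `limit` and `verbosity` parameters are assumptions chosen for the example and should be checked against the current MG-RAST API documentation.

```python
from urllib.parse import urlencode

# Assumed base URL for the RESTful API; verify before use.
API_BASE = "https://api.mg-rast.org"


def build_query(resource, **params):
    """Return a GET URL for a resource with sorted, URL-encoded parameters."""
    query = urlencode(sorted(params.items()))
    url = f"{API_BASE}/{resource}"
    return f"{url}?{query}" if query else url


def to_curl(url):
    """Export the query as a shell command that can be pasted into a terminal."""
    return f'curl -s "{url}"'


if __name__ == "__main__":
    # Hypothetical resource and parameter names, for illustration only.
    url = build_query("metagenome", limit=5, verbosity="minimal")
    print(url)
    print(to_curl(url))
```

Sorting the parameters keeps the exported URL stable, which makes queries easier to compare and share.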


Subject(s)
Search Engine , Software , User-Computer Interface , Archaea/genetics , Base Sequence , Databases, Genetic , Internet
2.
Brief Bioinform ; 20(4): 1151-1159, 2019 Jul 19.
Article in English | MEDLINE | ID: mdl-29028869

ABSTRACT

As technologies change, MG-RAST is adapting. Newly available software is being included to improve accuracy and performance. As a computational service constantly running large volume scientific workflows, MG-RAST is the right location to perform benchmarking and implement algorithmic or platform improvements, in many cases involving trade-offs between specificity, sensitivity and run-time cost. The work in [Glass EM, Dribinsky Y, Yilmaz P, et al. ISME J 2014;8:1-3] is an example; we use existing well-studied data sets as gold standards representing different environments and different technologies to evaluate any changes to the pipeline. Currently, we use well-understood data sets in MG-RAST as a platform for benchmarking. The use of artificial data sets for pipeline performance optimization has not added value, as these data sets do not present the same challenges as real-world data sets. In addition, the MG-RAST team welcomes suggestions for improvements of the workflow. We are currently working on versions 4.02 and 4.1, both of which contain significant input from the community and our partners that will enable double barcoding, stronger inferences supported by longer-read technologies, and will increase throughput while maintaining sensitivity by using Diamond and SortMeRNA. On the technical platform side, the MG-RAST team intends to support the Common Workflow Language as a standard to specify bioinformatics workflows, both to facilitate development and to enable efficient high-performance implementation of the community's data analysis tasks.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Metagenome , Metagenomics/methods , Software , Algorithms , Budgets , Computational Biology/methods , High-Throughput Nucleotide Sequencing/economics , High-Throughput Nucleotide Sequencing/statistics & numerical data , Internet , Metagenomics/economics , Metagenomics/statistics & numerical data , Sequence Analysis, DNA/economics , Sequence Analysis, DNA/methods , Sequence Analysis, DNA/statistics & numerical data , User-Computer Interface , Workflow
3.
Nucleic Acids Res ; 44(D1): D590-4, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26656948

ABSTRACT

MG-RAST (http://metagenomics.anl.gov) is an open-submission data portal for processing, analyzing, sharing and disseminating metagenomic datasets. The system currently hosts over 200,000 datasets and is continuously updated. The volume of submissions has increased 4-fold over the past 24 months, now averaging 4 terabasepairs per month. In addition to several new features, we report changes to the analysis workflow and the technologies used to scale the pipeline up to the required throughput levels. To show possible uses for the data from MG-RAST, we present several examples integrating data and analyses from MG-RAST into popular third-party analysis tools or sequence alignment tools.


Subject(s)
Databases, Nucleic Acid , Metagenomics , Internet , Sequence Alignment
4.
Environ Microbiol ; 16(11): 3443-62, 2014 Nov.
Article in English | MEDLINE | ID: mdl-24628880

ABSTRACT

We reconstructed the complete 2.4 Mb-long genome of a previously uncultivated epsilonproteobacterium, Candidatus Sulfuricurvum sp. RIFRC-1, via assembly of short-read shotgun metagenomic data using a complexity reduction approach. Genome-based comparisons indicate the bacterium is a novel species within the Sulfuricurvum genus, which contains one cultivated representative, S. kujiense. Divergence between the species appears due in part to extensive genomic rearrangements, gene loss and chromosomal versus plasmid encoding of certain (respiratory) genes by RIFRC-1. Deoxyribonucleic acid for the genome was obtained from terrestrial aquifer sediment, in which RIFRC-1 comprised ~47% of the bacterial community. Genomic evidence suggests RIFRC-1 is a chemolithoautotrophic diazotroph capable of deriving energy for growth by microaerobic or nitrate-/nitric oxide-dependent oxidation of S⁰, sulfide or sulfite or H2 oxidation. Carbon may be fixed via the reductive tricarboxylic acid cycle. Consistent with these physiological attributes, the local aquifer was microoxic with small concentrations of available nitrate, small but elevated concentrations of reduced sulfur and NH4+/NH3-limited. Additionally, various mechanisms for heavy metal and metalloid tolerance and virulence point to a lifestyle well-adapted for metal(loid)-rich environments and a shared evolutionary past with pathogenic Epsilonproteobacteria. Results expand upon recent findings highlighting the potential importance of sulfur and hydrogen metabolism in the terrestrial subsurface.


Subject(s)
Epsilonproteobacteria/genetics , Genome, Bacterial , Groundwater/microbiology , Base Sequence , Carbon/metabolism , Geologic Sediments/chemistry , Groundwater/chemistry , Hydrogen/metabolism , Metagenome , Metagenomics , Oxidation-Reduction , Plasmids/genetics , Sulfur/metabolism
5.
Methods Enzymol ; 531: 487-523, 2013.
Article in English | MEDLINE | ID: mdl-24060134

ABSTRACT

The democratized world of sequencing is leading to numerous data analysis challenges; MG-RAST addresses many of these challenges for diverse datasets, including amplicon datasets, shotgun metagenomes, and metatranscriptomes. The changes from version 2 to version 3 include the addition of a dedicated gene calling stage using FragGeneScan, clustering of predicted proteins at 90% identity, and the use of BLAT for the computation of similarities. Together with changes in the underlying software infrastructure, this has enabled the dramatic scaling up of pipeline throughput while remaining on a limited hardware budget. The Web-based service allows upload, fully automated analysis, and visualization of results. As a result of the plummeting cost of sequencing and the readily available analytical power of MG-RAST, over 78,000 metagenomic datasets have been analyzed, with over 12,000 of them publicly available in MG-RAST.


Subject(s)
Computational Biology/methods , Metagenomics , Software , Bacteria/classification , Bacteria/genetics , Genome, Bacterial , High-Throughput Nucleotide Sequencing , Internet
6.
BMC Genomics ; 14: 537, 2013 Aug 08.
Article in English | MEDLINE | ID: mdl-23924250

ABSTRACT

BACKGROUND: The numerous classes of repeats often impede the assembly of genome sequences from the short reads provided by new sequencing technologies. We demonstrate a simple and rapid means to ascertain the repeat structure and total size of a bacterial or archaeal genome without the need for assembly by directly analyzing the abundances of distinct k-mers among reads. RESULTS: The sensitivity of this procedure to resolve variation within a bacterial species is demonstrated: genome sizes and repeat structure of five environmental strains of E. coli from short Illumina reads were estimated by this method, and total genome sizes corresponded well with those obtained for the same strains by pulsed-field gel electrophoresis. In addition, this approach was applied to read-sets for completed genomes and shown to be accurate over a wide range of microbial genome sizes. CONCLUSIONS: Application of these procedures, based solely on k-mer abundances in short read data sets, allows aspects of genome structure to be resolved that are not apparent from conventional short read assemblies. This knowledge of the repetitive content of genomes provides insights into genome evolution and diversity.
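
The general principle of estimating genome size from k-mer abundances can be illustrated with a short Python sketch. It is a simplified textbook-style estimate (total k-mer occurrences divided by the modal k-mer abundance), not the authors' procedure; the function names, the `min_abundance` cutoff for discarding likely error k-mers, and the single-strand counting are all assumptions made for the example.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurrence across all reads (forward strand only)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def abundance_histogram(counts):
    """Map each abundance value to the number of distinct k-mers having it."""
    return Counter(counts.values())


def estimate_genome_size(counts, min_abundance=2):
    """Estimate genome size as total k-mer occurrences / modal abundance.

    k-mers seen fewer than min_abundance times are treated as probable
    sequencing errors and ignored (an assumption of this sketch).
    """
    hist = abundance_histogram(counts)
    solid = {a: n for a, n in hist.items() if a >= min_abundance}
    if not solid:
        raise ValueError("no k-mers at or above min_abundance")
    # The modal abundance approximates the coverage of single-copy sequence.
    peak = max(solid, key=lambda a: solid[a])
    total = sum(a * n for a, n in solid.items())
    return total / peak
```

In this scheme a k-mer observed at twice the modal abundance contributes twice to the total, so repeated sequence is counted according to its copy number rather than collapsed as it would be in an assembly.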


Subject(s)
Escherichia coli/genetics , Genome Size , Genomics , Repetitive Sequences, Nucleic Acid/genetics , Sequence Analysis , Gene Dosage/genetics , Genome, Bacterial/genetics , Time Factors
7.
J Bacteriol ; 194(24): 6986-7, 2012 Dec.
Article in English | MEDLINE | ID: mdl-23209236

ABSTRACT

Agrobacterium albertimagni strain AOL15 is an alphaproteobacterium isolated from arsenite-oxidizing biofilms whose draft genome contains 5.1 Mb in 55 contigs with 61.2% GC content and includes a 21-gene arsenic gene island. This is the first available genome for this species and the second Agrobacterium arsenic gene island.


Subject(s)
Agrobacterium/genetics , Genome, Bacterial , Arsenites/metabolism , Base Composition , Base Sequence , Biofilms , DNA, Bacterial/genetics , Genomic Islands , Molecular Sequence Data , RNA, Bacterial/genetics , Sequence Alignment , Sequence Analysis, DNA , Soil Microbiology
8.
J Bacteriol ; 194(22): 6355, 2012 Nov.
Article in English | MEDLINE | ID: mdl-23105084

ABSTRACT

Achromobacter piechaudii strain HLE is a betaproteobacterium (previously known as Alcaligenes faecalis) that was an early isolate with arsenite oxidase activity. This draft genome of 6.89 Mb is the second available genome for this species in the opportunistic pathogen Alcaligenaceae family.


Subject(s)
Achromobacter/classification , Achromobacter/genetics , Genome, Bacterial , Molecular Sequence Data
9.
Genome Biol ; 13(9): 169, 2012.
Article in English | MEDLINE | ID: mdl-23013527

ABSTRACT

Meta-omics approaches such as metagenomics, metatranscriptomics and metaproteogenomics have the potential to improve our understanding of how the human microbiome affects digestive health and disease.

10.
PLoS One ; 7(8): e44326, 2012.
Article in English | MEDLINE | ID: mdl-22952955

ABSTRACT

The most feared complication following intestinal resection is anastomotic leakage. In high risk areas (esophagus/rectum) where neoadjuvant chemoradiation is used, the incidence of anastomotic leaks remains unacceptably high (≈ 10%) even when performed by specialist surgeons in high volume centers. The aims of this study were to test the hypothesis that anastomotic leakage develops when pathogens colonizing anastomotic sites become in vivo transformed to express a tissue destroying phenotype. We developed a novel model of anastomotic leak in which rats were exposed to pre-operative radiation as in cancer surgery, underwent distal colon resection and then were intestinally inoculated with Pseudomonas aeruginosa, a common colonizer of the radiated intestine. Results demonstrated that intestinal tissues exposed to preoperative radiation developed a significant incidence of anastomotic leak (>60%; p<0.01) when colonized by P. aeruginosa compared to radiated tissues alone (0%). Phenotype analysis comparing the original inoculating strain (MPAO1, termed P1) and the strain retrieved from leaking anastomotic tissues (termed P2) demonstrated that P2 was altered in pyocyanin production and displayed enhanced collagenase activity, high swarming motility, and a destructive phenotype against cultured intestinal epithelial cells (i.e. apoptosis, barrier function, cytolysis). Comparative genotype analysis between P1 and P2 revealed a single nucleotide polymorphism (SNP) mutation in the mexT gene that led to a stop codon resulting in a non-functional truncated protein. Replacement of the mutated mexT gene in P2 with mexT from the original parental strain P1 led to reversion of P2 to the P1 phenotype. No spontaneous transformation was detected during 20 passages in TSB media. Use of a novel virulence suppressing compound PEG/Pi prevented P. aeruginosa transformation to the tissue destructive phenotype and prevented anastomotic leak in rats. This work demonstrates that in vivo transformation of microbial pathogens to a tissue destroying phenotype may have important implications in the pathogenesis of anastomotic leak.


Subject(s)
Anastomotic Leak/microbiology , Intestines/microbiology , Mutation/genetics , Polymorphism, Single Nucleotide/genetics , Pseudomonas aeruginosa/genetics , Pseudomonas aeruginosa/pathogenicity , Anastomosis, Surgical/adverse effects , Anastomotic Leak/pathology , Animals , Apoptosis/drug effects , Base Sequence , Caenorhabditis elegans , Colon/drug effects , Colon/metabolism , Colon/pathology , Intestines/drug effects , Intestines/pathology , Intestines/ultrastructure , Male , Molecular Sequence Data , Phenotype , Phosphates/pharmacology , Polyethylene Glycols/pharmacology , Protective Agents/pharmacology , Pseudomonas aeruginosa/drug effects , Pseudomonas aeruginosa/isolation & purification , Radiation , Rats , Rats, Wistar , Tight Junctions/drug effects , Tight Junctions/metabolism , Wound Healing/drug effects , Zonula Occludens-1 Protein/metabolism
11.
J Bacteriol ; 194(18): 5153, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22933773

ABSTRACT

Alcaligenes faecalis subsp. faecalis NCIB 8687, the betaproteobacterium from which arsenite oxidase had its structure solved and the first "arsenate gene island" identified, provided a draft genome of 3.9 Mb in 186 contigs (with the largest 15 comprising 90% of the total) for this opportunistic pathogen species.


Subject(s)
Alcaligenes faecalis/genetics , DNA, Bacterial/chemistry , DNA, Bacterial/genetics , Genome, Bacterial , Sequence Analysis, DNA , Alcaligenes faecalis/isolation & purification , Molecular Sequence Data
12.
BMC Bioinformatics ; 13: 183, 2012 Jul 28.
Article in English | MEDLINE | ID: mdl-22839106

ABSTRACT

BACKGROUND: Gene prediction algorithms (or gene callers) are an essential tool for analyzing shotgun nucleic acid sequence data. Gene prediction is a ubiquitous step in sequence analysis pipelines; it reduces the volume of data by identifying the most likely reading frame for a fragment, permitting the out-of-frame translations to be ignored. In this study we evaluate five widely used ab initio gene-calling algorithms (FragGeneScan, MetaGeneAnnotator, MetaGeneMark, Orphelia, and Prodigal) for accuracy on short (75-1000 bp) fragments containing sequence error from previously published artificial data and "real" metagenomic datasets. RESULTS: While gene prediction tools have similar accuracies predicting genes on error-free fragments, in the presence of sequencing errors considerable differences between tools become evident. For error-containing short reads, FragGeneScan finds more prokaryotic coding regions than do MetaGeneAnnotator, MetaGeneMark, Orphelia, or Prodigal. This improved detection of genes in error-containing fragments, however, comes at the cost of much lower (50%) specificity and overprediction of genes in noncoding regions. CONCLUSIONS: Ab initio gene callers offer a significant reduction in the computational burden of annotating individual nucleic acid reads and are used in many metagenomic annotation systems. For predicting reading frames on raw reads, we find the hidden Markov model approach in FragGeneScan is more sensitive than other gene prediction tools, while Prodigal, MetaGeneAnnotator (MGA), and MetaGeneMark (MGM) are better suited for higher-quality sequences such as assembled contigs.
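
To make the notion of "the most likely reading frame for a fragment" concrete, the Python sketch below scores all six frames by their longest stop-free run of codons and returns the best one. This naive heuristic is an assumption for illustration only; it is not how any of the five evaluated tools works (FragGeneScan, for instance, uses a hidden Markov model), and it assumes the standard stop codons and unambiguous A/C/G/T input.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_open_run(seq, offset):
    """Length, in codons, of the longest stop-free run in one frame."""
    best = run = 0
    for i in range(offset, len(seq) - 2, 3):
        if seq[i:i + 3] in STOPS:
            run = 0
        else:
            run += 1
            best = max(best, run)
    return best


def most_likely_frame(fragment):
    """Return (strand, offset) of the frame with the longest stop-free run.

    Ties are resolved in favor of the first frame examined.
    """
    frames = []
    for strand, seq in (("+", fragment), ("-", reverse_complement(fragment))):
        for offset in range(3):
            frames.append((longest_open_run(seq, offset), strand, offset))
    best = max(frames, key=lambda frame: frame[0])
    return best[1], best[2]
```

Once a single frame is selected, only that translation needs to be searched against a protein database, which is the data reduction the abstract describes.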


Subject(s)
Metagenomics/methods , Molecular Sequence Annotation/methods , Reading Frames , Sequence Analysis, DNA/methods , Algorithms , Base Sequence
13.
PLoS Comput Biol ; 8(6): e1002541, 2012.
Article in English | MEDLINE | ID: mdl-22685393

ABSTRACT

We provide a novel method, DRISEE (duplicate read inferred sequencing error estimation), to assess sequencing quality (alternatively referred to as "noise" or "error") within and/or between sequencing samples. DRISEE provides positional error estimates that can be used to inform read trimming within a sample. It also provides global (whole sample) error estimates that can be used to identify samples with high or varying levels of sequencing error that may confound downstream analyses, particularly in the case of studies that utilize data from multiple sequencing samples. For shotgun metagenomic data, we believe that DRISEE provides estimates of sequencing error that are more accurate and less constrained by technical limitations than existing methods that rely on reference genomes or the use of scores (e.g. Phred). Here, DRISEE is applied to (non amplicon) data sets from both the 454 and Illumina platforms. The DRISEE error estimate is obtained by analyzing sets of artifactual duplicate reads (ADRs), a known by-product of both sequencing platforms. We present DRISEE as an open-source, platform-independent method to assess sequencing error in shotgun metagenomic data, and utilize it to discover previously uncharacterized error in de novo sequence data from the 454 and Illumina sequencing platforms.
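
The core idea of estimating positional error from duplicate reads can be sketched as follows: reads sharing an identical prefix are binned as presumed duplicates, a per-position consensus is taken within each bin, and deviations from the consensus are counted as errors. This is a simplified illustration and not the published DRISEE implementation; the prefix length, the minimum bin size, and the function names are assumptions made for the example.

```python
from collections import Counter, defaultdict


def bin_by_prefix(reads, prefix_len):
    """Group reads that share an identical prefix (presumed duplicates)."""
    bins = defaultdict(list)
    for read in reads:
        if len(read) >= prefix_len:
            bins[read[:prefix_len]].append(read)
    return bins


def positional_error(reads, prefix_len=20, min_bin_size=3):
    """Per-position mismatch rate against each bin's consensus base.

    Only positions after the shared prefix are scored, and each bin is
    truncated to the length of its shortest read.
    """
    mismatches = Counter()
    observed = Counter()
    for members in bin_by_prefix(reads, prefix_len).values():
        if len(members) < min_bin_size:
            continue
        length = min(len(read) for read in members)
        for pos in range(prefix_len, length):
            column = Counter(read[pos] for read in members)
            consensus_count = column.most_common(1)[0][1]
            mismatches[pos] += len(members) - consensus_count
            observed[pos] += len(members)
    return {pos: mismatches[pos] / observed[pos] for pos in sorted(observed)}
```

The resulting per-position rates could inform where to trim reads, and averaging them would give a whole-sample figure for comparing samples.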


Subject(s)
Metagenomics/statistics & numerical data , Sequence Analysis/statistics & numerical data , Computational Biology , Data Interpretation, Statistical , Genomics/statistics & numerical data , High-Throughput Nucleotide Sequencing/statistics & numerical data , Humans
14.
J Bacteriol ; 194(7): 1835-6, 2012 Apr.
Article in English | MEDLINE | ID: mdl-22408239

ABSTRACT

Halomonas strain GFAJ-1 was reported in Science magazine to be a remarkable microbe for which there was "arsenate in macromolecules that normally contain phosphate, most notably nucleic acids." The draft genome of the bacterium was determined (NCBI accession numbers AHBC01000001 through AHBC01000103). It appears to be a typical gammaproteobacterium.


Subject(s)
Genome, Bacterial , Halomonas/genetics , Base Sequence , Molecular Sequence Data