Results 1 - 5 of 5
1.
J Bioinform Comput Biol ; 6(6): 1193-211, 2008 Dec.
Article in English | MEDLINE | ID: mdl-19090024

ABSTRACT

Short-insert shotgun sequencing approaches have been applied in recent years to environmental genomic libraries. In the case of complex multispecies microbial communities, there can be many sequence reads that are not incorporated into assemblies, and thus need to be annotated and accessible as single reads. Most existing annotation systems and genome databases accommodate assembled genomes containing contiguous gene-encoding sequences. Thus, a solution is required that can work effectively with environmental genomic annotation information to facilitate data analysis. The Environmental Genome Informational Utility System (EnGenIUS) is a comprehensive environmental genome (metagenome) research toolset that was specifically designed to accommodate the needs of large (>250 K sequence reads) environmental genome sequencing efforts. The core EnGenIUS modules consist of a set of UNIX scripts and PHP programs used for data preprocessing, an annotation pipeline with accompanying analysis tools, two entity-relational databases, and a graphical user interface. The annotation pipeline has a modular structure and can be customized to best fit input data set properties. The integrated entity-relational databases store raw data and annotation analysis results. Access to the underlying databases and services is facilitated through a web-based graphical user interface. Users have the ability to browse, upload, download, and analyze preprocessed data, based on diverse search criteria. The EnGenIUS toolset was successfully tested using the Alvinella pompejana epibiont environmental genome data set, which comprises more than 300 K sequence reads. A fully browsable EnGenIUS portal is available at http://ocean.dbi.udel.edu/ (access code: "guest"). The scope of this paper covers the implementation details and technical aspects of the EnGenIUS toolset.


Subject(s)
Environmental Microbiology , Genetics, Microbial/statistics & numerical data , Software , Computational Biology , Databases, Genetic/statistics & numerical data , Genomic Library , User-Computer Interface
2.
Proc Natl Acad Sci U S A ; 105(45): 17516-21, 2008 Nov 11.
Article in English | MEDLINE | ID: mdl-18987310

ABSTRACT

Hydrothermal vent ecosystems support diverse life forms, many of which rely on symbiotic associations to perform functions integral to survival in these extreme physicochemical environments. Epsilonproteobacteria, found free-living and in intimate associations with vent invertebrates, are the predominant vent-associated microorganisms. The vent-associated polychaete worm, Alvinella pompejana, is host to a visibly dense fleece of episymbionts on its dorsal surface. The episymbionts are a multispecies consortium of Epsilonproteobacteria present as a biofilm. We unraveled details of these enigmatic, uncultivated episymbionts using environmental genome sequencing. They harbor wide-ranging adaptive traits that include high levels of strain variability analogous to Epsilonproteobacteria pathogens such as Helicobacter pylori, metabolic diversity of free-living bacteria, and numerous orthologs of proteins that we hypothesize are each optimally adapted to specific temperature ranges within the 10-65 °C fluctuations characteristic of the A. pompejana habitat. This strategic combination enables the consortium to thrive under diverse thermal and chemical regimes. The episymbionts are metabolically tuned for growth in hydrothermal vent ecosystems with genes encoding the complete rTCA cycle, sulfur oxidation, and denitrification; in addition, the episymbiont metagenome also encodes capacity for heterotrophic and aerobic metabolisms. Analysis of the environmental genome suggests that A. pompejana may benefit from the episymbionts serving as a stable source of food and vitamins. The success of Epsilonproteobacteria as episymbionts in hydrothermal vent ecosystems is a product of adaptive capabilities, broad metabolic capacity, strain variance, and virulent traits in common with pathogens.


Subject(s)
Adaptation, Biological/physiology , Energy Metabolism/physiology , Epsilonproteobacteria/genetics , Genomics/methods , Models, Molecular , Polychaeta/microbiology , Symbiosis , Temperature , Animals , Base Sequence , Cluster Analysis , Models, Biological , Molecular Sequence Data , Pacific Ocean , RNA, Ribosomal/genetics , Sequence Analysis, DNA , Species Specificity
3.
Am J Physiol Regul Integr Comp Physiol ; 295(1): R15-27, 2008 Jul.
Article in English | MEDLINE | ID: mdl-18434436

ABSTRACT

Baroreceptor afferents project to the cardiovascular region of the nucleus tractus solitarius (cvNTS), and their cvNTS target neurons may play a role in governing the sensitivity and operating range of the arterial baroreceptor reflex (baroreflex). Recent studies have shown differential gene and protein expression in the cvNTS in response to changed arterial pressure. However, the extent of these responses is unknown. Therefore, we collected differential global gene expression data in a time series following acute hypertension in awake, freely moving rats. To acquire statistically significant results and place them in functional context, we overcame several quality control requirements and developed novel analytical approaches. The physiologically new findings from the study are that acute hypertension causes very extensive, time-varying gene regulatory changes, many involving neuronal function-specific genes and systems of genes. We used standard genomic analysis methods to manage the large data sets and to develop results such as heat maps to examine patterns and clusters in the gene regulation. We used the Gene Ontology categories to provide functional context. To place our findings in the context of the relevant literature, we developed two graphical representations of the networks implicated, linking receptors and channels to signaling pathways. The results point to the multivariate complexity of the response and implicate a group of receptors as candidates for mediating nucleus tractus solitarius baroreflex function in hypertension by identifying concurrent upregulation of receptor genes. We were able to make transcription factor binding predictions and record dysregulation of heart rate correlated with the transcriptional response.


Subject(s)
Gene Expression Profiling , Hypertension/metabolism , Solitary Nucleus/metabolism , Animals , Gene Expression Regulation/physiology , Male , Rats , Rats, Sprague-Dawley
4.
Bioinformatics ; 18(4): 634-6, 2002 Apr.
Article in English | MEDLINE | ID: mdl-12016062

ABSTRACT

SUMMARY: Tandem Repeat Occurrence Locator (TROLL) is a lightweight Simple Sequence Repeat (SSR) finder based on a slight modification of the Aho-Corasick algorithm. It is fast and only requires a standard Personal Computer (PC) to operate. We report running times of 127 s to find all SSRs of length 20 bp or more on the complete Arabidopsis genome (approx. 130 Mbases divided into five chromosomes) using a PC Athlon 650 MHz with 256 MB of RAM. AVAILABILITY: TROLL is an open source project and is available at http://finder.sourceforge.net.
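
To make the task described in this abstract concrete, the following is a minimal Python sketch of locating SSRs of at least 20 bp in a DNA string. It is not TROLL's method: the abstract names a modified Aho-Corasick algorithm but gives no details, so this sketch substitutes a plain period-based scan. The function names, the `max_motif_len` parameter, and its default of 6 are assumptions made for illustration only.

```python
def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_ssrs(seq, max_motif_len=6, min_length=20):
    """Return sorted (start, end, motif) tuples for tandem repeats.

    A repeat with period k is a maximal run in which every base equals
    the base k positions before it. `end` is exclusive, and the run may
    finish with a partial copy of the motif.
    NOTE: illustrative period scan, not the Aho-Corasick variant in TROLL.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for k in range(1, max_motif_len + 1):
        i = 0
        while i + k < n:
            # Extend the run while the period-k condition holds.
            j = i + k
            while j < n and seq[j] == seq[j - k]:
                j += 1
            motif = seq[i:i + k]
            # Require the length threshold and at least two motif copies;
            # skip motifs such as "AA" that a shorter period already covers.
            if j - i >= max(min_length, 2 * k) and is_primitive(motif):
                hits.append((i, j, motif))
            # Any run starting before j - k + 1 is contained in this one.
            i = j - k + 1
    return sorted(hits)


example = "GGC" + "AT" * 12 + "CCG" + "A" * 25 + "T"
for start, end, motif in find_ssrs(example):
    print(start, end, motif)
```

Each motif length is handled in a single left-to-right pass, so the cost grows with sequence length times `max_motif_len`; a multi-pattern automaton such as the one the abstract mentions would instead match all candidate motifs in one pass.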


Subject(s)
Algorithms , Genome, Plant , Microsatellite Repeats/genetics , Sequence Analysis, DNA/methods , Software , Arabidopsis/genetics , Pattern Recognition, Automated , Sensitivity and Specificity , Tandem Repeat Sequences/genetics , User-Computer Interface
5.
Article in English | MEDLINE | ID: mdl-15838144

ABSTRACT

With the development of microarray techniques, there is an increasing need for information processing methods to analyze the high-throughput data. Clustering is one of the most promising candidates because of its simplicity, flexibility and robustness. However, there is no "perfect" clustering approach outperforming its counterparts, and it is hard to evaluate and combine the results from different techniques, especially in a field without much prior knowledge, such as bioinformatics. This paper proposes a meta-clustering approach to extract the information from results of different clustering techniques, so that a better interpretation of the data distribution can be obtained. A special distance measure is defined to represent the statistical "signal" of each cluster produced by various clustering techniques. The algorithm is applied to both artificial and real data. Simulations show that the proposed approach is able to extract the information efficiently and accurately from the input clustering structure.
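
The general idea of clustering the clusters produced by several techniques can be sketched as below. The abstract does not specify its distance measure, so this example substitutes the Jaccard distance between cluster member sets, and it groups clusters by a simple threshold rule; both choices, along with the names `jaccard_distance`, `meta_cluster`, and `threshold`, are assumptions for illustration and should not be read as the paper's algorithm.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; 0.0 when both sets are empty."""
    a, b = set(a), set(b)
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def meta_cluster(clusterings, threshold=0.5):
    """Group similar clusters drawn from several clusterings.

    `clusterings` is a list of clusterings, each a list of member sets.
    Clusters whose Jaccard distance is <= threshold are linked, and the
    connected groups are returned (single-linkage style, via union-find).
    """
    # Pool every cluster from every input clustering.
    clusters = [frozenset(c) for clustering in clusterings for c in clustering]
    parent = list(range(len(clusters)))

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(len(clusters)), 2):
        if jaccard_distance(clusters[i], clusters[j]) <= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for idx, cluster in enumerate(clusters):
        groups.setdefault(find(idx), []).append(cluster)
    return list(groups.values())


# Two hypothetical clusterings of the same six items.
method_a = [{1, 2, 3}, {4, 5, 6}]
method_b = [{1, 2}, {3, 4, 5, 6}]
for group in meta_cluster([method_a, method_b]):
    print([sorted(c) for c in group])
```

In this toy input the two methods disagree only about item 3, so the sketch links each cluster from one method with its closest counterpart from the other, giving two meta-clusters.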


Subject(s)
Algorithms , Cluster Analysis , Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods , Pattern Recognition, Automated/methods , Sequence Analysis, DNA/methods , Sequence Alignment/methods