1.
BMC Syst Biol ; 4: 116, 2010 Aug 18.
Article in English | MEDLINE | ID: mdl-20718955

ABSTRACT

BACKGROUND: Although Escherichia coli is one of the best studied model organisms, a comprehensive understanding of its gene regulation has not yet been achieved. There exist many approaches to reconstruct regulatory interaction networks from gene expression experiments. Mutual information based approaches are most useful for large-scale network inference. RESULTS: We used a three-step approach in which we combined gene regulatory network inference based on directed information (DTI) and sequence analysis. DTI values were calculated on a set of gene expression profiles from 19 time course experiments extracted from the Many Microbes Microarray Database. Focusing on influences between pairs of genes in which one partner encodes a transcription factor (TF), we derived a network which contains 878 TF - gene interactions, of which 166 are known according to RegulonDB. Afterward, we selected a subset of 109 interactions that could be confirmed by the presence of a phylogenetically conserved binding site of the respective regulator. By this second step, the fraction of known interactions increased from 19% to 60%. In the last step, we checked the 44 of the 109 interactions not yet included in RegulonDB for functional relationships between the regulator and the target and, thus, obtained ten TF - target gene interactions. Five of them concern the regulator LexA and have already been reported in the literature. The remaining five influences describe regulations by Fis (with two novel targets), PdhR, PhoP, and KdgR. For the validation of our approach, one of them, the regulation of lipoate synthase (LipA) by the pyruvate-sensing pyruvate dehydrogenase repressor (PdhR), was experimentally checked and confirmed. CONCLUSIONS: We predicted a set of five novel TF - target gene interactions in E. coli. One of them, the regulation of lipA by the transcriptional regulator PdhR, was validated experimentally. Furthermore, we developed DTInfer, a new R-package for the inference of gene-regulatory networks from microarrays using directed information.
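The percentages quoted in the abstract can be checked from its own counts. The short Python sketch below is only an arithmetic illustration: it assumes that the known interactions in the filtered subset are the 109 minus the 44 not yet in RegulonDB, and the variable names are chosen here for readability, not taken from the paper or from DTInfer.

```python
# Counts quoted in the abstract.
total_predicted = 878     # TF - gene interactions in the DTI-derived network
known_in_network = 166    # of those, already listed in RegulonDB
with_binding_site = 109   # subset supported by a conserved binding site
novel_in_subset = 44      # of the subset, not yet in RegulonDB

# Assumption: every subset member that is not "novel" counts as known.
known_in_subset = with_binding_site - novel_in_subset

before = known_in_network / total_predicted
after = known_in_subset / with_binding_site

print(f"known fraction before filtering: {before:.1%}")
print(f"known fraction after filtering:  {after:.1%}")
```

Rounded to whole percentages, the two fractions agree with the 19% and 60% stated in the abstract.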


Subject(s)
Escherichia coli/genetics , Gene Regulatory Networks , Sequence Analysis, DNA , Systems Biology/methods , Base Sequence , Escherichia coli/metabolism , Escherichia coli Proteins/genetics , Escherichia coli Proteins/metabolism , Genomics , Reproducibility of Results
2.
Bioinformatics ; 25(19): 2603-4, 2009 Oct 01.
Article in English | MEDLINE | ID: mdl-19605418

ABSTRACT

MOTIVATION: DiProGB is an easy-to-use new genome browser that encodes the primary nucleotide sequence by thermodynamical and geometrical dinucleotide properties. The nucleotide sequence is thus converted into a sequence graph. This visualization, supported by different graph manipulation options, facilitates genome analyses, because the human brain can process visual information better than textual information. Also, DiProGB can identify genomic regions where certain physical properties are more conserved than the nucleotide sequence itself. Most of the DiProGB tools can be applied to both the primary nucleotide sequence and the sequence graph. They include motif and repeat searches as well as statistical analyses. DiProGB adds a new dimension to the common genome analysis approaches by taking into account the physical properties of DNA and RNA. AVAILABILITY AND IMPLEMENTATION: Source code and binaries are freely available for download at http://diprogb.fli-leibniz.de, implemented in C++ and supported on MS Windows and Linux (using e.g. WineHQ).
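The idea of turning a sequence into a numeric profile of dinucleotide properties can be sketched in a few lines of Python. This is not DiProGB's implementation (which is in C++), and the property used here, the number of G or C bases in each dinucleotide, is a simple computable stand-in for the thermodynamic and geometric properties the abstract refers to. The function names are hypothetical.

```python
def gc_count(dinucleotide):
    """Stand-in property: how many of the two bases are G or C."""
    return sum(base in "GC" for base in dinucleotide)


def encode_by_dinucleotide(sequence, prop=gc_count):
    """Map each overlapping dinucleotide to a property value.

    A sequence of length n yields n - 1 values, which can be plotted
    against position to give a simple sequence graph.
    """
    seq = sequence.upper()
    return [prop(seq[i:i + 2]) for i in range(len(seq) - 1)]


profile_a = encode_by_dinucleotide("ATGC")
profile_b = encode_by_dinucleotide("TACG")
print(profile_a, profile_b)
```

The two input sequences differ at every position yet produce the same profile, which illustrates how a property can be conserved where the nucleotide sequence is not.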


Subject(s)
Computational Biology/methods , Genome , Genomics/methods , Sequence Analysis, DNA/methods , Sequence Analysis, RNA/methods , Software , Base Sequence , Databases, Genetic , Internet , Nucleotides/chemistry , Sequence Alignment , User-Computer Interface
3.
Nucleic Acids Res ; 37(11): 3569-79, 2009 Jun.
Article in English | MEDLINE | ID: mdl-19359358

ABSTRACT

Alternative splicing (AS) involving NAGNAG tandem acceptors is an evolutionarily widespread class of AS. Recent predictions of alternative acceptor usage reported better results for acceptors separated by larger distances than for NAGNAGs. To improve the latter, we aimed at the use of Bayesian networks (BN), and extensive experimental validation of the predictions. Using carefully constructed training and test datasets, a balanced sensitivity and specificity of ≥92% was achieved. A BN trained on the combined dataset was then used to make predictions, and 81% (38/47) of the experimentally tested predictions were verified. Using a BN learned on human data on six other genomes, we show that while the performance for the vertebrate genomes matches that achieved on human data, there is a slight drop for Drosophila and worm. Lastly, using the prediction accuracy according to experimental validation, we estimate the number of yet undiscovered alternative NAGNAGs. State of the art classifiers can produce highly accurate prediction of AS at NAGNAGs, indicating that we have identified the major features of the 'NAGNAG-splicing code' within the splice site and its immediate neighborhood. Our results suggest that the mechanism behind NAGNAG AS is simple, stochastic, and conserved among vertebrates and beyond.


Subject(s)
Alternative Splicing , Genomics/methods , RNA Splice Sites , Tandem Repeat Sequences , Animals , Bayes Theorem , Caenorhabditis elegans/genetics , Databases, Nucleic Acid , Electrophoresis, Capillary , Humans , Mice , Rats , Reverse Transcriptase Polymerase Chain Reaction
4.
Nucleic Acids Res ; 37(Database issue): D37-40, 2009 Jan.
Article in English | MEDLINE | ID: mdl-18805906

ABSTRACT

DiProDB (http://diprodb.fli-leibniz.de) is a database of conformational and thermodynamic dinucleotide properties. It includes datasets both for DNA and RNA, as well as for single and double strands. The data have been shown to be important for understanding different aspects of nucleic acid structure and function, and they can also be used for encoding nucleic acid sequences. The database is intended to facilitate further applications of dinucleotide properties. A number of property datasets are highly correlated. Therefore, the database comes with a correlation analysis facility. Authors having determined new sets of dinucleotide property values are invited to submit these data to DiProDB.


Subject(s)
DNA/chemistry , Databases, Nucleic Acid , Dinucleoside Phosphates/chemistry , RNA/chemistry , Nucleic Acid Conformation , Thermodynamics , User-Computer Interface
5.
RNA ; 14(4): 616-29, 2008 Apr.
Article in English | MEDLINE | ID: mdl-18268022

ABSTRACT

Many alternative splice events result in subtle mRNA changes, and most of them occur at short-distance tandem donor and acceptor sites. The splicing mechanism of such tandem sites likely involves the stochastic selection of either splice site. While tandem splice events are frequent, it is unknown how many are functionally important. Here, we use phylogenetic conservation to address this question, focusing on tandems with a distance of 3-9 nucleotides. We show that previous contradicting results on whether alternative or constitutive tandem motifs are more conserved between species can be explained by a statistical paradox (Simpson's paradox). Applying methods that take biases into account, we found higher conservation of alternative tandems in mouse, dog, and even chicken, zebrafish, and Fugu genomes. We estimated a lower bound for the number of alternative sites that are under purifying (negative) selection. While the absolute number of conserved tandem motifs decreases with the evolutionary distance, the fraction under selection increases. Interestingly, a number of frameshifting tandems are under selection, suggesting a role in regulating mRNA and protein levels via nonsense-mediated decay (NMD). An analysis of the intronic flanks shows that purifying selection also acts on the intronic sequence. We propose that stochastic splice site selection can be an advantageous mechanism that allows constant splice variant ratios in situations where a deviation in this ratio is deleterious.
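Simpson's paradox, which the abstract invokes to reconcile earlier contradictory results, is easiest to see with numbers. The Python sketch below uses purely illustrative toy counts (the classic textbook pattern, not data from the paper), and the two strata are hypothetical placeholders for whatever confounding variable separates the tandems.

```python
# Toy (conserved, total) counts per stratum and group. Illustrative only,
# NOT taken from the paper.
counts = {
    "stratum_1": {"alternative": (81, 87), "constitutive": (234, 270)},
    "stratum_2": {"alternative": (192, 263), "constitutive": (55, 80)},
}


def rate(conserved, total):
    """Fraction of motifs that are conserved."""
    return conserved / total


def pooled_rate(group):
    """Conservation rate after merging all strata for one group."""
    conserved = sum(counts[s][group][0] for s in counts)
    total = sum(counts[s][group][1] for s in counts)
    return conserved / total


for stratum, groups in counts.items():
    alt = rate(*groups["alternative"])
    con = rate(*groups["constitutive"])
    print(f"{stratum}: alternative {alt:.3f}, constitutive {con:.3f}")

print(f"pooled: alternative {pooled_rate('alternative'):.3f}, "
      f"constitutive {pooled_rate('constitutive'):.3f}")
```

In this toy data the alternative group has the higher rate inside each stratum, yet the lower rate once the strata are pooled, because the two groups are distributed unevenly across strata. A comparison that ignores such a bias can therefore point in the opposite direction from one that accounts for it.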


Subject(s)
Alternative Splicing , RNA Splice Sites , Animals , Base Sequence , Chickens , Computational Biology , Conserved Sequence , Databases, Nucleic Acid , Dogs , Expressed Sequence Tags , Humans , Introns , Macaca mulatta , Mice , Phylogeny , RNA, Messenger/genetics , Selection, Genetic , Tandem Repeat Sequences
6.
Nucleic Acids Res ; 35(Web Server issue): W688-93, 2007 Jul.
Article in English | MEDLINE | ID: mdl-17537825

ABSTRACT

BioBayesNet is a new web application that allows the easy modeling and classification of biological data using Bayesian networks. To learn Bayesian networks the user can either upload a set of annotated FASTA sequences or a set of pre-computed feature vectors. In case of FASTA sequences, the server is able to generate a wide range of sequence and structural features from the sequences. These features are used to learn Bayesian networks. An automatic feature selection procedure assists in selecting discriminative features, providing a (locally) optimal set of features. The output includes several quality measures of the overall network and individual features as well as a graphical representation of the network structure, which allows users to explore dependencies between features. Finally, the learned Bayesian network or another uploaded network can be used to classify new data. BioBayesNet facilitates the use of Bayesian networks in biological sequence analysis and is flexible enough to support modeling and classification applications in various scientific fields. The BioBayesNet server is available at http://biwww3.informatik.uni-freiburg.de:8080/BioBayesNet/.


Subject(s)
Algorithms , Computational Biology/methods , Models, Molecular , Pattern Recognition, Automated/methods , Proteins/analysis , Proteins/chemistry , Sequence Analysis, Protein/methods , Amino Acid Sequence , Bayes Theorem , Computer Simulation , Information Storage and Retrieval , Internet , Molecular Sequence Data , Protein Conformation , Protein Folding , Proteins/classification , Structure-Activity Relationship
7.
Nucleic Acids Res ; 35(Database issue): D188-92, 2007 Jan.
Article in English | MEDLINE | ID: mdl-17142241

ABSTRACT

Subtle alternative splice events at tandem splice sites are frequent in eukaryotes and substantially increase the complexity of transcriptomes and proteomes. We have developed a relational database, TassDB (TAndem Splice Site DataBase), which stores extensive data about alternative splice events at GYNGYN donors and NAGNAG acceptors. These splice events are of subtle nature since they mostly result in the insertion/deletion of a single amino acid or the substitution of one amino acid by two others. Currently, TassDB contains 114 554 tandem splice sites of eight species, 5209 of which have EST/mRNA evidence for alternative splicing. In addition, human SNPs that affect NAGNAG acceptors are annotated. The database provides a user-friendly interface to search for specific genes or for genes containing tandem splice sites with specific features as well as the possibility to download large datasets. This database should facilitate further experimental studies and large-scale bioinformatics analyses of tandem splice sites. The database is available at http://helios.informatik.uni-freiburg.de/TassDB/.
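The GYNGYN and NAGNAG patterns named in the abstract are written in IUPAC nucleotide codes (N = any base, Y = C or T). As a minimal sketch, the Python snippet below scans a DNA string for such patterns. It only matches the motif text and says nothing about whether a hit is a real splice site, which would require intron and exon annotation. It is not TassDB code, and the function names are hypothetical.

```python
import re

# IUPAC codes needed for the two motifs, expressed as regex fragments.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "Y": "[CT]"}


def motif_to_regex(motif):
    """Compile an IUPAC motif into a lookahead regex so that
    overlapping occurrences are all reported."""
    body = "".join(IUPAC[code] for code in motif)
    return re.compile(f"(?=({body}))")


def find_motif(sequence, motif):
    """Return (0-based start, matched text) for every occurrence."""
    pattern = motif_to_regex(motif)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


print(find_motif("TTCAGCAGGT", "NAGNAG"))
print(find_motif("AAGTAGTGAA", "GYNGYN"))
```

The lookahead form is used because plain `finditer` would skip matches that overlap an earlier one, which matters for repeats such as CAGCAGCAG.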


Subject(s)
Alternative Splicing , Databases, Nucleic Acid , RNA Splice Sites , Animals , Genomics , Humans , Internet , Mice , Polymorphism, Single Nucleotide , Rats , User-Computer Interface
8.
J Bioinform Comput Biol ; 4(2): 609-20, 2006 Apr.
Article in English | MEDLINE | ID: mdl-16819806

ABSTRACT

We present a new classification scheme of the genetic code. In contrast to the standard form it clearly shows five codon symmetries: codon-anticodon, codon-reverse codon, and sense-antisense symmetry, as well as symmetries with respect to purine-pyrimidine (A versus G, U versus C) and keto-aminobase (G versus U, A versus C) exchanges. We study the number of tRNA genes of 16 archaea, 81 bacteria and 7 eucaryotes to analyze whether these symmetries are reflected in the corresponding tRNA usage patterns. Two features are especially striking: reverse stop codons do not have their own tRNAs (just one exception in human), and A** anticodons are significantly suppressed. Our classification scheme of the genetic code and the identified tRNA usage patterns support recent speculations about the early evolution of the genetic code. In particular, pre-tRNAs might have had the ability to bind in two directions to the corresponding codons.
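Several of the codon transformations listed in the abstract can be written as simple string operations. The Python sketch below is an illustration under stated assumptions: codons are RNA strings written 5' to 3', the anticodon is taken to be the reverse complement (also written 5' to 3'), and the exchange tables follow the pairs given in the abstract. The function names are hypothetical and the paper's classification scheme itself is not reproduced.

```python
# Translation tables for the base exchanges named in the abstract.
COMPLEMENT = str.maketrans("ACGU", "UGCA")         # Watson-Crick pairing
PURINE_PYRIMIDINE = str.maketrans("AGUC", "GACU")  # A <-> G, U <-> C
KETO_AMINO = str.maketrans("GUAC", "UGCA")         # G <-> U, A <-> C

STOP_CODONS = ("UAA", "UAG", "UGA")


def reverse_codon(codon):
    """Read the codon in the opposite direction."""
    return codon[::-1]


def anticodon(codon):
    """Reverse complement of the codon, written 5' to 3' (assumed convention)."""
    return codon.translate(COMPLEMENT)[::-1]


def purine_pyrimidine_exchange(codon):
    return codon.translate(PURINE_PYRIMIDINE)


def keto_amino_exchange(codon):
    return codon.translate(KETO_AMINO)


reverse_stops = [reverse_codon(c) for c in STOP_CODONS]
print(reverse_stops)
print(anticodon("AUG"), purine_pyrimidine_exchange("AUG"), keto_amino_exchange("AUG"))
```

Each transformation is its own inverse, so applying it twice returns the original codon. The `reverse_stops` list gives the reverse stop codons that the abstract reports as lacking their own tRNAs.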


Subject(s)
Biological Evolution , Chromosome Mapping/methods , Codon/genetics , Evolution, Molecular , Genetic Code/genetics , Models, Genetic , RNA, Transfer/genetics , Computer Simulation