ABSTRACT
This article provides an overview of studied RNA editing cases and gives examples of RNA editing investigations based on different types of experimental data. Typical errors in RNA editing site prediction and methods to minimize them are shown. The outlook for current technologies and further RNA editing studies is discussed.
Subject(s)
RNA Editing
ABSTRACT
Basic bioinformatical analysis of the draft Euplotes crassus macronuclear genome and transcriptome suggests that more than a quarter of E. crassus genes contain several exons. A large fraction of all introns is formed by "tiny" introns having length 20-30 bp. Analysis of the transcriptome revealed 63 possible cases of alternative splicing, and also 14 introns with non-standard splicing sites. About 2000 hypothetical genes do not have homologs in other ciliates, and since most of them have the closest homologs in bacterial genomes, they are likely an artifact of the sample preparation. Comparison of the E. crassus genome to the genomes of other ciliates showed an expansion of the same gene families, responsible for the free-living heterotrophic lifestyle.
Subject(s)
Ciliophora/genetics , Genes, Protozoan/physiology , Genome, Protozoan/physiology , Introns/physiology , Macronucleus/genetics , Alternative Splicing/physiology , Sequence Analysis, DNA/methods , Transcription, Genetic/physiology
ABSTRACT
We studied 1372 LacI-family transcription factors and their 4484 DNA binding sites using machine learning algorithms and feature selection techniques. The Naive Bayes classifier and Logistic Regression were used to predict binding sites given transcription factor sequences and to classify factor-site pairs into binding and non-binding ones. Prediction accuracy was estimated using 10-fold cross-validation. Experiments showed that the best prediction of nucleotide densities at selected site positions is obtained using only a few key protein sequence positions. These positions are stably selected by forward feature selection based on the mutual information of factor-site position pairs.
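The selection criterion mentioned above can be illustrated with a minimal sketch. It assumes aligned protein sequences paired with aligned binding sites, computes the mutual information between one protein column and one site column, and ranks protein columns by that score. The function names and the toy data are hypothetical, and only the scoring step is shown, not the full greedy forward selection or the classifiers used in the study.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equally long symbol sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def rank_protein_positions(proteins, sites, site_pos):
    """Rank protein alignment columns by mutual information with one site column."""
    site_col = [s[site_pos] for s in sites]
    scores = []
    for i in range(len(proteins[0])):
        prot_col = [p[i] for p in proteins]
        scores.append((mutual_information(prot_col, site_col), i))
    # Highest mutual information first.
    return sorted(scores, reverse=True)


# Toy data (hypothetical): protein column 1 fully determines site column 0.
proteins = ["AKL", "ARL", "AKL", "ARL"]
sites = ["TG", "CG", "TG", "CG"]
ranking = rank_protein_positions(proteins, sites, site_pos=0)
best_score, best_pos = ranking[0]
```

In a forward selection scheme, the top-ranked column would be added to the feature set and the remaining columns re-scored given the columns already chosen; that conditional step is omitted here.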
Subject(s)
Artificial Intelligence , DNA/metabolism , Lac Repressors/metabolism , Sequence Analysis, DNA/methods , Algorithms , Bayes Theorem , Binding Sites , Computational Biology , DNA/chemistry , Data Interpretation, Statistical , Databases, Protein , Lac Repressors/chemistry , Protein Binding , Sequence Alignment , Sequence Analysis, DNA/statistics & numerical data
ABSTRACT
BltR is a MerR-family transcriptional factor, experimentally characterized in Bacillus subtilis. It activates transcription of genes encoding the multidrug transporter Blt and the spermine/spermidine acetyltransferase BltD. Here we studied BltR-dependent regulons in 25 bacterial genomes using the comparative genomic approach. The structure of the promoter regions of regulated genes is typical for MerR-family activators: the binding sites are located in long spacers between promoter elements. Regulated genes are usually co-localized with regulator genes and are divergently transcribed with them. The studied transcriptional factors regulate the transcription of multidrug transporter and spermine/spermidine acetyltransferase genes. These transporters can be either secondary or ATP-dependent. The phylogenetic analysis demonstrated that their role as multidrug transporters is conserved.
Subject(s)
Bacterial Proteins/metabolism , DNA-Binding Proteins/metabolism , Drug Resistance, Multiple, Bacterial/genetics , Gene Expression Regulation, Bacterial , Genome, Bacterial , Regulon/genetics , Trans-Activators/metabolism , Acetyltransferases/genetics , Acetyltransferases/metabolism , Bacillus subtilis/genetics , Bacterial Proteins/genetics , Binding Sites/genetics , Computational Biology , DNA-Binding Proteins/genetics , Multigene Family/genetics , Phylogeny , Promoter Regions, Genetic/genetics , Trans-Activators/genetics
Subject(s)
Alternative Splicing/physiology , Phosphopeptides/genetics , Databases, Protein , HeLa Cells , Humans , Phosphorylation
Subject(s)
Aphids/metabolism , Aphids/microbiology , Buchnera/metabolism , Symbiosis/physiology , Animals , Aphids/genetics , Buchnera/genetics
ABSTRACT
Microarrays are widely used for gene expression profiling. In the case of prokaryotes, such arrays usually provide data about the composition of modulons, groups of genes whose expression is influenced by a single regulatory system or external stimulus. Unlike modulons, regulons include only genes directly controlled by regulatory systems. Here we compared the structures of the Fnr and ArcA modulons and regulons. The data about modulon composition were taken from published microarray assays, whereas regulons were characterized using comparative genomic approaches. The Fnr and ArcA regulons were shown to contain 26 and 16 operons, respectively. Ten operons had high-scoring, highly conserved sites for both Fnr and ArcA. The genes of these operons form the "core of regulons". Remarkably, all "core genes" encode enzymes involved in aerobic respiration and central metabolism. The Fnr-ArcA regulatory cascade plays an important role in expansion of the Fnr modulon.
Subject(s)
Bacterial Proteins/biosynthesis , Enterobacteriaceae/metabolism , Genome, Bacterial , Operon , Regulon , Bacterial Outer Membrane Proteins/biosynthesis , Bacterial Outer Membrane Proteins/genetics , Bacterial Proteins/genetics , Energy Metabolism , Enterobacteriaceae/genetics , Escherichia coli/genetics , Escherichia coli/metabolism , Escherichia coli Proteins/biosynthesis , Escherichia coli Proteins/genetics , Iron-Sulfur Proteins/biosynthesis , Iron-Sulfur Proteins/genetics , Oligonucleotide Array Sequence Analysis , Repressor Proteins/biosynthesis , Repressor Proteins/genetics
ABSTRACT
The currently available data on protein sequences largely exceed the experimental capacity to annotate their function. Annotation in silico, i.e. using computational methods, is therefore becoming increasingly important. This annotation is inevitably a prediction, but it can be an important starting point for further experimental studies. Here we present a method for prediction of protein functional sites, SDPsite, based on the identification of protein specificity determinants. Taking as input a protein sequence alignment and a phylogenetic tree, the algorithm predicts conserved positions and specificity determinants, maps them onto the protein's 3D structure, and searches for clusters of the predicted positions. Comparison of the obtained predictions with experimental data and with data on the performance of several other methods for prediction of functional sites reveals that SDPsite agrees well with experiment and outperforms most of the previously available methods. SDPsite is publicly available at http://bioinf.fbb.msu.ru/SDPsite.
Subject(s)
Algorithms , Models, Molecular , Phylogeny , Proteins/metabolism , Sequence Analysis, Protein , Software , Computational Biology , Databases, Protein , Proteins/chemistry , Sensitivity and Specificity , Sequence Alignment , Sequence Homology, Amino Acid
ABSTRACT
Methionine is an essential amino acid and the universal N-terminal amino acid of proteins. The biosynthesis of methionine is extensively studied in various organisms that could be used in biotechnological production of methionine. Transcriptional regulation of the methionine synthesis in the Corynebacterium glutamicum genome is well studied. The McbR protein is a transcriptional regulator of methionine/cysteine biosynthesis genes. The operon structures for members of the McbR regulon also were predicted. We identified candidate regulatory proteins orthologous to McbR in six additional genomes of the order Actinomycetales. The McbR regulon as well as regulons of the orthologous regulators in related genomes were analyzed using the comparative genomics methods. The obtained data demonstrated multiple and diverse positional rearrangements even in the closest genomes despite early observations of the high level of genome stability in corynebacteria. Moreover, the comparison of the operon structures combined with the regulatory signal analysis allowed us to predict some new potential members of the McbR regulons, such as glutamine amidotransferase and methylthioadenosine nuclease.
Subject(s)
Corynebacterium glutamicum/metabolism , Cysteine/biosynthesis , Genome, Bacterial/physiology , Methionine/biosynthesis , Operon/physiology , Repressor Proteins/metabolism , Corynebacterium glutamicum/genetics , Cysteine/genetics , Methionine/genetics , Phylogeny , Repressor Proteins/genetics
ABSTRACT
The SOS-response system is a cascade of reactions induced by DNA damage in a cell. Genes participating in these reactions are regulated by the LexA protein, which binds to a specific sequence in their upstream regions. The criterion for selection of genes putatively responsible for the SOS-response is the presence of such a sequence. Genes with taxon-specific regulation in Enterobacteriales, Pasteurellales, Vibrionales, Pseudomonadales and Alteromonadales were analyzed using comparative genomic approaches. Some genes have conserved sites in the regulatory region and a suitable function, although their role in the SOS-response has not been studied experimentally. The list of such genes includes mfd, which encodes a product that repairs the template strand when DNA damage stalls transcription; VC0082, which encodes a recombinase; and VP2449, responsible for xenobiotic resistance. Overall, the content and evolution of the LexA regulon in gamma-proteobacteria are described here.
Subject(s)
DNA Damage/genetics , Gene Expression Regulation, Bacterial/physiology , Genes, Bacterial , Gram-Negative Bacteria/genetics , Response Elements/genetics , SOS Response, Genetics/genetics , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Gram-Negative Bacteria/metabolism , Serine Endopeptidases/genetics , Serine Endopeptidases/metabolism , Transcription, Genetic/physiology
ABSTRACT
We searched for new members of the TnrA and GlnR regulons controlling assimilation of nitrogen in gram-positive bacteria. We identified the regulatory signals for these transcription factors with consensuses ATGTNAWWWWWWWTNACAT for GlnR and TGTNAWWWWWWWTNACA for TnrA. We described the structure and found new potential members for the TnrA/GlnR regulons in Bacillus subtilis, B. licheniformis, Geobacillus kaustophilus, Oceanobacillus iheyensis, for the TnrA regulon in B. halodurans and for the GlnR regulons in Lactococcus lactis, Lactobacillus plantarum, Streptococcus pyogenes, S. pneumoniae, S. mutans, S. agalactiae, Enterococcus faecalis, Listeria monocytogenes, Staphylococcus aureus and St. epidermidis.
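As a rough illustration of how the two consensus sequences above could be used to scan a genome region, the sketch below expands IUPAC ambiguity codes (W = A or T, N = any base) into a regular expression and reports all, possibly overlapping, exact matches. Exact consensus matching is a simplification chosen here for brevity; the abstract does not state the search procedure that was actually used. The helper names and the test sequence are hypothetical.

```python
import re

# IUPAC nucleotide codes needed for the consensuses, plus a few common extras.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]", "S": "[CG]", "R": "[AG]", "Y": "[CT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}

GLNR_CONSENSUS = "ATGTNAWWWWWWWTNACAT"
TNRA_CONSENSUS = "TGTNAWWWWWWWTNACA"


def consensus_to_regex(consensus):
    """Build a regex that finds overlapping exact matches to an IUPAC consensus."""
    body = "".join(IUPAC[c] for c in consensus.upper())
    # A lookahead with a capture group lets finditer report overlapping hits.
    return re.compile("(?=(" + body + "))")


def find_sites(consensus, sequence):
    """Return (start, matched_text) for every match on the given strand."""
    pattern = consensus_to_regex(consensus)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


# Hypothetical test sequence containing one site that fits both consensuses.
seq = "CC" + "ATGTCAATTTTATTGACAT" + "GG"
glnr_hits = find_sites(GLNR_CONSENSUS, seq)
tnra_hits = find_sites(TNRA_CONSENSUS, seq)
```

Because the TnrA consensus is the GlnR consensus without its outermost A and T, every exact GlnR match also contains a TnrA match one position downstream of its start.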
Subject(s)
Gene Expression Regulation, Bacterial , Gram-Positive Bacteria/metabolism , Nitrogen/metabolism , Amino Acid Sequence , Bacillus subtilis/genetics , Bacillus subtilis/metabolism , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Genome, Bacterial , Gram-Positive Bacteria/genetics , Molecular Sequence Data , Operon/genetics , Regulon/genetics , Signal Transduction , Trans-Activators/genetics , Trans-Activators/metabolism
ABSTRACT
Seven hundred and ninety Drosophila melanogaster genes alternatively spliced in coding regions were considered together with their Drosophila pseudoobscura orthologs. It was found that nucleotide substitutions accumulate more intensively in alternative coding regions than in constitutive regions. Moreover, the evolutionary pattern of alternative regions depends significantly on their inclusion mechanism (use of alternative promoters, splicing sites or polyadenylation sites). The rate of synonymous substitutions varies more dramatically than that of nonsynonymous substitutions. Nucleotide substitution patterns in different classes of alternative regions of mammalian and Drosophila genes have little in common.
Subject(s)
Alternative Splicing/genetics , Evolution, Molecular , Genes, Insect/genetics , Open Reading Frames/genetics , Animals , Drosophila melanogaster , Mammals/genetics , Polyadenylation/genetics , Promoter Regions, Genetic/genetics , RNA 3' Polyadenylation Signals/genetics , RNA Splice Sites/genetics
ABSTRACT
EDAS, a database of alternatively spliced human genes, contains data on the alignment of proteins, mRNAs, and ESTs. It contains information on all exons and introns observed, as well as the elementary alternatives formed from them. The database makes it possible to filter the output data by changing the significance-level cut-off threshold. The database is accessible at http://www.gene-bee.msu.ru/edas/.
Subject(s)
Alternative Splicing/genetics , Databases, Nucleic Acid , Computational Biology , Exons/genetics , Expressed Sequence Tags , Humans , Introns/genetics
ABSTRACT
The review considers mechanisms of bacterial gene expression regulation based on the formation of alternative RNA structures, such as riboswitches, attenuators, T-boxes, etc. These structures are classified by mechanism of action. Evolution and interaction of regulatory systems are discussed.
Subject(s)
Bacteria/metabolism , RNA, Bacterial/chemistry , Aptamers, Nucleotide , Bacteria/genetics , Evolution, Molecular , Gene Expression Regulation, Bacterial , Nucleic Acid Conformation , RNA, Bacterial/metabolism , RNA-Binding Proteins/metabolism , Regulatory Sequences, Ribonucleic Acid
ABSTRACT
Nitrate and nitrite are the preferred respiratory oxidants under anaerobic conditions. In Escherichia coli, nitrate and nitrite respiration is controlled by the homologous transcriptional factors NarL and NarP. Although this system was intensively studied during the last two decades, the exact mechanisms of regulation and the structure of the NarL binding signals remained elusive. A comparative genomics approach showed that most gammaproteobacteria contain only the NarP protein. Regulog analysis revealed that the overall structure of NarP regulons varies among genomes and that only regulation of the nitrate and nitrite reduction system seems to be highly conserved. A correlation between changes in the respiration system and the presence of a single regulatory system was shown. Conserved NarP binding sites upstream of the fnr gene and of genes for aerobic metabolism point to a change in the role of NarP in respiration control during evolution. Thirty-five new regulog members were identified, and autoregulation of the narQP operon in Vibrionaceae genomes was predicted.
Subject(s)
Bacterial Proteins/genetics , Gammaproteobacteria/genetics , Gene Expression Regulation, Bacterial , Genome, Bacterial , Nitrates/metabolism , Nitrites/metabolism , Amino Acid Sequence , Bacterial Proteins/classification , DNA-Binding Proteins/genetics , Enterobacteriaceae/genetics , Escherichia coli Proteins/genetics , Evolution, Molecular , Gammaproteobacteria/metabolism , Genes, Bacterial , Genomics , Molecular Sequence Data , Oxygen Consumption , Pasteurellaceae/genetics , Phylogeny , Vibrionaceae/genetics
ABSTRACT
This review of original works on computer analysis of the human genome considers the development of methods to predict the exon-intron structure of genes and the analysis of alternative splicing. Prediction of the gene structure is based on homology between the gene product and a known protein or between the genomic sequences of the gene and its homolog from another organism. The methods were tested and proved highly efficient. Human gene splicing was analyzed with original methods and EST databases. Genes with alternative splicing were for the first time shown to account for no less than 35% of all genes. Alternative splicing was compared for the human and mouse genomes. Species-specific isoforms were demonstrated for 50% of alternatively spliced genes (25% of all genes).
Subject(s)
Exons , Genetics, Medical , Introns , Algorithms , Alternative Splicing , Animals , Databases, Genetic , Expressed Sequence Tags , Humans , Mice
ABSTRACT
Comparative computer-assisted analysis was used to study putative GlpR regulons responsible for metabolism of glycerol and glycerol-3-phosphate in genomes of alpha-, beta-, and gamma-proteobacteria. New palindromic GlpR-binding signals were identified in gamma-proteobacteria; the consensus sequences are TGTTCGATAACGAACA for Enterobacteriaceae, wTTTTCGTATACGAAAAw for Pseudomonadaceae, and AATGCTCGATCGAGCATT for Vibrionaceae. The signals in alpha- and beta-proteobacteria were also identified: they contained 3-4 direct TTTCGTT repeats separated by 3-4 nucleotide pairs.
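A DNA signal is palindromic when it equals its own reverse complement. The sketch below, offered only as an illustration, counts the positions at which a consensus differs from its reverse complement, treating the ambiguity code W (A or T) as self-complementary. The function names are hypothetical, and the three consensuses are copied as printed in the abstract.

```python
# Complement table; W (A or T) and N (any base) are their own complements.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C", "W": "W", "N": "N"}

GLPR_CONSENSUSES = {
    "Enterobacteriaceae": "TGTTCGATAACGAACA",
    "Pseudomonadaceae": "wTTTTCGTATACGAAAAw",
    "Vibrionaceae": "AATGCTCGATCGAGCATT",
}


def reverse_complement(seq):
    """Reverse complement of a DNA string (case-insensitive, returns upper case)."""
    return "".join(COMPLEMENT[b] for b in reversed(seq.upper()))


def palindrome_mismatches(seq):
    """Number of positions where seq differs from its reverse complement.

    Zero means a perfect palindrome; mismatches always come in symmetric pairs.
    """
    s = seq.upper()
    return sum(a != b for a, b in zip(s, reverse_complement(s)))


mismatches = {
    taxon: palindrome_mismatches(consensus)
    for taxon, consensus in GLPR_CONSENSUSES.items()
}
```

A count of zero indicates a perfect palindrome, while a small nonzero count indicates a signal that is symmetric except at a few positions.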
Subject(s)
Genome, Bacterial , Glycerophosphates/metabolism , Proteobacteria/genetics , Base Sequence , DNA Primers , Phylogeny , Repetitive Sequences, Nucleic Acid
ABSTRACT
Expression of many bacterial genes is regulated by the formation of alternative RNA secondary structures within the leader mRNA sequence. Our algorithm, designed to search for these structures based on analysis of a single nucleotide sequence, was applied to operons of amino acid biosynthesis in alpha- and gamma-proteobacteria. Attenuators of these operons are predicted for the genomes of some poorly studied gamma-proteobacteria, including Shewanella putrefaciens; attenuators of the tryptophan operon in some alpha-proteobacteria are also predicted.
Subject(s)
Gene Expression Regulation, Bacterial , Nucleic Acid Conformation , RNA, Bacterial/chemistry , RNA, Messenger/chemistry , Base Sequence , Molecular Sequence Data , Sequence Homology, Nucleic Acid , Shewanella putrefaciens/genetics
ABSTRACT
We suggest a new procedure to search for genes with horizontal transfer events in their evolutionary history. The search is based on analysis of the topological differences between the phylogenetic trees of gene (protein) groups and the corresponding phylogenetic species trees. Numeric values are introduced to measure the discrepancy between the trees. This approach was applied to analyze 40 prokaryotic genomes classified into 132 classes of orthologs. This resulted in a list of candidate genes for which the hypothesis of horizontal transfer during evolution appears plausible.
Subject(s)
Algorithms , Gene Transfer, Horizontal , Models, Genetic , Phylogeny , Archaeoglobus fulgidus/genetics , Buchnera/genetics , Deinococcus , Escherichia coli/genetics , Evolution, Molecular , Halobacterium/genetics
ABSTRACT
A cryptic plasmid from a soil strain of Bacillus subtilis was found to contain a sequence having the features of an IS element. Homologous sequences were also found in the chromosome of this strain and in the chromosomes of some other B. subtilis strains.