1.
Nat Commun ; 14(1): 6580, 2023 10 18.
Article in English | MEDLINE | ID: mdl-37852981

ABSTRACT

Spliceosomal snRNPs are multicomponent particles that undergo a complex maturation pathway. Human Sm-class snRNAs are generated as 3'-end extended precursors, which are exported to the cytoplasm and assembled together with Sm proteins into core RNPs by the SMN complex. Here, we provide evidence that these pre-snRNA substrates contain compact, evolutionarily conserved secondary structures that overlap with the Sm binding site. These structural motifs in pre-snRNAs are predicted to interfere with Sm core assembly. We model structural rearrangements that lead to an open pre-snRNA conformation compatible with Sm protein interaction. The predicted rearrangement pathway is conserved in Metazoa and requires an external factor that initiates snRNA remodeling. We show that the essential helicase Gemin3, which is a component of the SMN complex, is crucial for snRNA structural rearrangements during snRNP maturation. The SMN complex thus facilitates ATP-driven structural changes in snRNAs that expose the Sm site and enable Sm protein binding.


Subject(s)
RNA Precursors , RNA, Small Nuclear , Humans , RNA, Small Nuclear/metabolism , SMN Complex Proteins/metabolism , RNA Precursors/metabolism , HeLa Cells , Ribonucleoproteins, Small Nuclear/metabolism , snRNP Core Proteins/genetics
2.
Front Microbiol ; 13: 848536, 2022.
Article in English | MEDLINE | ID: mdl-35633709

ABSTRACT

Bacteria employ small non-coding RNAs (sRNAs) to regulate gene expression. Ms1 is an sRNA that binds to the RNA polymerase (RNAP) core and affects the intracellular level of this essential enzyme. Ms1 is structurally related to 6S RNA that binds to a different form of RNAP, the holoenzyme bearing the primary sigma factor. 6S RNAs are widespread in the bacterial kingdom except for the industrially and medicinally important Actinobacteria. While Ms1 RNA was identified in Mycobacterium, it is not clear whether Ms1 RNA is present also in other Actinobacteria species. Here, using a computational search based on secondary structure similarities combined with a linguistic gene synteny approach, we identified Ms1 RNA in Streptomyces. In S. coelicolor, Ms1 RNA overlaps with the previously annotated scr3559 sRNA with an unknown function. We experimentally confirmed that Ms1 RNA/scr3559 associates with the RNAP core without the primary sigma factor HrdB in vivo. Subsequently, we applied the computational approach to other Actinobacteria and identified Ms1 RNA candidates in 824 Actinobacteria species, revealing Ms1 RNA as a widespread class of RNAP binding sRNAs, and demonstrating the ability of our multifactorial computational approach to identify weakly conserved sRNAs in evolutionarily distant genomes.

3.
Bioinformatics ; 37(17): 2755-2756, 2021 Sep 09.
Article in English | MEDLINE | ID: mdl-33523120

ABSTRACT

SUMMARY: We present a web service for improving characterization of non-coding RNAs (ncRNAs) from NCBI BLAST outputs, based on the command-line application rboAnalyzer. Briefly, the application extends subject sequences of selected high-scoring segment pairs (HSPs) in BLAST output to their plausible full length, and predicts their homology and secondary structures. The aim of the application is to help characterize subject RNAs in HSPs that come uncharacterized in BLAST output. The main advantages of the web server are ease of use and interactive analysis with search, filtering and data export options. AVAILABILITY AND IMPLEMENTATION: The web server is freely available at rboanalyzer.elixir-czech.cz. The website frontend is implemented in Elm, while the backend is implemented in Python and served by Apache.

4.
Front Genet ; 11: 675, 2020.
Article in English | MEDLINE | ID: mdl-32849767

ABSTRACT

Searching for similar sequences in a database via BLAST or a similar tool is one of the most common bioinformatics tasks applied in general, and to non-coding RNAs in particular. However, the results of the search might be difficult to interpret due to the presence of partial matches to the database subject sequences. Here, we present rboAnalyzer - a tool that helps with interpreting sequence search results by (1) extending partial matches into plausible full-length subject sequences, (2) predicting homology of RNAs represented by full-length subject sequences to the query RNA, (3) pooling information across homologous RNAs found in the search results and public databases such as Rfam to predict more reliable secondary structures for all matches, and (4) contextualizing the matches by providing the prediction results and other relevant information in a rich graphical output. Using predicted full-length matches improves secondary structure prediction and makes rboAnalyzer robust with regard to identification of homology. The output of the tool should help the user to reliably characterize non-coding RNAs in BLAST output. The usefulness of rboAnalyzer and its ability to correctly extend partial matches to full length are demonstrated on known homologous RNAs. To allow the user to use custom databases and search options, rboAnalyzer accepts any search results as a text file in the BLAST format. The main output is an interactive HTML page displaying the computed characteristics and other context of the matches. The output can also be exported in appropriate sequence and/or secondary structure formats.
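
Step (1) above, extending a partial match to a plausible full-length subject sequence, can be illustrated with a minimal coordinate-arithmetic sketch. This is not taken from rboAnalyzer itself: the function name `extend_hsp`, its parameters, and the restriction to plus-strand hits with 1-based inclusive coordinates are all assumptions made for illustration. The idea is simply to pad the subject range by the number of query residues left unaligned on each side, clipped to the subject boundaries.

```python
def extend_hsp(q_start, q_end, s_start, s_end, query_len, subject_len):
    """Extend a partial BLAST hit to the span a full-length match would cover.

    Hypothetical helper (not rboAnalyzer's API). Coordinates are 1-based and
    inclusive, and the hit is assumed to lie on the plus strand of the subject.
    """
    # Query residues not covered by the alignment on each side.
    left_missing = q_start - 1
    right_missing = query_len - q_end

    # Pad the subject range by the same amounts, clipped to the sequence ends.
    ext_start = max(1, s_start - left_missing)
    ext_end = min(subject_len, s_end + right_missing)
    return ext_start, ext_end


# A 100-nt query aligned only over positions 11-90.
print(extend_hsp(11, 90, 1011, 1090, query_len=100, subject_len=5000))
```

A purely coordinate-based extension like this ignores insertions and deletions in the unaligned flanks, which is one reason a tool might refine the extended region with an alignment step.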

5.
Database (Oxford) ; 2019, 2019 01 01.
Article in English | MEDLINE | ID: mdl-31032840

ABSTRACT

The secondary structure of RNA molecules provides insights into the identity and function of RNAs. With RNAs readily sequenced, the question of their structural characterization is increasingly important. However, RNA structure is difficult to acquire. Its experimental identification is extremely technically demanding, while computational prediction is not accurate enough, especially for large structures of long sequences. We address this difficult situation with rPredictorDB, a predictive database of RNA secondary structures that aims to form a middle ground between experimentally identified structures in PDB and predicted consensus secondary structures in Rfam. The database contains individual secondary structures predicted using a tool for template-based prediction of RNA secondary structure for the homologs of the RNA families with at least one homolog with an experimentally solved structure. Experimentally identified structures are used as the structural templates and thus the prediction has higher reliability than de novo predictions in Rfam. The sequences are downloaded from public resources. So far rPredictorDB covers 7365 RNAs with their secondary structures. Plots of the secondary structures use the Traveler package for readable display of RNAs with long sequences and complex structures, such as ribosomal RNAs. The RNAs in the output of rPredictorDB are extensively annotated and can be viewed, browsed, searched and downloaded according to taxonomic, sequence and structure data. Additionally, the structure of user-provided sequences can be predicted using the templates stored in rPredictorDB.


Subject(s)
Databases, Nucleic Acid , Nucleic Acid Conformation , RNA , Software , RNA/chemistry , RNA/genetics
6.
Bioinformatics ; 35(7): 1231-1233, 2019 04 01.
Article in English | MEDLINE | ID: mdl-30169571

ABSTRACT

SUMMARY: We present the cpPredictor webserver that implements a novel template-based method for prediction of secondary structure of RNA. The method outperforms available prediction methods as it uses RNA structures of related molecules, either predicted or experimentally identified, as structural templates. The server aims at three major tasks: i) prediction of RNA secondary structures that are difficult to predict by available methods, ii) characterization of uncharacterized RNAs as compatible or incompatible with a chosen template structure and iii) identification of the most relevant structure among different candidate structures of a single RNA ambiguously predicted by available methods. The web server is accompanied by comprehensive documentation. AVAILABILITY AND IMPLEMENTATION: The web server is freely available at http://cppredictor.elixir-czech.cz/. The source code of the cpPredictor algorithm is freely available from the webserver under the Apache License, Version 2.0.


Subject(s)
Nucleic Acid Conformation , Software , Algorithms , Internet , Protein Structure, Secondary , RNA , Sequence Analysis, RNA
7.
Nucleic Acids Res ; 46(7): 3774-3790, 2018 04 20.
Article in English | MEDLINE | ID: mdl-29415178

ABSTRACT

Cajal bodies (CBs) are nuclear non-membrane bound organelles where small nuclear ribonucleoprotein particles (snRNPs) undergo their final maturation and quality control before they are released to the nucleoplasm. However, the molecular mechanism by which immature snRNPs are targeted and retained in CBs has yet to be described. Here, we microinjected and expressed various snRNA deletion mutants as well as chimeric 7SK, Alu or bacterial SRP non-coding RNAs and provide evidence that Sm and SMN binding sites are necessary and sufficient for CB localization of snRNAs. We further show that Sm proteins, and specifically their GR-rich domains, are important for accumulating snRNPs in CBs. Accordingly, core snRNPs containing the Sm proteins, but not naked snRNAs, restore the formation of CBs after their depletion. Finally, we show that immature but not fully assembled snRNPs are able to induce CB formation and that microinjection of an excess of U2 snRNP-specific proteins, which promotes U2 snRNP maturation, chases U2 snRNA from CBs. We propose that the accessibility of the Sm ring represents the molecular basis for the quality control of the final maturation of snRNPs and the sequestration of immature particles in CBs.


Subject(s)
Cell Nucleus/genetics , RNA, Small Nuclear/genetics , Ribonucleoprotein, U2 Small Nuclear/genetics , Spliceosomes/genetics , Coiled Bodies/genetics , Coiled Bodies/metabolism , Gene Expression Regulation/genetics , HeLa Cells , Humans
8.
Front Genet ; 8: 147, 2017.
Article in English | MEDLINE | ID: mdl-29067038

ABSTRACT

While understanding the structure of RNA molecules is vital for deciphering their functions, determining RNA structures experimentally is exceptionally hard. At the same time, extant approaches to computational RNA structure prediction have limited applicability and reliability. In this paper we provide a method to solve a simpler yet still biologically relevant problem: prediction of RNA secondary structure using the structure of a different molecule as a template. Our method identifies conserved and unconserved subsequences within an RNA molecule. For conserved subsequences, the template structure is directly transferred into the generated structure and combined with de novo predicted structure for the unconserved subsequences. The method also determines when the generated structure is unreliable. The method is validated using experimentally identified structures. The accuracy of the method exceeds that of classical prediction algorithms and constrained prediction methods. This is demonstrated by comparison using a large number of heterogeneous RNAs. The presented method is fast and robust, and useful for various applications requiring knowledge of secondary structures of individual RNA sequences.

9.
RNA Biol ; 14(12): 1660-1667, 2017 12 02.
Article in English | MEDLINE | ID: mdl-28745933

ABSTRACT

Reinitiation after translation of short upstream ORFs (uORFs) represents one of the means of regulation of gene expression on the mRNA-specific level in response to changing environmental conditions. Over the years it has been shown, mainly in budding yeast, that its efficiency depends on cis-acting features occurring in sequences flanking reinitiation-permissive uORFs, the nature of their coding sequences, as well as protein factors acting in trans. We earlier demonstrated that the first two uORFs from the reinitiation-regulated yeast GCN4 mRNA leader carry specific structural elements in their 5' sequences that interact with the translation initiation factor eIF3 to prevent full ribosomal recycling after their translation. This interaction turned out to be instrumental in stabilizing the mRNA·40S post-termination complex, which is thus capable of eventually resuming scanning and reinitiating on the next AUG start site downstream. Recently, we also provided important in vivo evidence strongly supporting the long-standing idea that to stimulate reinitiation, eIF3 has to remain bound to ribosomes elongating these uORFs until their stop codon has been reached. Here we examined the importance of eIF3 and sequences flanking uORF1 of the human functional homolog of yeast GCN4, ATF4, in stimulation of efficient reinitiation. We revealed that the molecular basis of the reinitiation mechanism is conserved between yeasts and humans.


Subject(s)
Eukaryotic Initiation Factor-3/metabolism , Open Reading Frames , Peptide Chain Initiation, Translational , Activating Transcription Factor 4/chemistry , Activating Transcription Factor 4/metabolism , Animals , Eukaryotic Initiation Factor-3/chemistry , Humans , Mammals , Protein Biosynthesis , RNA, Messenger/genetics , RNA, Messenger/metabolism , Ribosomes/metabolism
10.
RNA ; 22(7): 957-67, 2016 07.
Article in English | MEDLINE | ID: mdl-27190231

ABSTRACT

Nucleic acid sequence complementarity underlies many fundamental biological processes. Although first noticed a long time ago, sequence complementarity between mRNAs and ribosomal RNAs still lacks a meaningful biological interpretation. Here we used statistical analysis of large-scale sequence data sets and high-throughput computing to explore complementarity between 18S and 28S rRNAs and mRNA 3' UTR sequences. By the analysis of 27,646 full-length 3' UTR sequences from 14 species covering both protozoans and metazoans, we show that the computed 18S rRNA complementarity creates an evolutionarily conserved localization pattern centered around the ribosomal mRNA entry channel, suggesting its biological relevance and functionality. Based on this specific pattern and earlier data showing that post-termination 80S ribosomes are not stably anchored at the stop codon and can migrate in both directions to codons that are cognate to the P-site deacylated tRNA, we propose that the 18S rRNA-mRNA complementarity selectively stabilizes post-termination ribosomal complexes to facilitate ribosome recycling. We thus demonstrate that the complementarity between 18S rRNA and 3' UTRs has a non-random nature and very likely carries information with a regulatory potential for translational control.


Subject(s)
3' Untranslated Regions , Protein Biosynthesis/physiology , RNA, Ribosomal/physiology , Terminator Regions, Genetic , Animals , Codon , RNA, Ribosomal/chemistry
11.
Nucleic Acids Res ; 42(18): 11763-76, 2014 Oct.
Article in English | MEDLINE | ID: mdl-25217589

ABSTRACT

Small RNAs (sRNAs) are molecules essential for a number of regulatory processes in the bacterial cell. Here we characterize Ms1, an sRNA that is highly expressed in Mycobacterium smegmatis during stationary phase of growth. By glycerol gradient ultracentrifugation, RNA binding assay, and RNA co-immunoprecipitation, we show that Ms1 interacts with the RNA polymerase (RNAP) core that is free of the primary sigma factor (σA) or any other σ factor. This contrasts with the situation in most other species where it is 6S RNA that interacts with RNAP and this interaction requires the presence of σA. The difference in the interaction of the two types of sRNAs (Ms1 or 6S RNA) with RNAP possibly reflects the difference in the composition of the transcriptional machinery between mycobacteria and other species. Unlike Escherichia coli, stationary phase M. smegmatis cells contain relatively few RNAP molecules in complex with σA. Thus, Ms1 represents a novel type of small RNA interacting with RNAP.


Subject(s)
DNA-Directed RNA Polymerases/metabolism , Mycobacterium smegmatis/genetics , RNA, Small Untranslated/metabolism , Chromosomes, Bacterial , Mycobacterium/genetics , Mycobacterium smegmatis/enzymology , Mycobacterium smegmatis/growth & development , Nucleic Acid Conformation , RNA, Small Untranslated/chemistry , RNA, Small Untranslated/genetics , Sigma Factor/metabolism , Synteny
12.
Nucleic Acids Res ; 41(16): 7625-34, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23804757

ABSTRACT

There are several key mechanisms regulating eukaryotic gene expression at the level of protein synthesis. Interestingly, the least explored mechanisms of translational control are those that involve the translating ribosome per se, mediated for example via predicted interactions between the ribosomal RNAs (rRNAs) and mRNAs. Here, we took advantage of robustly growing large-scale data sets of mRNA sequences for numerous organisms, solved ribosomal structures and computational power to computationally explore the mRNA-rRNA complementarity that is statistically significant across the species. Our predictions reveal highly specific sequence complementarity of 18S rRNA sequences with mRNA 5' untranslated regions (UTRs) forming a well-defined 3D pattern on the rRNA sequence of the 40S subunit. Broader evolutionary conservation of this pattern may imply that 5' UTRs of eukaryotic mRNAs, which have already emerged from the mRNA-binding channel, may contact several complementary spots on 18S rRNA situated near the exit of the mRNA binding channel and on the middle-to-lower body of the solvent-exposed 40S ribosome including its left foot. We discuss physiological significance of this structurally conserved pattern and, in the context of previously published experimental results, propose that it modulates scanning of the 40S subunit through 5' UTRs of mRNAs.


Subject(s)
5' Untranslated Regions , Evolution, Molecular , Gene Expression Regulation , Protein Biosynthesis , RNA, Ribosomal, 18S/chemistry , Animals , Base Sequence , Cattle , Conserved Sequence , Humans , RNA, Messenger/chemistry , RNA, Ribosomal, 28S/chemistry , Rats , Ribosome Subunits, Small, Eukaryotic/chemistry
13.
PLoS Genet ; 7(7): e1002137, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21750682

ABSTRACT

Reinitiation is a gene-specific translational control mechanism characterized by the ability of some short upstream ORFs (uORFs) to retain post-termination 40S subunits on mRNA. Its efficiency depends on surrounding cis-acting sequences, uORF elongation rates, various initiation factors, and the intercistronic distance. To unravel effects of cis-acting sequences, we investigated previously unconsidered structural properties of one such cis-enhancer in the mRNA leader of GCN4 using yeast genetics and biochemistry. This leader contains four uORFs but only uORF1, flanked by two transferrable 5' and 3' cis-acting sequences, allows efficient reinitiation. Recently we showed that the 5' cis-acting sequences stimulate reinitiation by interacting with the N-terminal domain (NTD) of the eIF3a/TIF32 subunit of the initiation factor eIF3 to stabilize post-termination 40S subunits on uORF1 to resume scanning downstream. Here we identify four discernible reinitiation-promoting elements (RPEs) within the 5' sequences making up the 5' enhancer. Genetic epistasis experiments revealed that two of these RPEs operate in an eIF3a/TIF32-dependent manner. Likewise, two separate regions in the eIF3a/TIF32-NTD were identified that stimulate reinitiation in concert with the 5' enhancer. Computational modeling supported by experimental data suggests that, in order to act, the 5' enhancer must progressively fold into a specific secondary structure while the ribosome scans through it prior to uORF1 translation. Finally, we demonstrate that the 5' enhancer's stimulatory activity is strictly dependent on and thus follows the 3' enhancer's activity. These findings allow us to propose for the first time a model of events required for efficient post-termination resumption of scanning. Strikingly, a structurally similar RPE was also predicted and identified in the 5' leader of the reinitiation-permissive uORF of yeast YAP1. The fact that it likewise operates in an eIF3a/TIF32-dependent manner strongly suggests that at least in yeasts the underlying mechanism of reinitiation on short uORFs is conserved.


Subject(s)
Eukaryotic Initiation Factor-3 , Open Reading Frames/genetics , RNA, Messenger , Ribosome Subunits, Small, Eukaryotic/metabolism , Ribosomes , Saccharomyces cerevisiae Proteins , 5' Flanking Region , 5' Untranslated Regions , Base Sequence , Basic-Leucine Zipper Transcription Factors/genetics , Basic-Leucine Zipper Transcription Factors/metabolism , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Enhancer Elements, Genetic , Eukaryotic Initiation Factor-3/genetics , Eukaryotic Initiation Factor-3/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , Regulatory Sequences, Nucleic Acid , Ribosomal Proteins/genetics , Ribosomal Proteins/metabolism , Ribosome Subunits, Small, Eukaryotic/genetics , Ribosomes/genetics , Ribosomes/metabolism , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae Proteins/metabolism , Transcription Factors/genetics , Transcription Factors/metabolism
14.
Nucleic Acids Res ; 39(8): 3418-26, 2011 Apr.
Article in English | MEDLINE | ID: mdl-21193488

ABSTRACT

Non-coding RNAs (ncRNAs) are regulatory molecules encoded in the intergenic or intragenic regions of the genome. In prokaryotes, biocomputational identification of homologs of known ncRNAs in other species often fails due to weakly evolutionarily conserved sequences, structures, synteny and genome localization, except in the case of evolutionarily closely related species. To eliminate results from weak conservation, we focused on RNA structure, which is the most conserved ncRNA property. Analysis of the structure of one of the few well-studied bacterial ncRNAs, 6S RNA, demonstrated that unlike optimal and consensus structures, suboptimal structures are capable of capturing RNA homology even in divergent bacterial species. A computational procedure for the identification of homologous ncRNAs using suboptimal structures was created. The suggested procedure was applied to strongly divergent bacterial species and was capable of identifying homologous ncRNAs.


Subject(s)
RNA, Bacterial/chemistry , RNA, Untranslated/chemistry , Base Sequence , Molecular Sequence Data , Mycobacterium/genetics , Nucleic Acid Conformation , Sequence Homology, Nucleic Acid , Streptomyces/genetics
15.
Nucleic Acids Res ; 38(14): 4579-85, 2010 Aug.
Article in English | MEDLINE | ID: mdl-20371515

ABSTRACT

Post-transcriptional control of mRNA by micro-RNAs (miRNAs) represents an important mechanism of gene regulation. miRNAs act by binding to the 3' untranslated region (3'UTR) of an mRNA, affecting the stability and translation of the target mRNA. Here, we present a numerical model of miRNA-mediated mRNA downregulation and its application to analysis of temporal microarray data of HepG2 cells transfected with miRNA-124a. Using the model our analysis revealed a novel mechanism of mRNA accumulation control by miRNA, predicting that specific mRNAs are controlled in a digital, switch-like manner. Specifically, the contribution of miRNAs to mRNA degradation is switched from maximum to zero in a very short period of time. Such behaviour suggests a model of control in which mRNA is at a certain moment protected from binding of miRNA and further accumulates with a basal rate. Genes associated with this process were identified and parameters of the model for all miRNA-124a affected mRNAs were computed.
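
The switch-like behaviour described above can be illustrated with a toy simulation. The abstract does not give the model's equations, so everything here is an assumption for illustration: the function name `simulate_mrna`, the first-order kinetics dm/dt = synthesis − (basal_decay + mirna_decay(t))·m, and the parameter values. The only feature taken from the text is that the miRNA contribution to degradation drops from its maximum to zero at a single time point, after which the mRNA accumulates at its basal rate.

```python
def simulate_mrna(m0, synthesis, basal_decay, mirna_decay, t_switch, t_end, dt=0.01):
    """Euler integration of a toy mRNA model with a digital miRNA switch.

    Hypothetical sketch, not the published model. The miRNA adds
    `mirna_decay` to the degradation rate until `t_switch`, then nothing.
    Returns a list of (time, mRNA level) pairs.
    """
    m = m0
    trajectory = [(0.0, m)]
    steps = int(round(t_end / dt))
    for i in range(steps):
        t = i * dt
        # Digital switch: full miRNA-mediated decay before t_switch, none after.
        decay = basal_decay + (mirna_decay if t < t_switch else 0.0)
        m += dt * (synthesis - decay * m)
        trajectory.append(((i + 1) * dt, m))
    return trajectory


# Start at the untreated steady state (synthesis / basal_decay = 1.0).
traj = simulate_mrna(m0=1.0, synthesis=1.0, basal_decay=1.0,
                     mirna_decay=2.0, t_switch=2.0, t_end=10.0)
lowest = min(level for _, level in traj)
print(f"lowest level: {lowest:.3f}, final level: {traj[-1][1]:.3f}")
```

With these assumed parameters the mRNA level falls while the miRNA term is active and returns toward the untreated steady state once the switch turns it off.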


Subject(s)
Gene Silencing , MicroRNAs/metabolism , Models, Genetic , RNA, Messenger/metabolism , Cell Line , Down-Regulation , Humans , Oligonucleotide Array Sequence Analysis , RNA Stability
16.
J Theor Biol ; 254(2): 301-7, 2008 Sep 21.
Article in English | MEDLINE | ID: mdl-18621060

ABSTRACT

Various sources of protein data, such as knowledgebases and scientific literature, are currently available, as are numerous tools for their analysis. The matter becomes one of choosing the tools that are most appropriate for the specific task and for the specific proteins. A combination of standard and alternative tools may lead to biologically significant results. Here, a computational classification of proteins is made using standard multiple sequence alignment in combination with an alternative method for analysis of hydropathy distribution in proteins. Both of these methods are applied to the Na⁺/Cl⁻-dependent neurotransmitter symporters (NSSs), resulting in two alternative classifications. The classifications are validated and interpreted biologically by literature and knowledgebase annotation mining, producing a consensus classification. The classification leads to the identification and functional characterization of three families of largely structurally and functionally uncharacterized orphan NSSs. The literature and knowledgebase annotations are mined to functionally characterize the NSSs in these families. The presented work also demonstrates that, in specific cases, the analysis of the hydropathy distribution in proteins is capable of revealing functional properties of proteins.


Subject(s)
Computational Biology/methods , Plasma Membrane Neurotransmitter Transport Proteins/classification , Animals , Databases, Protein , Hydrophobic and Hydrophilic Interactions , Knowledge Bases , Plasma Membrane Neurotransmitter Transport Proteins/metabolism , Protein Interaction Mapping/classification , Sequence Alignment , Sequence Analysis, Protein/methods
17.
BMC Genomics ; 9: 217, 2008 May 13.
Article in English | MEDLINE | ID: mdl-18477385

ABSTRACT

BACKGROUND: The first systematic study of small non-coding RNAs (sRNA, ncRNA) in Streptomyces is presented. With a few exceptions, the Streptomyces sRNAs, as well as the sRNAs in other genera of the Actinomyces group, have remained unstudied. This study was based on sequence conservation in intergenic regions of Streptomyces, localization of transcription termination factors, and genomic arrangement of genes flanking the predicted sRNAs. RESULTS: Thirty-two potential sRNAs in Streptomyces were predicted. Of these, expression of 20 was detected by microarrays and RT-PCR. The prediction was validated by a structure based computational approach. Two predicted sRNAs were found to be terminated by transcription termination factors different from the Rho-independent terminators. One predicted sRNA was identified computationally with high probability as a Streptomyces 6S RNA. Out of the 32 predicted sRNAs, 24 were found to be structurally dissimilar from known sRNAs. CONCLUSION: Streptomyces is the largest genus of Actinomyces, whose sRNAs have not been studied. The Actinomyces is a group of bacterial species with unique genomes and phenotypes. Therefore, in Actinomyces, new unique bacterial sRNAs may be identified. The sequence and structural dissimilarity of the predicted Streptomyces sRNAs demonstrated by this study serve as the first evidence of the uniqueness of Actinomyces sRNAs.


Subject(s)
RNA, Bacterial/genetics , RNA, Untranslated/genetics , Streptomyces/genetics , Algorithms , Base Sequence , Computational Biology , DNA, Intergenic , Genome, Bacterial , Models, Molecular , Nucleic Acid Conformation , Oligonucleotide Array Sequence Analysis , RNA, Bacterial/chemistry , RNA, Untranslated/chemistry , Reverse Transcriptase Polymerase Chain Reaction , Species Specificity , Streptomyces coelicolor/genetics , Terminator Regions, Genetic
18.
Mol Membr Biol ; 24(4): 304-12, 2007.
Article in English | MEDLINE | ID: mdl-17520486

ABSTRACT

A novel alignment-free method for computing functional similarity of membrane proteins based on features of hydropathy distribution is presented. The features of hydropathy distribution are used to represent protein families as hydropathy profiles. The profiles statistically summarize the hydropathy distribution of member proteins. The summation is made by using hydropathy features that numerically represent structurally/functionally significant portions of protein sequences. The hydropathy profiles are numerical vectors that are points in a high dimensional 'hydropathy' space. Their similarities are identified by projection of the space onto principal axes. Here, the approach is applied to the secondary transporters. The analysis using the presented approach is validated by the standard classification of the secondary transporters. The presented analysis allows for prediction of function attributes for proteins of uncharacterized families of secondary transporters. The results obtained using the presented analysis may help to characterize unknown function attributes of secondary transporters. They also show that analysis of hydropathy distribution can be used for function prediction of membrane proteins.


Subject(s)
Hydrophobic and Hydrophilic Interactions , Models, Molecular , Proteins/chemistry , Amino Acid Sequence , Carrier Proteins , Computational Biology , Membrane Proteins , Proteins/physiology
19.
Proteins ; 58(4): 923-34, 2005 Mar 01.
Article in English | MEDLINE | ID: mdl-15645428

ABSTRACT

Structural similarity among proteins is reflected in the distribution of hydropathicity along the amino acids in the protein sequence. Similarities in the hydropathy distributions are obvious for homologous proteins within a protein family. They also were observed for proteins with related structures, even when sequence similarities were undetectable. Here we present a novel method that employs the hydropathy distribution in proteins for identification of (sub)families in a set of (homologous) proteins. We represent proteins as points in a generalized hydropathy space, represented by vectors of specifically defined features. The features are derived from hydropathy of the individual amino acids. Projection of this space onto principal axes reveals groups of proteins with related hydropathy distributions. The groups identified correspond well to families of structurally and functionally related proteins. We found that this method accurately identifies protein families in a set of proteins, or subfamilies in a set of homologous proteins. Our results show that protein families can be identified by the analysis of hydropathy distribution, without the need for sequence alignment.
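
The idea of representing a protein as a vector of hydropathy-derived features can be sketched as follows. The abstract does not specify the features used, so this is only an illustration under stated assumptions: it uses the widely known Kyte-Doolittle hydropathy scale, and the two helpers `hydropathy_profile` (sliding-window average) and `segment_features` (mean hydropathy over equal-width sequence segments) are hypothetical names and feature choices, not the authors' definitions. Fixed-length vectors of this kind could then be passed to a principal component analysis.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=3):
    """Sliding-window mean hydropathy along a protein sequence."""
    values = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    if len(values) < window:
        raise ValueError("sequence is shorter than the window")
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def segment_features(sequence, n_segments=4):
    """Fixed-length feature vector: mean hydropathy of equal-width segments.

    Illustrative feature choice (an assumption), giving every protein a
    vector of the same length regardless of sequence length.
    """
    values = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    n = len(values)
    if n < n_segments:
        raise ValueError("sequence is shorter than the number of segments")
    features = []
    for k in range(n_segments):
        start = k * n // n_segments
        end = (k + 1) * n // n_segments
        chunk = values[start:end]
        features.append(sum(chunk) / len(chunk))
    return features


# A hydrophobic N-terminal half followed by a charged C-terminal half.
print(segment_features("IIIIRRRR", n_segments=2))
```

Because the feature vector has the same length for every protein, no sequence alignment is needed before comparing proteins, which is the property the abstract emphasizes.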


Subject(s)
Membrane Proteins/chemistry , Peptides/chemistry , Proteins/chemistry , Proteomics/methods , Algorithms , Amino Acid Sequence , Amino Acids/chemistry , Cathepsins/chemistry , Cell Membrane/metabolism , Cluster Analysis , Databases, Protein , Models, Statistical , Molecular Sequence Data , Protein Conformation , Protein Structure, Secondary , Protein Structure, Tertiary , Sequence Alignment , Sequence Homology, Amino Acid