Results 1 - 20 of 24
1.
PLoS Genet ; 18(6): e1010245, 2022 06.
Article in English | MEDLINE | ID: mdl-35657999

ABSTRACT

LOTUS and Tudor domain containing proteins have critical roles in the germline. Proteins that contain these domains, such as Tejas/Tapas in Drosophila, help localize the Vasa helicase to the germ granules and facilitate piRNA-mediated transposon silencing. The homologous proteins in mammals, TDRD5 and TDRD7, are required during spermiogenesis. Until now, proteins containing both LOTUS and Tudor domains in Caenorhabditis elegans have remained elusive. Here we describe LOTR-1 (D1081.7), which derives its name from its LOTUS and Tudor domains. Interestingly, LOTR-1 docks next to P granules to colocalize with the broadly conserved Z-granule helicase, ZNFX-1. The Tudor domain of LOTR-1 is required for its Z-granule retention. Like znfx-1 mutants, lotr-1 mutants lose small RNAs from the 3' ends of WAGO and mutator targets, reminiscent of the loss of piRNAs from the 3' ends of piRNA precursor transcripts in mouse Tdrd5 mutants. Our work shows that LOTR-1 acts with ZNFX-1 to bring small RNA amplifying mechanisms towards the 3' ends of its RNA templates.


Subject(s)
Caenorhabditis elegans , Epigenesis, Genetic , Germ Cells , Animals , Caenorhabditis elegans/genetics , Caenorhabditis elegans/metabolism , Caenorhabditis elegans Proteins , Germ Cells/metabolism , RNA Helicases , RNA, Small Interfering/genetics , RNA, Small Interfering/metabolism , Tudor Domain
2.
Elife ; 10, 2021 07 05.
Article in English | MEDLINE | ID: mdl-34223818

ABSTRACT

We describe MIP-1 and MIP-2, novel paralogous C. elegans germ granule components that interact with the intrinsically disordered MEG-3 protein. These proteins promote P granule condensation, form granules independently of MEG-3 in the postembryonic germ line, and balance each other in regulating P granule growth and localization. MIP-1 and MIP-2 each contain two LOTUS domains and intrinsically disordered regions and form homo- and heterodimers. They bind and anchor the Vasa homolog GLH-1 within P granules and are jointly required for coalescence of MEG-3, GLH-1, and PGL proteins. Animals lacking MIP-1 and MIP-2 show temperature-sensitive embryonic lethality, sterility, and mortal germ lines. Germline phenotypes include defects in stem cell self-renewal, meiotic progression, and gamete differentiation. We propose that these proteins serve as scaffolds and organizing centers for ribonucleoprotein networks within P granules that help recruit and balance essential RNA processing machinery to regulate key developmental transitions in the germ line.


Subject(s)
Caenorhabditis elegans Proteins/metabolism , Caenorhabditis elegans/metabolism , Germ Cells/physiology , Intracellular Signaling Peptides and Proteins/metabolism , Animals , Caenorhabditis elegans/embryology , Caenorhabditis elegans Proteins/genetics , DEAD-box RNA Helicases/genetics , DEAD-box RNA Helicases/metabolism , Gene Expression Regulation/physiology , Intracellular Signaling Peptides and Proteins/genetics
3.
J Mol Biol ; 433(15): 167051, 2021 07 23.
Article in English | MEDLINE | ID: mdl-33992693

ABSTRACT

The COVID-19 pandemic has triggered concerns about the emergence of more infectious and pathogenic viral strains. As a public health measure, efficient screening methods are needed to determine the functional effects of new sequence variants. Here we show that structural modeling of SARS-CoV-2 Spike protein binding to the human ACE2 receptor, the first step in host-cell entry, predicts many novel variant combinations with enhanced binding affinities. By focusing on natural variants at the Spike-hACE2 interface and assessing over 700 mutant complexes, our analysis reveals that high-affinity Spike mutations (including N440K, S443A, G476S, E484R, G502P) tend to cluster near known human ACE2 recognition sites (K31 and K353). These Spike regions are structurally flexible, allowing certain mutations to optimize interface interaction energies. Although most human ACE2 variants tend to weaken binding affinity, they can interact with Spike mutations to generate high-affinity double mutant complexes, suggesting variation in individual susceptibility to infection. Applying structural analysis to highly transmissible variants, we find that circulating point mutations S477N, E484K and N501Y form high-affinity complexes (~40% more than wild-type). By combining predicted affinities and available antibody escape data, we show that fast-spreading viral variants exploit combinatorial mutations possessing both enhanced affinity and antibody resistance, including S477N/E484K, E484K/N501Y and K417T/E484K/N501Y. Thus, three-dimensional modeling of the Spike/hACE2 complex predicts changes in structure and binding affinity that correlate with transmissibility and therefore can help inform future intervention strategies.


Subject(s)
Angiotensin-Converting Enzyme 2/chemistry , Angiotensin-Converting Enzyme 2/metabolism , COVID-19/transmission , Mutation , SARS-CoV-2/pathogenicity , Spike Glycoprotein, Coronavirus/chemistry , Spike Glycoprotein, Coronavirus/metabolism , Angiotensin-Converting Enzyme 2/genetics , Binding Sites , Computational Biology , Humans , Models, Molecular , Protein Binding , Protein Conformation , SARS-CoV-2/genetics , SARS-CoV-2/metabolism , Spike Glycoprotein, Coronavirus/genetics , Virus Internalization
4.
Nucleic Acids Res ; 47(10): 5307-5324, 2019 06 04.
Article in English | MEDLINE | ID: mdl-30941417

ABSTRACT

Hepatitis C virus (HCV) is a positive-sense RNA virus that interacts with the liver-specific microRNA, miR-122. miR-122 binds to two sites in the 5' untranslated region (UTR) and this interaction promotes HCV RNA accumulation, although the precise role of miR-122 in the HCV life cycle remains unclear. Using biophysical analyses and Selective 2' Hydroxyl Acylation analyzed by Primer Extension (SHAPE) we investigated miR-122 interactions with the 5' UTR. Our data suggest that miR-122 binding results in alteration of nucleotides 1-117 to suppress an alternative secondary structure and promote functional internal ribosomal entry site (IRES) formation. Furthermore, we demonstrate that two hAgo2:miR-122 complexes are able to bind to the HCV 5' terminus simultaneously and SHAPE analyses revealed further alterations to the structure of the 5' UTR to accommodate these complexes. Finally, we present a computational model of the hAgo2:miR-122:HCV RNA complex at the 5' terminus of the viral genome as well as hAgo2:miR-122 interactions with the IRES-40S complex that suggest hAgo2 is likely to form additional interactions with SLII which may further stabilize the HCV IRES. Taken together, our results support a model whereby hAgo2:miR-122 complexes alter the structure of the viral 5' terminus and promote formation of the HCV IRES.


Subject(s)
Argonaute Proteins/metabolism , Genome, Viral , Hepacivirus/genetics , Hepatitis C/virology , MicroRNAs/metabolism , 5' Untranslated Regions , Calorimetry , Humans , Internal Ribosome Entry Sites , Mutation , Nucleic Acid Conformation , Plasmids/metabolism , Protein Binding , RNA Stability , RNA, Viral/genetics , Software , Thermodynamics , Virus Replication
5.
Methods Mol Biol ; 1970: 43-64, 2019.
Article in English | MEDLINE | ID: mdl-30963487

ABSTRACT

Translational repression and degradation of transcripts by microRNAs (miRNAs) is mediated by a ribonucleoprotein complex called the miRNA-induced silencing complex (miRISC, or RISC). Advances in experimental determination of RISC structures have enabled detailed analysis and modeling of known miRNA targets, yet a full appreciation of the structural factors influencing target recognition remains a challenge, primarily because target recognition involves a combination of RNA-RNA and RNA-protein interactions that can vary greatly among different miRNA-target pairs. In this chapter, we review progress toward understanding the role of tertiary structure in miRNA target recognition using computational approaches to assemble RISC complexes at known targets and physics-based methods for computing target interactions. Using this framework to examine RISC structures and dynamics, we describe how the conformational flexibility of Argonautes plays an important role in accommodating the diversity of miRNA-target duplexes formed at canonical and noncanonical target sites. We then discuss applications of tertiary structure-based approaches to emerging topics, including the structural effects of SNPs in miRNA targets and cooperative interactions involving Argonaute-Argonaute complexes. We conclude by assessing the prospects for genome-scale modeling of RISC structures and modeling of higher-order Argonaute complexes associated with miRNA biogenesis, mRNA regulation, and other functions.


Subject(s)
Argonaute Proteins/chemistry , Computational Biology/methods , MicroRNAs/metabolism , RNA, Messenger/metabolism , RNA-Induced Silencing Complex/metabolism , Software , Binding Sites , Gene Expression Regulation , Humans , MicroRNAs/chemistry , MicroRNAs/genetics , Protein Structure, Tertiary , RNA, Messenger/chemistry , RNA, Messenger/genetics , RNA-Induced Silencing Complex/chemistry
6.
Nucleic Acids Res ; 45(12): 7212-7225, 2017 Jul 07.
Article in English | MEDLINE | ID: mdl-28482037

ABSTRACT

Although strong evidence supports the importance of their cooperative interactions, microRNA (miRNA)-binding sites are still largely investigated as functionally independent regulatory units. Here, a survey of alternative 3' UTR isoforms implicates a non-canonical seedless site in cooperative miRNA-mediated silencing. While required for target mRNA deadenylation and silencing, this site is not sufficient on its own to physically recruit miRISC. Instead, it relies on facilitating interactions with a nearby canonical seed-pairing site to recruit the Argonaute complexes. We further show that cooperation between miRNA target sites is necessary for silencing in vivo in the C. elegans embryo, and for the recruitment of the Ccr4-Not effector complex. Using a structural model of cooperating miRISCs, we identified allosteric determinants of cooperative miRNA-mediated silencing that are required for both embryonic and larval miRNA functions. Our results delineate multiple cooperative mechanisms in miRNA-mediated silencing and further support the consideration of target site cooperation as a fundamental characteristic of miRNA function.


Subject(s)
Caenorhabditis elegans/genetics , Gene Silencing , MicroRNAs/genetics , RNA-Induced Silencing Complex/chemistry , Transcription Factors/chemistry , 3' Untranslated Regions , Alternative Splicing , Animals , Argonaute Proteins/chemistry , Argonaute Proteins/genetics , Argonaute Proteins/metabolism , Base Sequence , Binding Sites , Caenorhabditis elegans/growth & development , Caenorhabditis elegans/metabolism , Embryo, Nonmammalian , MicroRNAs/metabolism , Models, Molecular , Nucleic Acid Conformation , RNA-Induced Silencing Complex/genetics , RNA-Induced Silencing Complex/metabolism , Transcription Factors/genetics , Transcription Factors/metabolism
7.
Nucleic Acids Res ; 43(20): 9613-25, 2015 Nov 16.
Article in English | MEDLINE | ID: mdl-26432829

ABSTRACT

Experimental studies have uncovered a variety of microRNA (miRNA)-target duplex structures that include perfect, imperfect and seedless duplexes. However, non-canonical binding modes from imperfect/seedless duplexes are not well predicted by computational approaches, which rely primarily on sequence and secondary structural features, nor have their tertiary structures been characterized because solved structures to date are limited to near perfect, straight duplexes in Argonautes (Agos). Here, we use structural modeling to examine the role of Ago dynamics in assembling viable eukaryotic miRNA-induced silencing complexes (miRISCs). We show that combinations of low-frequency, global modes of motion of Ago domains are required to accommodate RNA duplexes in model human and C. elegans Ago structures. Models of viable miRISCs imply that Ago adopts variable conformations at distinct target sites that generate distorted, imperfect miRNA-target duplexes. Ago's ability to accommodate a duplex is dependent on the region where structural distortions occur: distortions in solvent-exposed seed and 3'-end regions are less likely to produce steric clashes than those in the central duplex region. Energetic analyses of assembled miRISCs indicate that target recognition is also driven by favorable Ago-duplex interactions. Such structural insights into Ago loading and target recognition mechanisms may provide a more accurate assessment of miRNA function.


Subject(s)
Argonaute Proteins/chemistry , MicroRNAs/chemistry , RNA-Induced Silencing Complex/chemistry , Animals , Argonaute Proteins/metabolism , Bacterial Proteins/chemistry , Caenorhabditis elegans/genetics , Fungal Proteins/chemistry , Humans , MicroRNAs/metabolism , Models, Molecular , Protein Binding , Protein Conformation , RNA-Induced Silencing Complex/metabolism , Thermus thermophilus
8.
RNA ; 19(4): 539-51, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23417009

ABSTRACT

Current computational analysis of microRNA interactions is based largely on primary and secondary structure analysis. Computationally efficient tertiary structure-based methods are needed to enable more realistic modeling of the molecular interactions underlying miRNA-mediated translational repression. We incorporate algorithms for predicting duplex RNA structures, ionic strength effects, duplex entropy and free energy, and docking of duplex-Argonaute protein complexes into a pipeline to model and predict miRNA-target duplex binding energies. To ensure modeling accuracy and computational efficiency, we use an all-atom description of RNA and a continuum description of ionic interactions using the Poisson-Boltzmann equation. Our method predicts the conformations of two constructs of Caenorhabditis elegans let-7 miRNA-target duplexes to an accuracy of ∼3.8 Å root mean square distance of their NMR structures. We also show that the computed duplex formation enthalpies, entropies, and free energies for eight miRNA-target duplexes agree with titration calorimetry data. Analysis of duplex-Argonaute docking shows that structural distortions arising from single-base-pair mismatches in the seed region influence the activity of the complex by destabilizing both duplex hybridization and its association with Argonaute. Collectively, these results demonstrate that tertiary structure-based modeling of miRNA interactions can reveal structural mechanisms not accessible with current secondary structure-based methods.


Subject(s)
MicroRNAs/chemistry , Nucleic Acid Conformation , RNA, Helminth/chemistry , Animals , Argonaute Proteins/metabolism , Caenorhabditis elegans/genetics , Caenorhabditis elegans/metabolism , Energy Metabolism , Models, Molecular , Nuclear Magnetic Resonance, Biomolecular , Thermus thermophilus/metabolism
9.
Biophys J ; 99(8): 2587-96, 2010 Oct 20.
Article in English | MEDLINE | ID: mdl-20959100

ABSTRACT

Characterizing the ionic distribution around chromatin is important for understanding the electrostatic forces governing chromatin structure and function. Here we develop an electrostatic model to handle multivalent ions and compute the ionic distribution around a mesoscale chromatin model as a function of conformation, number of nucleosome cores, and ionic strength and species using Poisson-Boltzmann theory. This approach enables us to visualize and measure the complex patterns of counterion condensation around chromatin by examining ionic densities, free energies, shielding charges, and correlations of shielding charges around the nucleosome core and various oligonucleosome conformations. We show that: counterions, especially divalent cations, predominantly condense around the nucleosomal and linker DNA, unburied regions of histone tails, and exposed chromatin surfaces; ionic screening is sensitively influenced by local and global conformations, with a wide ranging net nucleosome core screening charge (56-100e); and screening charge correlations reveal conformational flexibility and interactions among chromatin subunits, especially between the histone tails and parental nucleosome cores. These results provide complementary and detailed views of ionic effects on chromatin structure for modest computational resources. The electrostatic model developed here is applicable to other coarse-grained macromolecular complexes.


Subject(s)
Chromatin/chemistry , Models, Molecular , Static Electricity , Chromatin/metabolism , DNA/chemistry , DNA/metabolism , Histones/chemistry , Histones/metabolism , Nucleosomes/chemistry , Nucleosomes/metabolism , Protein Conformation , Protein Folding , Salts/chemistry , Surface Properties
10.
Nucleic Acids Res ; 38(13): e139, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20448026

ABSTRACT

Although identification of active motifs in large random sequence pools is central to RNA in vitro selection, no systematic computational equivalent of this process has yet been developed. We develop a computational approach that combines target pool generation, motif scanning and motif screening using secondary structure analysis for applications to 10^12-10^14-sequence pools; large pool sizes are made possible using program redesign and supercomputing resources. We use the new protocol to search for aptamer and ribozyme motifs in pools up to experimental pool size (10^14 sequences). We show that motif scanning, structure matching and flanking sequence analysis, respectively, reduce the initial sequence pool by 6-8, 1-2 and 1 orders of magnitude, consistent with the rare occurrence of active motifs in random pools. The final yields match the theoretical yields from probability theory for simple motifs and overestimate experimental yields, which constitute lower bounds, for aptamers because screening analyses beyond secondary structure information are not considered systematically. We also show that designed pools using our nucleotide transition probability matrices can produce higher yields for RNA ligase motifs than random pools. Our methods for generating, analyzing and designing large pools can help improve RNA design via simulation of aspects of in vitro selection.


Subject(s)
RNA/chemistry , Sequence Analysis, RNA , Algorithms , Aptamers, Nucleotide/chemistry , Carbon-Oxygen Ligases/chemistry , Computational Biology , Nucleic Acid Conformation , RNA, Catalytic/chemistry
11.
J Biomed Sci ; 15(6): 697-705, 2008 Nov.
Article in English | MEDLINE | ID: mdl-18661287

ABSTRACT

Small nucleolar RNAs (snoRNAs) play a significant role in Prader-Willi Syndrome (PWS) and Angelman Syndrome (AS), which are genomic disorders resulting from deletions in the human chromosomal region 15q11-q13. To identify snoRNAs in the region, our computational study employs key motif features of C/D box snoRNAs and introduces a complementary RNA-RNA hybridization test. We identify three previously unknown methylation guide snoRNAs targeting ribosomal 18S and 28S RNAs, and two snoRNAs targeting serotonin receptor 2C mRNA. We show that the three snoRNA candidates likely possess methylation strands complementary to, and form stable complexes with, human ribosomal RNAs. Our screen also identifies 8 other snoRNA candidates that do not pass the rRNA-complementarity and/or hybridization tests. Two of these candidates have extensive sequence similarity to HBII-52, a snoRNA that regulates the alternative splicing of serotonin receptor 2C mRNA. Six out of our eleven candidate snoRNAs are also predicted by other existing methods.


Subject(s)
Angelman Syndrome/genetics , Computational Biology , Genome, Human/genetics , Prader-Willi Syndrome/genetics , RNA, Small Nucleolar/genetics , Algorithms , Base Sequence , Gene Order , Humans , Molecular Sequence Data , Nucleic Acid Conformation , Nucleic Acid Hybridization , Sequence Alignment
12.
Bioinform Biol Insights ; 2: 75-94, 2008 Mar 01.
Article in English | MEDLINE | ID: mdl-19812767

ABSTRACT

Recent studies of mammalian transcriptomes have identified numerous RNA transcripts that do not code for proteins; their identity, however, is largely unknown. Here we explore an approach based on sequence randomness patterns to discern different RNA classes. The relative z-score we use helps identify the known ncRNA class from the genome, intergene and intron classes. This leads us to a fractional ncRNA measure of putative ncRNA datasets which we model as a mixture of genuine ncRNAs and other transcripts derived from genomic, intergenic and intronic sequences. We use this model to analyze six representative datasets identified by the FANTOM3 project and two computational approaches based on comparative analysis (RNAz and EvoFold). Our analysis suggests fewer ncRNAs than estimated by DNA sequencing and comparative analysis, but the verity of our approach and its prediction requires more extensive experimental RNA data.

13.
Bioinformatics ; 23(21): 2959-60, 2007 Nov 01.
Article in English | MEDLINE | ID: mdl-17855416

ABSTRACT

SUMMARY: Our RNA-As-Graph-Pools (RagPools) web server offers a theoretical companion tool for RNA in vitro selection and related problems. Specifically, it suggests how to construct RNA sequence/structure pools with user-specified properties and assists in analyzing resulting distributions. This utility follows our recently developed approach for engineering sequence pools that links RNA sequence space regions with corresponding structural distributions via a 'mixing matrix' approach combined with a graph theory analysis of RNA secondary-structure space; the mixing matrix specifies nucleotide transition rates, and graph theory links sequences to simple graphical objects representing RNA motifs. The companion RagPools web server ('Designer' component) provides optimized starting sequences, mixing matrices and associated weights in response to a user-specified target pool structure distribution. In addition, RagPools ('Analyzer' component) analyzes the motif distribution of pools generated from user-specified starting sequences and mixing matrices. Thus, RagPools serves as a guide to researchers who aim to synthesize RNA pools with desired properties and/or experiment in silico with various designs by our approach. AVAILABILITY: The web server is accessible on the web at http://rubin2.biomath.nyu.edu


Subject(s)
Algorithms , Internet , RNA Probes/genetics , Sequence Alignment/methods , Sequence Analysis, RNA/methods , Software , Base Sequence , Molecular Sequence Data
14.
RNA ; 13(4): 478-92, 2007 Apr.
Article in English | MEDLINE | ID: mdl-17322501

ABSTRACT

Although in vitro selection technology is a versatile experimental tool for discovering novel synthetic RNA molecules, finding complex RNA molecules is difficult because most RNAs identified from random sequence pools are simple motifs, consistent with recent computational analysis of such sequence pools. Thus, enriching in vitro selection pools with complex structures could increase the probability of discovering novel RNAs. Here we develop an approach for engineering sequence pools that links RNA sequence space regions with corresponding structural distributions via a "mixing matrix" approach combined with a graph theory analysis. We define five classes of mixing matrices motivated by covariance mutations in RNA; these constructs define nucleotide transition rates and are applied to chosen starting sequences to yield specific nonrandom pools. We examine the coverage of sequence space as a function of the mixing matrix and starting sequence via clustering analysis. We show that, in contrast to random sequences, which are associated only with a local region of sequence space, our designed pools, including a structured pool for GTP aptamers, can target specific motifs. It follows that experimental synthesis of designed pools can benefit from using optimized starting sequences, mixing matrices, and pool fractions associated with each of our constructed pools as a guide. Automation of our approach could provide practical tools for pool design applications for in vitro selection of RNAs and related problems.


Subject(s)
Computational Biology , RNA/chemistry , Selection, Genetic , Algorithms , Base Sequence , Cluster Analysis , Computer Simulation , Conserved Sequence , In Vitro Techniques , Models, Chemical , Molecular Sequence Data , Nucleic Acid Conformation , Sequence Analysis, RNA
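The "mixing matrix" idea in the abstract above (a matrix of nucleotide transition rates applied to a chosen starting sequence to yield a nonrandom pool) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the matrix values, the starting sequence, and the function name `generate_pool` are hypothetical and are not taken from the paper, which defines five specific classes of mixing matrices that are not reproduced here.

```python
import random

NUCLEOTIDES = "ACGU"

# Hypothetical mixing matrix (not from the paper): each row gives the
# probabilities that a starting nucleotide is replaced by A, C, G or U.
MIXING_MATRIX = {
    "A": [0.70, 0.10, 0.10, 0.10],
    "C": [0.10, 0.70, 0.10, 0.10],
    "G": [0.10, 0.10, 0.70, 0.10],
    "U": [0.10, 0.10, 0.10, 0.70],
}


def generate_pool(start_seq, matrix, pool_size, seed=None):
    """Sample pool_size sequences by mutating each position of start_seq
    independently, using the matrix row of the starting nucleotide."""
    rng = random.Random(seed)
    pool = []
    for _ in range(pool_size):
        seq = "".join(
            rng.choices(NUCLEOTIDES, weights=matrix[nt])[0] for nt in start_seq
        )
        pool.append(seq)
    return pool


if __name__ == "__main__":
    # Hypothetical starting sequence chosen only to show the call.
    for s in generate_pool("GGGAAACCC", MIXING_MATRIX, pool_size=5, seed=0):
        print(s)
```

A diagonal-heavy matrix such as the one above keeps the pool close to the starting sequence, while a uniform matrix (all entries 0.25) would recover a fully random pool; analyzing the structures the pool folds into would need a separate secondary-structure predictor.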
15.
Nucleic Acids Res ; 33(18): 6057-69, 2005.
Article in English | MEDLINE | ID: mdl-16254081

ABSTRACT

Riboswitches and RNA interference are important emerging mechanisms found in many organisms to control gene expression. To enhance our understanding of such RNA roles, finding small regulatory motifs in genomes presents a challenge on a wide scale. Many simple functional RNA motifs have been found by in vitro selection experiments, which produce synthetic target-binding aptamers as well as catalytic RNAs, including the hammerhead ribozyme. Motivated by the prediction of Piganeau and Schroeder [(2003) Chem. Biol., 10, 103-104] that synthetic RNAs may have natural counterparts, we develop and apply an efficient computational protocol for identifying aptamer-like motifs in genomes. We define motifs from the sequence and structural information of synthetic aptamers, search for sequences in genomes that will produce motif matches, and then evaluate the structural stability and statistical significance of the potential hits. Our application to aptamers for streptomycin, chloramphenicol, neomycin B and ATP identifies 37 candidate sequences (in coding and non-coding regions) that fold to the target aptamer structures in bacterial and archaeal genomes. Further energetic screening reveals that several candidates exhibit energetic properties and sequence conservation patterns that are characteristic of functional motifs. Besides providing candidates for experimental testing, our computational protocol offers an avenue for expanding natural RNA's functional repertoire.


Subject(s)
Genomics/methods , RNA/chemistry , Sequence Analysis, RNA/methods , Algorithms , Base Sequence , Computational Biology/methods , Conserved Sequence , Data Interpretation, Statistical , Genome, Archaeal , Genome, Bacterial , Nucleic Acid Conformation , RNA/genetics , Thermodynamics
16.
RNA ; 11(6): 853-63, 2005 Jun.
Article in English | MEDLINE | ID: mdl-15923372

ABSTRACT

In vitro selection of functional RNAs from large random sequence pools has led to the identification of many ligand-binding and catalytic RNAs. However, the structural diversity in random pools is not well understood. Such an understanding is a prerequisite for designing sequence pools to increase the probability of finding complex functional RNA by in vitro selection techniques. Toward this goal, we have generated by computer five random pools of RNA sequences of length up to 100 nt to mimic experiments and characterized the distribution of associated secondary structural motifs using sets of possible RNA tree structures derived from graph theory techniques. Our results show that such random pools heavily favor simple topological structures: For example, linear stem-loop and low-branching motifs are favored rather than complex structures with high-order junctions, as confirmed by known aptamers. Moreover, we quantify the rise of structural complexity with sequence length and report the dominant class of tree motifs (characterized by vertex number) for each pool. These analyses show not only that random pools do not lead to a uniform distribution of possible RNA secondary topologies; they also point to avenues for designing pools with specific simple and complex structures in equal abundance, with the goal of broadening the range of functional RNAs discovered by in vitro selection. Specifically, the optimal RNA sequence pool length to identify a structure with x stems is 20x.


Subject(s)
Computational Biology , RNA/chemistry , Computer Simulation , Nucleic Acid Conformation
17.
Nucleic Acids Res ; 33(4): 1384-98, 2005.
Article in English | MEDLINE | ID: mdl-15745998

ABSTRACT

Modular architecture is a hallmark of RNA structures, implying structural, and possibly functional, similarity among existing RNAs. To systematically delineate the existence of smaller topologies within larger structures, we develop and apply an efficient RNA secondary structure comparison algorithm using a newly developed two-dimensional RNA graphical representation. Our survey of similarity among 14 pseudoknots and subtopologies within ribosomal RNAs (rRNAs) uncovers eight pairs of structurally related pseudoknots with non-random sequence matches and reveals modular units in rRNAs. Significantly, three structurally related pseudoknot pairs have functional similarities not previously known: one pair involves the 3' end of brome mosaic virus genomic RNA (PKB134) and the alternative hammerhead ribozyme pseudoknot (PKB173), both of which are replicase templates for viral RNA replication; the second pair involves structural elements for translation initiation and ribosome recruitment found in the viral internal ribosome entry site (PKB223) and the V4 domain of 18S rRNA (PKB205); the third pair involves 18S rRNA (PKB205) and viral tRNA-like pseudoknot (PKB134), which probably recruits ribosomes via structural mimicry and base complementarity. Additionally, we quantify the modularity of 16S and 23S rRNAs by showing that RNA motifs can be constructed from at least 210 building blocks. Interestingly, we find that the 5S rRNA and two tree modules within 16S and 23S rRNAs have similar topologies and tertiary shapes. These modules can be applied to design novel RNA motifs via build-up-like procedures for constructing sequences and folds.


Subject(s)
RNA, Ribosomal/chemistry , RNA/chemistry , Algorithms , Base Sequence , Computational Biology , Computer Graphics , Models, Molecular , Molecular Sequence Data , Nucleic Acid Conformation , RNA, Catalytic/chemistry , RNA, Ribosomal, 16S/chemistry , RNA, Ribosomal, 23S/chemistry , RNA, Viral/chemistry
18.
J Mol Biol ; 341(5): 1129-44, 2004 Aug 27.
Article in English | MEDLINE | ID: mdl-15321711

ABSTRACT

Because the functional repertoire of RNA molecules, like proteins, is closely linked to the diversity of their shapes, uncovering RNA's structural repertoire is vital for identifying novel RNAs, especially in genomic sequences. To help expand the limited number of known RNA families, we use graphical representation and clustering analysis of RNA secondary structures to predict novel RNA topologies and their abundance as a function of size. Representing the essential topological properties of RNA secondary structures as graphs enables enumeration, generation, and prediction of novel RNA motifs. We apply a probabilistic graph-growing method to construct the RNA structure space encompassing the topologies of existing and hypothetical RNAs and cluster all RNA topologies into two groups using topological descriptors and a standard clustering algorithm. Significantly, we find that nearly all existing RNAs fall into one group, which we refer to as "RNA-like"; we consider the other group "non-RNA-like". Our method predicts many candidates for novel RNA secondary topologies, some of which are remarkably similar to existing structures; interestingly, the centroid of the RNA-like group is the tmRNA fold, a pseudoknot having both tRNA-like and mRNA-like functions. Additionally, our approach allows estimation of the relative abundance of pseudoknot and other (e.g. tree) motifs using the "edge-cut" property of RNA graphs. This analysis suggests that pseudoknots dominate the RNA structure universe, representing more than 90% when the sequence length exceeds 120 nt; the predicted trend for <100 nt agrees with data for existing RNAs. Together with our predictions for novel "RNA-like" topologies, our analysis can help direct the design of functional RNAs and identification of novel RNA folds in genomes through an efficient topology-directed search, which grows much more slowly in complexity with RNA size compared to the traditional sequence-based search.


Subject(s)
Models, Theoretical , Nucleic Acid Conformation , RNA/chemistry , Algorithms , Base Sequence , Cluster Analysis , Models, Molecular
19.
BMC Bioinformatics ; 5: 88, 2004 Jul 06.
Article in English | MEDLINE | ID: mdl-15238163

ABSTRACT

BACKGROUND: The proliferation of structural and functional studies of RNA has revealed an increasing range of RNA's structural repertoire. Toward the objective of systematic cataloguing of RNA's structural repertoire, we have recently described the basis of a graphical approach for organizing RNA secondary structures, including existing and hypothetical motifs. DESCRIPTION: We now present an RNA motif database based on graph theory, termed RAG for RNA-As-Graphs, to catalogue and rank all theoretically possible, including existing, candidate and hypothetical, RNA secondary motifs. The candidate motifs are predicted using a clustering algorithm that classifies RNA graphs into RNA-like and non-RNA groups. All RNA motifs are filed according to their graph vertex number (RNA length) and ranked by topological complexity. CONCLUSIONS: RAG's quantitative cataloguing allows facile retrieval of all classes of RNA secondary motifs, assists identification of structural and functional properties of user-supplied RNA sequences, and helps stimulate the search for novel RNAs based on predicted candidate motifs.


Subject(s)
Computer Graphics/trends , Internet/trends , RNA/chemistry , Software , Computational Biology/methods , Databases, Nucleic Acid/trends , Software Design
20.
Biopolymers ; 73(3): 340-7, 2004 Feb 15.
Article in English | MEDLINE | ID: mdl-14755570

ABSTRACT

The various motifs of RNA molecules are closely related to their structural and functional properties. To better understand the nature and distributions of such structural motifs (i.e., paired and unpaired bases in stems, junctions, hairpin loops, bulges, and internal loops) and uncover characteristic features, we analyze the large 16S and 23S ribosomal RNAs of Escherichia coli. We find that the paired and unpaired bases in structural motifs have characteristic distribution shapes and ranges; for example, the frequency distribution of paired bases in stems declines linearly with the number of bases, whereas that for unpaired bases in junctions has a pronounced peak. Significantly, our survey reveals that the ratio of total (over the entire molecule) unpaired to paired bases (0.75) and the fraction of bases in stems (0.6), junctions (0.16), hairpin loops (0.12), and bulges/internal loops (0.12) are shared by 16S and 23S ribosomal RNAs, suggesting that natural RNAs may maintain certain proportions of bases in various motifs to ensure structural integrity. These findings may help in the design of novel RNAs and in the search (via constraints) for RNA-coding motifs in genomes, problems of intense current focus.


Subject(s)
RNA, Ribosomal/chemistry , RNA, Ribosomal/genetics , Base Sequence , Drug Design , Escherichia coli/chemistry , Escherichia coli/genetics , Genomics , Models, Molecular , Molecular Sequence Data , Nucleic Acid Conformation , RNA, Bacterial/chemistry , RNA, Bacterial/genetics , RNA, Ribosomal, 16S/chemistry , RNA, Ribosomal, 16S/genetics , RNA, Ribosomal, 23S/chemistry , RNA, Ribosomal, 23S/genetics , RNA, Transfer/chemistry
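The whole-molecule ratio of unpaired to paired bases mentioned in the abstract above could be computed from a secondary structure along the following lines. This is a minimal sketch that assumes the structure is supplied in dot-bracket notation (an assumption; the abstract does not say which representation was used), and the function name is hypothetical. It does not reproduce the paper's motif-level breakdown into stems, junctions, hairpin loops, and bulges/internal loops.

```python
def unpaired_to_paired_ratio(dot_bracket):
    """Return (number of unpaired bases) / (number of paired bases)
    for a structure in dot-bracket notation, where '.' marks an
    unpaired base and '(' or ')' marks a paired base."""
    paired = sum(ch in "()" for ch in dot_bracket)
    unpaired = dot_bracket.count(".")
    if paired == 0:
        raise ValueError("structure contains no paired bases")
    return unpaired / paired


if __name__ == "__main__":
    # Toy hairpin: 8 paired bases enclosing a 4-base loop.
    print(unpaired_to_paired_ratio("((((....))))"))  # prints 0.5
```

Applied to a full 16S or 23S rRNA structure, the same count would give the single-number summary that the abstract reports as shared between the two molecules.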