1.
RNA ; 30(1): 1-15, 2023 Dec 18.
Article in English | MEDLINE | ID: mdl-37903545

ABSTRACT

We present a novel framework enhancing the prediction of whether a novel lineage poses the threat of eventually dominating the viral population. The framework is based purely on genomic sequence data, without requiring prior established biological analysis. Its building blocks are sets of coevolving sites in the alignment (motifs), identified via coevolutionary signals. The collection of such motifs forms a relational structure over the polymorphic sites. Motifs are constructed using distances quantifying the coevolutionary coupling of pairs and manifest as coevolving clusters of sites. We present an approach to genomic surveillance based on this notion of relational structure. Our system will issue an alert regarding a lineage, based on its contribution to drastic changes in the relational structure. We then conduct a comprehensive retrospective analysis of the COVID-19 pandemic based on SARS-CoV-2 genomic sequence data in GISAID from October 2020 to September 2022, across 21 lineages and 27 countries with weekly resolution. We investigate the performance of this surveillance system in terms of its accuracy, timeliness, and robustness. Lastly, we study how well each lineage is classified by such a system.


Subjects
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , COVID-19/genetics , Pandemics , Retrospective Studies , Genomics
2.
J Math Biol ; 86(3): 34, 2023 01 25.
Article in English | MEDLINE | ID: mdl-36695949

ABSTRACT

We propose a novel mathematical paradigm for the study of genetic variation in sequence alignments. This framework originates from extending the notion of pairwise relations, upon which current analysis is based, to k-ary dissimilarity. This dissimilarity naturally leads to a generalization of simplicial complexes by endowing simplices with weights, compatible with the boundary operator. We introduce the notion of k-stances and dissimilarity complex, the former encapsulating arithmetic as well as topological structure expressing these k-ary relations. We study basic mathematical properties of dissimilarity complexes and show how this approach captures watershed moments of viral dynamics in the context of SARS-CoV-2 and H1N1 flu genomic data.


Subjects
COVID-19 , Influenza A Virus, H1N1 Subtype , Humans , SARS-CoV-2/genetics , Sequence Alignment
3.
NAR Genom Bioinform ; 4(2): lqac037, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35664803

ABSTRACT

tRNA fragments (tRFs) are small RNAs comparable in size and function to miRNAs. tRFs are generally Dicer independent, are found associated with Ago, and can repress expression of genes post-transcriptionally. Given that this expands the repertoire of small RNAs capable of post-transcriptional gene expression, it is important to predict tRF targets with confidence. Some attempts have been made to predict tRF targets, but they are limited in the scope of tRF classes used in prediction or limited in feature selection. We hypothesized that established miRNA target prediction features applied to tRFs through a random forest machine learning algorithm will immensely improve tRF target prediction. Using this approach, we show significant improvements in tRF target prediction for all classes of tRFs and validate our predictions in two independent cell lines. Finally, Gene Ontology analysis suggests that among the tRFs conserved between mice and humans, the predicted targets are enriched significantly in neuronal function, and we show this specifically for tRF-3009a. These improvements to tRF target prediction further our understanding of tRF function broadly across species and provide avenues for testing novel roles for tRFs in biology. We have created a publicly available website for the targets of tRFs predicted by tRForest.

4.
Algorithms Mol Biol ; 16(1): 7, 2021 Jun 01.
Article in English | MEDLINE | ID: mdl-34074304

ABSTRACT

BACKGROUND: Genotype-phenotype maps provide a meaningful filtration of sequence space and RNA secondary structures are particular such phenotypes. Compatible sequences, which satisfy the base-pairing constraints of a given RNA structure, play an important role in the context of neutral evolution. Sequences that are simultaneously compatible with two given structures (bicompatible sequences), are beacons in phenotypic transitions, induced by erroneously replicating populations of RNA sequences. RNA riboswitches, which are capable of expressing two distinct secondary structures without changing the underlying sequence, are one example of bicompatible sequences in living organisms. RESULTS: We present a full loop energy model Boltzmann sampler of bicompatible sequences for pairs of structures. The sequence sampler employs a dynamic programming routine whose time complexity is polynomial when assuming the maximum number of exposed vertices, [Formula: see text], is a constant. The parameter [Formula: see text] depends on the two structures and can be very large. We introduce a novel topological framework encapsulating the relations between loops that sheds light on the understanding of [Formula: see text]. Based on this framework, we give an algorithm to sample sequences with minimum [Formula: see text] on a particular topologically classified case as well as giving hints to the solution in the other cases. As a result, we utilize our sequence sampler to study some established riboswitches. CONCLUSION: Our analysis of riboswitch sequences shows that a pair of structures needs to satisfy key properties in order to facilitate phenotypic transitions and that pairs of random structures are unlikely to do so. Our analysis observes a distinct signature of riboswitch sequences, suggesting a new criterion for identifying native sequences and sequences subjected to evolutionary pressure. Our free software is available at: https://github.com/FenixHuang667/Bifold .
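
The compatibility notion in this abstract can be illustrated with a minimal Python sketch: a sequence is compatible with a structure when every base pair of the structure is occupied by two nucleotides that can pair, and bicompatible when this holds for two structures at once. The dot-bracket input format, the pairing rules (Watson-Crick plus G-U wobble), and all function names are assumptions made for illustration; none of them is taken from the Bifold software.

```python
# Assumed pairing rules: Watson-Crick pairs plus the G-U wobble pair.
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"),
                 ("C", "G"), ("G", "U"), ("U", "G")}


def parse_pairs(structure):
    """Return the sorted base pairs (i, j), i < j, of a dot-bracket string."""
    stack, pairs = [], []
    for position, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(position)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.append((stack.pop(), position))
        elif symbol != ".":
            raise ValueError("unexpected symbol: " + symbol)
    if stack:
        raise ValueError("unbalanced structure")
    return sorted(pairs)


def is_compatible(sequence, structure):
    """True if every base pair of the structure joins two pairable nucleotides."""
    if len(sequence) != len(structure):
        return False
    return all((sequence[i], sequence[j]) in ALLOWED_PAIRS
               for i, j in parse_pairs(structure))


def is_bicompatible(sequence, first, second):
    """True if the sequence is compatible with both structures."""
    return is_compatible(sequence, first) and is_compatible(sequence, second)
```

For instance, is_bicompatible("GGGAAACCC", "(((...)))", "((.....))") holds under these assumptions, because every pair in both structures joins a G with a C. The check says nothing about folding energies, which the sampler described in the abstract takes into account.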

5.
Hum Mol Genet ; 30(12): 1101-1110, 2021 06 09.
Article in English | MEDLINE | ID: mdl-33856031

ABSTRACT

The smallest genomic region causing Prader-Willi Syndrome (PWS) deletes the non-coding RNA SNORD116 cluster; however, the function of SNORD116 remains a mystery. Previous work in the field revealed the tantalizing possibility that expression of NHLH2, a gene previously implicated in both obesity and hypogonadism, was downregulated in PWS patients and differentiated stem cells. In silico RNA:RNA modeling identified several potential interaction domains between SNORD116 and NHLH2 mRNA. One of these interaction domains was highly conserved in most vertebrate NHLH2 mRNAs examined. A construct containing the Nhlh2 mRNA, including its 3'-UTR, linked to a c-myc tag was transfected into a hypothalamic neuron cell line in the presence and absence of exogenously-expressed Snord116. Nhlh2 mRNA expression was upregulated in the presence of Snord116 dependent on the length and type of 3'-UTR used on the construct. Furthermore, use of actinomycin D to stop new transcription in N29/2 cells demonstrated that the upregulation occurred through increased stability of the Nhlh2 mRNA in the 45 minutes immediately following transcription. In silico modeling also revealed that a single nucleotide variant (SNV) in the NHLH2 mRNA could reduce the predicted interaction strength of the NHLH2:SNORD116 diad. Indeed, use of an Nhlh2 mRNA construct containing this SNV significantly reduces the ability of Snord116 to increase Nhlh2 mRNA levels. For the first time, these data identify a motif and mechanism for SNORD116-mediated regulation of NHLH2, clarifying the mechanism by which deletion of the SNORD116 snoRNAs locus leads to PWS phenotypes.


Subjects
Basic Helix-Loop-Helix Transcription Factors/genetics , Prader-Willi Syndrome/genetics , Proto-Oncogene Proteins c-myc/genetics , RNA, Small Nucleolar/genetics , Animals , Gene Expression Regulation, Developmental , Humans , Hypothalamus/metabolism , Hypothalamus/pathology , Mice , Neurons/metabolism , Neurons/pathology , Prader-Willi Syndrome/metabolism , Prader-Willi Syndrome/pathology , RNA Processing, Post-Transcriptional/genetics , RNA Stability/genetics
6.
J Comput Biol ; 28(3): 248-256, 2021 03.
Article in English | MEDLINE | ID: mdl-33275493

ABSTRACT

COVID-19 is an infectious disease caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The viral genome is considered to be relatively stable and the mutations that have been observed and reported thus far are mainly focused on the coding region. This article provides evidence that macrolevel pandemic dynamics, such as social distancing, modulate the genomic evolution of SARS-CoV-2. This view complements the prevalent paradigm that microlevel observables control macrolevel parameters such as death rates and infection patterns. First, we observe differences in mutational signals for geospatially separated populations such as the prevalence of A23404G in CA versus NY and WA. We show that the feedback between macrolevel dynamics and the viral population can be captured employing a transfer entropy framework. Second, we observe complex interactions within mutational clades. Namely, when C14408T first appeared in the viral population, the frequency of A23404G spiked in the subsequent week. Third, we identify a noncoding mutation, G29540A, within the segment between the coding gene of the N protein and the ORF10 gene, which is largely confined to NY (>95%). These observations indicate that macrolevel sociobehavioral measures have an impact on the viral genomics and may be useful for the dashboard-like tracking of its evolution. Finally, despite the fact that SARS-CoV-2 is a genetically robust organism, our findings suggest that we are dealing with a high degree of adaptability. Owing to its ample spread, mutations of unusual form are observed and a high complexity of mutational interaction is exhibited.


Subjects
COVID-19/virology , Evolution, Molecular , Genome, Viral , SARS-CoV-2/genetics , COVID-19/epidemiology , COVID-19/transmission , Computational Biology , Gene Frequency , Health Behavior , Health Policy , Humans , Models, Genetic , Mutation , Pandemics , Phylogeny , Physical Distancing , SARS-CoV-2/pathogenicity , SARS-CoV-2/physiology , Spike Glycoprotein, Coronavirus/genetics
7.
RNA ; 25(12): 1592-1603, 2019 12.
Article in English | MEDLINE | ID: mdl-31548338

ABSTRACT

Genetic robustness, the preservation of evolved phenotypes against genotypic mutations, is one of the central concepts in evolution. In recent years a large body of work has focused on the origins, mechanisms, and consequences of robustness in a wide range of biological systems. In particular, research on ncRNAs studied the ability of sequences to maintain folded structures against single-point mutations. In these studies, the structure is merely a reference. However, recent work revealed evidence that structure itself contributes to the genetic robustness of ncRNAs. We follow this line of thought and consider sequence-structure pairs as the unit of evolution and introduce the spectrum of extended mutational robustness (EMR spectrum) as a measurement of genetic robustness. Our analysis of the miRNA let-7 family captures key features of structure-modulated evolution and facilitates the study of robustness against multiple-point mutations.


Subjects
MicroRNAs/genetics , Mutation/genetics , Animals , Evolution, Molecular , Genotype , Humans , Models, Genetic , Nucleic Acid Conformation , Phenotype
8.
J Comput Biol ; 26(3): 173-192, 2019 03.
Article in English | MEDLINE | ID: mdl-30653353

ABSTRACT

Recently, a framework considering RNA sequences and their RNA secondary structures as pairs led to some information-theoretic perspectives on how the semantics encoded in RNA sequences can be inferred. This pairing arises naturally from the energy model of RNA secondary structures. Fixing the sequence in the pairing produces the RNA energy landscape, whose partition function was discovered by McCaskill. Dually, fixing the structure induces the energy landscape of sequences. The latter has been considered originally for designing more efficient inverse folding algorithms and subsequently enhanced by facilitating the sampling of sequences. We present here a partition function of sequence/structure pairs, with endowed Hamming distance and base pair distance filtration. This partition function is an augmentation of the previously mentioned (dual) partition function. We develop an efficient dynamic programming routine to recursively compute the partition function with this double filtration. Our framework is capable of dealing with RNA secondary structures as well as 1-structures, where a 1-structure is an RNA pseudoknot structure consisting of "building blocks" of genus 0 or 1. In particular, 0-structures, consisting of only "building blocks" of genus 0, are exactly RNA secondary structures. The time complexity for calculating the partition function of 1-pairs, that is, sequence/structure pairs where the structures are 1-structures, is O(h^3 b^3 n^6), where h, b, n denote the Hamming distance, base pair distance, and sequence length, respectively. The time complexity for the partition function of 0-pairs is O(h^2 b^2 n^3).


Subjects
Algorithms , RNA Folding , RNA/chemistry , Sequence Analysis, RNA/methods , Molecular Dynamics Simulation , Nucleotide Motifs
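
The two distances used as filtrations in the abstract above can be sketched in a few lines of Python. This is only an illustration for secondary structures in dot-bracket notation; it assumes the common definition of base pair distance as the number of base pairs that occur in exactly one of the two structures, and the function names are hypothetical.

```python
def hamming_distance(first, second):
    """Number of positions at which two equal-length sequences differ."""
    if len(first) != len(second):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(first, second))


def base_pairs(structure):
    """Set of base pairs (i, j), i < j, of a dot-bracket secondary structure."""
    stack, pairs = [], set()
    for position, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(position)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.add((stack.pop(), position))
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def base_pair_distance(first, second):
    """Assumed definition: size of the symmetric difference of the pair sets."""
    return len(base_pairs(first) ^ base_pairs(second))
```

Under these assumptions, hamming_distance("GGGAAACCC", "GGCAAAGCC") is 2 and base_pair_distance("(((...)))", "((.....))") is 1.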
9.
J Comput Biol ; 25(11): 1179-1192, 2018 11.
Article in English | MEDLINE | ID: mdl-30133328

ABSTRACT

Recently, a framework considering ribonucleic acid (RNA) sequences and their RNA secondary structures as pairs has led to new information theoretic perspectives on how the semantics encoded in RNA sequences can be inferred. In this context, the pairing arises naturally from the energy model of RNA secondary structures. Fixing the sequence in the pairing produces the RNA energy landscape, whose partition function was discovered by McCaskill. Dually, fixing the structure induces the energy landscape of sequences. The latter has been considered for designing more efficient inverse folding algorithms. In this work, we present the dual partition function filtered by Hamming distance, together with a Boltzmann sampler using novel dynamic programming routines for the loop-based energy model. The time complexity of the algorithm is [Formula: see text], where [Formula: see text] are Hamming distance and sequence length, respectively, reducing the time complexity of samplers, reported in the literature by [Formula: see text]. We then present two applications, the first in the context of the evolution of natural sequence-structure pairs of microRNAs and the second in constructing neutral paths. The former studies the inverse folding rate (IFR) of sequence-structure pairs, filtered by Hamming distance, observing that such pairs evolve toward higher levels of robustness, that is, increasing IFR. The latter is an algorithm that constructs neutral paths: given two sequences in a neutral network, we employ the sampler to construct short paths connecting them, consisting of sequences all contained in the neutral network.


Subjects
Algorithms , Computational Biology/methods , RNA/chemistry , Base Sequence , Humans , Models, Molecular , Nucleic Acid Conformation
10.
Bioinformatics ; 33(3): 382-389, 2017 02 01.
Article in English | MEDLINE | ID: mdl-28171628

ABSTRACT

Motivation: DNA data is transcribed into single-stranded RNA, which folds into specific molecular structures. In this paper we pose the question to what extent sequence- and structure-information correlate. We view this correlation as structural semantics of sequence data that allows for a different interpretation than conventional sequence alignment. Structural semantics could enable us to identify more general embedded 'patterns' in DNA and RNA sequences. Results: We compute the partition function of sequences with respect to a fixed structure and connect this computation to the mutual information of a sequence-structure pair for RNA secondary structures. We present a Boltzmann sampler and obtain the a priori probability of specific sequence patterns. We present a detailed analysis for the three PDB-structures, 2JXV (hairpin), 2N3R (3-branch multi-loop) and 1EHZ (tRNA). We localize specific sequence patterns, contrast the energy spectrum of the Boltzmann sampled sequences versus those sequences that refold into the same structure and derive a criterion to identify native structures. We illustrate that there are multiple sequences in the partition function of a fixed structure, each having nearly the same mutual information, that are nevertheless poorly aligned. This indicates the possibility of the existence of relevant patterns embedded in the sequences that are not discoverable using alignments. Availability and Implementation: The source code is freely available at http://staff.vbi.vt.edu/fenixh/Sampler.zip Contact: duckcr@vbi.vt.edu Supplementary Information: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology/methods , Nucleic Acid Conformation , RNA/chemistry , Sequence Analysis, RNA/methods , Software , Algorithms , Probability , RNA/metabolism
11.
Math Biosci ; 282: 109-120, 2016 12.
Article in English | MEDLINE | ID: mdl-27773681

ABSTRACT

In this paper we introduce a novel, context-free grammar, RNAFeatures*, capable of generating any RNA structure including pseudoknot structures (pk-structure). We represent pk-structures as orientable fatgraphs, which naturally leads to a filtration by their topological genus. Within this framework, RNA secondary structures correspond to pk-structures of genus zero. RNAFeatures* acts on formal, arc-labeled RNA secondary structures, called λ-structures. λ-structures correspond one-to-one to pk-structures together with some additional information. This information consists of the specific rearrangement of the backbone, by which a pk-structure can be made cross-free. RNAFeatures* is an extension of the grammar for secondary structures and employs an enhancement by labelings of the symbols as well as the production rules. We discuss how to use RNAFeatures* to obtain a stochastic context-free grammar for pk-structures, using data of RNA sequences and structures. The induced grammar facilitates fast Boltzmann sampling and statistical analysis. As a first application, we present an O(n log(n)) runtime algorithm which samples pk-structures based on ninety tRNA sequences and structures from the Nucleic Acid Database (NDB). AVAILABILITY: the source code for simulation results is available at http://staff.vbi.vt.edu/fenixh/TPstructure.zip. The code is written in C and compiled by Xcode.


Subjects
Models, Theoretical , RNA/chemistry
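
The filtration by topological genus mentioned in the abstract above can be made concrete with a short Python sketch. It assumes the standard Euler characteristic relation for a diagram with one backbone, n arcs and r boundary components, 2 - 2g - r = 1 - n, and counts boundary components as the cycles of the permutation "jump to the arc partner, then step to the next endpoint". The function name and the arc-list input are hypothetical; this is not the RNAFeatures* grammar.

```python
def genus(arcs):
    """Topological genus of a one-backbone diagram given as arcs (i, j)."""
    endpoints = sorted(p for arc in arcs for p in arc)
    if len(set(endpoints)) != 2 * len(arcs):
        raise ValueError("each position may carry at most one arc")
    n = len(arcs)
    if n == 0:
        return 0  # a bare backbone is planar
    # Unpaired positions do not affect the genus, so relabel the arc
    # endpoints as 0 .. 2n-1 in backbone order.
    rank = {p: k for k, p in enumerate(endpoints)}
    partner = {}
    for i, j in arcs:
        partner[rank[i]] = rank[j]
        partner[rank[j]] = rank[i]
    size = 2 * n
    seen = [False] * size
    boundaries = 0
    for start in range(size):
        if seen[start]:
            continue
        boundaries += 1
        k = start
        while not seen[k]:
            seen[k] = True
            k = (partner[k] + 1) % size  # traverse one boundary component
    # From 2 - 2g - r = 1 - n it follows that g = (n + 1 - r) / 2.
    return (n + 1 - boundaries) // 2
```

With this sketch, two nested arcs such as [(0, 3), (1, 2)] give genus 0, in line with secondary structures having genus zero, while two crossing arcs [(0, 2), (1, 3)] give genus 1.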
12.
Math Biosci ; 270(Pt A): 57-65, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26482318

ABSTRACT

A topological RNA structure is derived by fattening the edges of a contact structure into ribbons. The shape of a topological RNA structure is obtained by collapsing the stacks of the structure into single arcs and by removing any arcs of length one, as well as isolated vertices. A shape contains the key topological information of the molecular conformation and for fixed topological genus there exist only finitely many such shapes. In this paper we compute the generating polynomial of shapes of fixed topological genus g. We furthermore derive an algorithm having O(g log g) time complexity uniformly generating shapes of genus g and discuss some applications in the context of databases of RNA pseudoknot structures.


Subjects
Nucleic Acid Conformation , RNA/chemistry , Algorithms , Databases, Nucleic Acid , Mathematical Concepts , Models, Molecular
13.
Math Biosci ; 245(2): 216-25, 2013 Oct.
Article in English | MEDLINE | ID: mdl-23900061

ABSTRACT

In this paper we present a sampling framework for RNA structures of fixed topological genus. We introduce a novel, linear time, uniform sampling algorithm for RNA structures of fixed topological genus g, for arbitrary g>0. Furthermore we develop a linear time sampling algorithm for RNA structures of fixed topological genus g that are weighted by a simplified, loop-based energy functional. For this process the partition function of the energy functional has to be computed once, which has O(n^2) time complexity.


Subjects
Nucleic Acid Conformation , RNA/chemistry , Algorithms , Computational Biology , Mathematical Concepts , Models, Molecular
14.
Algorithms Mol Biol ; 7(1): 28, 2012 Oct 22.
Article in English | MEDLINE | ID: mdl-23088372

ABSTRACT

BACKGROUND: We study the sparsification of dynamic programming based on folding algorithms of RNA structures. Sparsification is a method that significantly improves the computation of minimum free energy (mfe) RNA structures. RESULTS: We provide a quantitative analysis of the sparsification of a particular decomposition rule, Λ∗. This rule splits an interval of RNA secondary and pseudoknot structures of fixed topological genus. Key for quantifying sparsifications is the size of the so-called candidate sets. Here we assume mfe-structures to be specifically distributed (see Assumption 1) within arbitrary and irreducible RNA secondary and pseudoknot structures of fixed topological genus. We then present a combinatorial framework which allows by means of probabilities of irreducible sub-structures to obtain the expectation of the Λ∗-candidate set w.r.t. a uniformly random input sequence. We compute these expectations for arc-based energy models via energy-filtered generating functions (GF) in case of RNA secondary structures as well as RNA pseudoknot structures. Furthermore, for RNA secondary structures we also analyze a simplified loop-based energy model. Our combinatorial analysis is then compared to the expected number of Λ∗-candidates obtained from the folding mfe-structures. In case of the mfe-folding of RNA secondary structures with a simplified loop-based energy model our results imply that sparsification provides a significant, constant improvement of 91% (theory) to be compared to a 96% (experimental, simplified arc-based model) reduction. However, we do not observe a linear factor improvement. Finally, in case of the "full" loop-energy model we can report a reduction of 98% (experiment). CONCLUSIONS: Sparsification was initially attributed a linear factor improvement. This conclusion was based on the so-called polymer-zeta property, which stems from interpreting polymer chains as self-avoiding walks. Subsequent findings however reveal that the O(n) improvement is not correct. The combinatorial analysis presented here shows that, assuming a specific distribution (see Assumption 1) of mfe-structures within irreducible and arbitrary structures, the expected number of Λ∗-candidates is Θ(n^2). However, the constant reduction is quite significant, being in the range of 96%. We furthermore show an analogous result for the sparsification of the Λ∗-decomposition rule for RNA pseudoknotted structures of genus one. Finally we observe that the effect of sparsification is sensitive to the employed energy model.

15.
J Comput Biol ; 19(7): 928-43, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22731621

ABSTRACT

The topological filtration of interacting RNA complexes is studied, and the role of certain diagrams called irreducible shadows, which form suitable building blocks for more general structures, is analyzed. We prove that, for two interacting RNAs, called interaction structures, there exist for fixed genus only finitely many irreducible shadows. This implies that, for fixed genus, there are only finitely many classes of interaction structures. In particular, the simplest case of genus zero already provides the formalism for certain types of structures that occur in nature and are not covered by other filtrations. This case of genus zero interaction structures is already of practical interest, is studied here in detail, and is found to be expressed by a multiple context-free grammar that extends the usual one for RNA secondary structures. We show that, in O(n^6) time and O(n^4) space complexity, this grammar for genus zero interaction structures provides not only minimum free energy solutions but also the complete partition function and base pairing probabilities.


Subjects
Algorithms , Nucleic Acid Conformation , RNA/chemistry , Models, Theoretical , Thermodynamics
17.
Bioinformatics ; 27(8): 1076-85, 2011 Apr 15.
Article in English | MEDLINE | ID: mdl-21335320

ABSTRACT

MOTIVATION: Several dynamic programming algorithms for predicting RNA structures with pseudoknots have been proposed that differ dramatically from one another in the classes of structures considered. RESULTS: Here, we use the natural topological classification of RNA structures in terms of irreducible components that are embeddable in the surfaces of fixed genus. We add to the conventional secondary structures four building blocks of genus one in order to construct certain structures of arbitrarily high genus. A corresponding unambiguous multiple context-free grammar provides an efficient dynamic programming approach for energy minimization, partition function and stochastic sampling. It admits a topology-dependent parametrization of pseudoknot penalties that increases the sensitivity and positive predictive value of predicted base pairs by 10-20% compared with earlier approaches. More general models based on building blocks of higher genus are also discussed. AVAILABILITY: The source code of gfold is freely available at http://www.combinatorics.cn/cbpc/gfold.tar.gz. CONTACT: duck@santafe.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
RNA/chemistry , Algorithms , Base Pairing , Nucleic Acid Conformation , RNA/classification , Sequence Analysis, RNA , Software
18.
Bioinformatics ; 26(2): 175-81, 2010 Jan 15.
Article in English | MEDLINE | ID: mdl-19910305

ABSTRACT

MOTIVATION: It has been proven that the accessibility of the target sites has a critical influence on RNA-RNA binding in general, and on the specificity and efficiency of miRNAs and siRNAs in particular. Recently, O(N^6) time and O(N^4) space dynamic programming (DP) algorithms have become available that compute the partition function of RNA-RNA interaction complexes, thereby providing detailed insights into their thermodynamic properties. RESULTS: Modifications to the grammars underlying earlier approaches enable the calculation of interaction probabilities for any given interval on the target RNA. The computation of the 'hybrid probabilities' is complemented by a stochastic sampling algorithm that produces a Boltzmann weighted ensemble of RNA-RNA interaction structures. The sampling of k structures requires only negligible additional memory resources and runs in O(k·N^3). AVAILABILITY: The algorithms described here are implemented in C as part of the rip package. The source code of rip2 can be downloaded from http://www.combinatorics.cn/cbpc/rip.html and http://www.bioinf.uni-leipzig.de/Software/rip.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms , Models, Statistical , RNA/chemistry , Binding Sites , Computational Biology/methods , Databases, Genetic , MicroRNAs/chemistry , MicroRNAs/metabolism , Models, Molecular , Nucleic Acid Conformation , RNA/metabolism , RNA, Small Interfering/chemistry , RNA, Small Interfering/metabolism , Software
19.
J Comput Biol ; 16(11): 1549-75, 2009 Nov.
Article in English | MEDLINE | ID: mdl-19958083

ABSTRACT

In this article, we present the novel ab initio folding algorithm cross, which generates minimum free energy (mfe), 3-noncrossing, canonical RNA structures. Here an RNA structure is 3-noncrossing if it does not contain three or more mutually crossing arcs, and canonical if each of its stacks has size greater than or equal to two. Our notion of mfe-structure is based on a specific concept of pseudoknots and respective loop-based energy parameters. The algorithm decomposes into three subroutines: first the inductive construction of motifs and their associated shadows, second the generation of the (rooted) skeleta-trees and third the saturation of the skeleta via context dependent dynamic programming routines.


Subjects
Nucleic Acid Conformation , RNA/chemistry , Algorithms , RNA, Transfer/chemistry , Regulatory Sequences, Ribonucleic Acid/genetics
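
The two structural conditions defined in the abstract above can be checked directly on a list of arcs. The following Python sketch is a brute-force illustration of the definitions only, not the cross folding algorithm; the function names and the representation of a structure as a list of arcs (i, j) are assumptions. The canonical check uses the observation that every stack has size at least two exactly when no arc is isolated, that is, when every arc (i, j) has a parallel neighbour (i-1, j+1) or (i+1, j-1).

```python
from itertools import combinations


def crosses(first, second):
    """True if two arcs cross, i.e. their endpoints interleave as i < k < j < l."""
    (i, j), (k, l) = sorted((tuple(sorted(first)), tuple(sorted(second))))
    return i < k < j < l


def is_3_noncrossing(arcs):
    """True if no three arcs are mutually crossing (cubic-time brute force)."""
    return not any(
        crosses(a, b) and crosses(a, c) and crosses(b, c)
        for a, b, c in combinations(arcs, 3)
    )


def is_canonical(arcs):
    """True if every arc belongs to a stack of size at least two."""
    arc_set = {tuple(sorted(arc)) for arc in arcs}
    return all(
        (i - 1, j + 1) in arc_set or (i + 1, j - 1) in arc_set
        for i, j in arc_set
    )
```

For example, is_3_noncrossing([(0, 3), (1, 4), (2, 5)]) returns False because the three arcs cross pairwise, and is_canonical([(0, 9), (1, 8), (3, 6)]) returns False because the arc (3, 6) is isolated.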
20.
Bioinformatics ; 25(20): 2646-54, 2009 Oct 15.
Article in English | MEDLINE | ID: mdl-19671692

ABSTRACT

MOTIVATION: The RNA-RNA interaction problem (RIP) consists in finding the energetically optimal structure of two RNA molecules that bind to each other. The standard model allows secondary structures in both partners as well as additional base pairs between the two RNAs subject to certain restrictions that ensure that RIP is solvable by a polynomial time dynamic programming algorithm. RNA-RNA binding, like RNA folding, is typically not dominated by the ground state structure. Instead, a large ensemble of alternative structures contributes to the interaction thermodynamics. RESULTS: We present here an O(N^6) time and O(N^4) space dynamic programming algorithm for computing the full partition function for RIP which is based on the combinatorial notion of 'tight structures'. Albeit equivalent to recent work by H. Chitsaz and collaborators, our approach in addition provides a full-fledged computation of the base pairing probabilities, which relies on the notion of a decomposition tree for joint structures. In practise, our implementation is efficient enough to investigate, for instance, the interactions of small bacterial RNAs and their target mRNAs. AVAILABILITY: The program rip is implemented in C. The source code is available for download from http://www.combinatorics.cn/cbpc/rip.html and http://www.bioinf.uni-leipzig.de/Software/rip.html.


Subjects
Base Pairing , Computational Biology/methods , RNA/chemistry , Algorithms , Databases, Genetic , Nucleic Acid Conformation , RNA/metabolism , Thermodynamics