Results 1 - 20 of 28
1.
Nucleic Acids Res ; 52(3): e14, 2024 Feb 09.
Article in English | MEDLINE | ID: mdl-38038257

ABSTRACT

Ribonucleic acid (RNA) is an essential molecule in a wide range of biological functions. In 1990, McCaskill introduced a dynamic programming algorithm for computing the partition function of an RNA sequence. McCaskill's algorithm is widely used today for understanding the thermodynamic properties of RNA. In this work, we introduce a generalization of McCaskill's algorithm that is well-defined over continuous inputs. Crucially, this enables us to implement an end-to-end differentiable partition function calculation. The derivative can be computed with respect to the input, or to any other fixed values, such as the parameters of the energy model. This builds a bridge between RNA thermodynamics and the tools of differentiable programming including deep learning as it enables the partition function to be incorporated directly into any end-to-end differentiable pipeline. To demonstrate the effectiveness of our new approach, we tackle the inverse folding problem directly using gradient optimization. We find that using the gradient to optimize the sequence directly is sufficient to arrive at sequences with a high probability of folding into the desired structure. This indicates that the gradients we compute are meaningful.
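
As a rough illustration of the kind of dynamic programming recursion the abstract refers to, the sketch below computes a partition function over nested secondary structures with a toy energy model in which every allowed base pair contributes the same Boltzmann weight. It is neither the energy model nor the continuous generalization described in the paper; the function name, the pair set, the pair energy, the value of kT and the minimum loop length are all assumptions chosen for illustration.

```python
import math

# Toy assumptions (not from the paper): canonical and wobble pairs only,
# one fixed energy per pair, and at least MIN_LOOP unpaired bases in a hairpin.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3


def partition_function(seq, pair_energy=-1.0, kT=0.6):
    """Sum of Boltzmann weights over all nested structures of seq."""
    n = len(seq)
    if n == 0:
        return 1.0
    weight = math.exp(-pair_energy / kT)  # Boltzmann factor of a single pair
    # z[i][j] holds the partition function of the subsequence seq[i..j].
    z = [[1.0] * n for _ in range(n)]

    def sub(i, j):
        # An empty interval has exactly one structure: nothing paired.
        return z[i][j] if i <= j else 1.0

    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            total = sub(i, j - 1)  # case 1: base j is unpaired
            for k in range(i, j - MIN_LOOP):  # case 2: base j pairs with base k
                if (seq[k], seq[j]) in PAIRS:
                    total += sub(i, k - 1) * sub(k + 1, j - 1) * weight
            z[i][j] = total
    return z[0][n - 1]
```

With pair_energy set to 0 every structure has weight 1, so the function simply counts the admissible structures; for example, "GAAAC" has two (fully unpaired, or the single G-C pair closing a three-base loop).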


Subject(s)
Algorithms , RNA , RNA/genetics , RNA/chemistry , Nucleic Acid Conformation , Thermodynamics
2.
PLoS One ; 18(4): e0284532, 2023.
Article in English | MEDLINE | ID: mdl-37058526

ABSTRACT

Charcot-Marie-Tooth disease (CMT) is the most common inherited peripheral polyneuropathy in humans, and its subtypes are linked to mutations in dozens of different genes, including the gene coding for ganglioside-induced differentiation-associated protein 1 (GDAP1). The main GDAP1-linked CMT subtypes are the demyelinating CMT4A and the axonal CMT2K. Over a hundred different missense CMT mutations in the GDAP1 gene have been reported. However, despite implications for mitochondrial fission and fusion, cytoskeletal interactions, and response to reactive oxygen species, the etiology of GDAP1-linked CMT is poorly understood at the protein level. Based on earlier structural data, CMT-linked mutations could affect intramolecular interaction networks within the GDAP1 protein. We carried out structural and biophysical analyses on several CMT-linked GDAP1 protein variants and describe new crystal structures of the autosomal recessive R120Q and the autosomal dominant A247V and R282H GDAP1 variants. These mutations reside in the structurally central helices α3, α7, and α8. In addition, solution properties of the CMT mutants R161H, H256R, R310Q, and R310W were analysed. All disease variant proteins retain close to normal structure and solution behaviour. All mutations, apart from those affecting Arg310 outside the folded GDAP1 core domain, decreased thermal stability. In addition, a bioinformatics analysis was carried out to shed light on the conservation and evolution of GDAP1, which is an outlier member of the GST superfamily. GDAP1-like proteins branched early from the larger group of GSTs. Phylogenetic calculations could not resolve the exact early chronology, but the evolution of GDAP1 is roughly as old as the splits of archaea from other kingdoms. Many known CMT mutation sites involve conserved residues or interact with them. A central role for the α6-α7 loop, within a conserved interaction network, is identified for GDAP1 protein stability. To conclude, we have expanded the structural analysis on GDAP1, strengthening the hypothesis that alterations in conserved intramolecular interactions may alter GDAP1 stability and function, eventually leading to mitochondrial dysfunction, impaired protein-protein interactions, and neuronal degeneration.


Subject(s)
Charcot-Marie-Tooth Disease , Humans , Mutation , Nerve Tissue Proteins/metabolism , Phylogeny , Protein Stability
3.
Int J Mol Sci ; 22(15), 2021 Jul 29.
Article in English | MEDLINE | ID: mdl-34360882

ABSTRACT

The human natural killer (HNK-1) carbohydrate plays important roles during nervous system development, regeneration after trauma and synaptic plasticity. Four proteins have been identified as receptors for HNK-1: the laminin adhesion molecule, high-mobility group box 1 and 2 (also called amphoterin) and cadherin 2 (also called N-cadherin). Because of HNK-1's importance, we asked whether additional receptors for HNK-1 exist and whether the four identified proteins share any similarity in their primary structures. A set of 40,000 sequences homologous to the known HNK-1 receptors was selected and used for large-scale sequence alignments and motif searches. Although there are conserved regions and highly conserved sites within each of these protein families, there was no sequence similarity or conserved sequence motifs found to be shared by all families. Since HNK-1 receptors have not been compared regarding binding constants and since it is not known whether the sulfated or non-sulfated part of HNK-1 represents the structurally crucial ligand, the receptors are more heterogeneous in primary structure than anticipated, possibly involving different receptor or ligand regions. We thus conclude that the primary protein structure may not be the sole determinant for a bona fide HNK-1 receptor, rendering receptor structure more complex than originally assumed.


Subject(s)
CD57 Antigens/metabolism , Cadherins/metabolism , HMGB1 Protein/metabolism , HMGB2 Protein/metabolism , Laminin/metabolism , Oligosaccharides/metabolism , Amino Acid Sequence , Animals , Binding Sites , CD57 Antigens/chemistry , Cadherins/chemistry , HMGB1 Protein/chemistry , HMGB2 Protein/chemistry , Humans , Laminin/chemistry , Ligands , Nerve Regeneration/physiology , Neuronal Plasticity/physiology , Oligosaccharides/chemistry , Protein Binding , Protein Domains
4.
Nat Commun ; 12(1): 3850, 2021 06 22.
Article in English | MEDLINE | ID: mdl-34158503

ABSTRACT

Three stop codons (UAA, UAG and UGA) terminate protein synthesis and are almost exclusively recognized by release factors. Here, we design de novo transfer RNAs (tRNAs) that efficiently decode UGA stop codons in Escherichia coli. The tRNA designs harness various functionally conserved aspects of sense-codon decoding tRNAs. Optimization within the TΨC-stem to stabilize binding to the elongation factor displays the most potent effect in enhancing suppression activity. We determine the structure of the ribosome in a complex with the designed tRNA bound to a UGA stop codon in the A site at 2.9 Å resolution. In the context of the suppressor tRNA, the conformation of the UGA codon resembles that of a sense-codon rather than when canonical translation termination release factors are bound, suggesting conformational flexibility of the stop codons dependent on the nature of the A-site ligand. The systematic analysis, combined with structural insights, provides a rationale for targeted repurposing of tRNAs to correct devastating nonsense mutations that introduce a premature stop codon.


Subject(s)
Codon, Nonsense/genetics , Codon, Terminator/genetics , Escherichia coli/genetics , Protein Biosynthesis/genetics , RNA, Transfer/genetics , Ribosomes/genetics , Base Sequence , Binding Sites/genetics , Cryoelectron Microscopy , Escherichia coli/metabolism , Models, Molecular , Nucleic Acid Conformation , Peptide Termination Factors/genetics , Peptide Termination Factors/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , RNA, Transfer/chemistry , RNA, Transfer/metabolism , Ribosomes/metabolism , Ribosomes/ultrastructure , Suppression, Genetic
5.
PLoS One ; 16(5): e0251459, 2021.
Article in English | MEDLINE | ID: mdl-33989344

ABSTRACT

Synaptic plasticity is vital for brain function and memory formation. One of the key proteins in long-term synaptic plasticity and memory is the activity-regulated cytoskeleton-associated protein (Arc). Mammalian Arc forms virus-like capsid structures in a process requiring the N-terminal domain and contains two C-terminal lobes that are structural homologues to retroviral capsids. Drosophila has two isoforms of Arc, dArc1 and dArc2, with low sequence similarity to mammalian Arc, but lacking a large N-terminal domain. Both dArc isoforms are related to the Ty3/gypsy retrotransposon capsid, consisting of N- and C-terminal lobes. Structures of dArc1, as well as capsids formed by both dArc isoforms, have been recently determined. We carried out structural characterization of the four individual dArc lobe domains. As opposed to the corresponding mammalian Arc lobe domains, which are monomeric, the dArc lobes were all oligomeric in solution, indicating a strong propensity for homophilic interactions. A truncated N-lobe from dArc2 formed a domain-swapped dimer in the crystal structure, resulting in a novel dimer interaction that could be relevant for capsid assembly or other dArc functions. This domain-swapped structure resembles the dimeric protein C of flavivirus capsids, as well as the structure of histone dimers, domain-swapped transcription factors, and membrane-interacting BAK domains. The strong oligomerization properties of the isolated dArc lobe domains explain the ability of dArc to form capsids in the absence of any large N-terminal domain, in contrast to the mammalian protein.


Subject(s)
Cytoskeletal Proteins/chemistry , Drosophila/chemistry , Nerve Tissue Proteins/chemistry , Amino Acid Sequence , Animals , Crystallography, X-Ray , Models, Molecular , Protein Conformation , Protein Domains , Protein Multimerization
6.
J Chem Theory Comput ; 15(5): 3402-3409, 2019 May 14.
Article in English | MEDLINE | ID: mdl-31002506

ABSTRACT

The NAST force field is a popular tool for modeling RNA and is typical of low-resolution approaches. Unfortunately, some combinations of bond and dihedral angles can reach cliffs on the energy landscape which lead to numerical disasters. We describe changes to the formulation (NAST improved, NASTI) which smooth the dihedral energy term when neighboring angles become flat. We also improved the fit to experimental structures by replacing the harmonic term for the backbone angles with spline functions and using a more sophisticated approach to calculate energies for fragments that span both helix and loop regions. A newer, larger set of structures was used for the parametrization. The new formulation can be run for millions of steps without a thermostat, whereas NAST routinely suffers numerical catastrophes. Simulations with NASTI showed no decrease in the quality of the structures as reflected by slightly better GDT-TS scores and, in three of the five cases, marginally better RMSD values when compared to the crystal structures.


Subject(s)
Computer Simulation , Nucleic Acid Conformation , RNA/chemistry , Thermodynamics
7.
Front Microbiol ; 7: 1010, 2016.
Article in English | MEDLINE | ID: mdl-27446048

ABSTRACT

The two-domain protein PduO, involved in 1,2-propanediol utilization in the pathogenic Gram-negative bacterium Salmonella enterica, is an ATP:Cob(I)alamin adenosyltransferase, but this is a function of the N-terminal domain alone. The role of its C-terminal domain (PduOC) is, however, unknown. In this study, comparative growth assays with a set of Salmonella mutant strains showed that this domain is necessary for effective in vivo catabolism of 1,2-propanediol. It was also shown that isolated, recombinantly-expressed PduOC binds heme in vivo. The structure of PduOC co-crystallized with heme was solved (1.9 Å resolution) showing an octameric assembly with four heme moieties. The four heme groups are highly solvent-exposed and the heme iron is hexa-coordinated with bis-His ligation by histidines from different monomers. Static light scattering confirmed the octameric assembly in solution, but a mutation of the heme-coordinating histidine caused dissociation into dimers. Isothermal titration calorimetry using the PduOC apoprotein showed strong heme binding (Kd = 1.6 × 10^-7 M). Biochemical experiments showed that the absence of the C-terminal domain in PduO did not affect adenosyltransferase activity in vitro. The evidence suggests that PduOC:heme plays an important role in the set of cobalamin transformations required for effective catabolism of 1,2-propanediol. Salmonella PduO is one of the rare proteins which binds the redox-active metabolites heme and cobalamin, and the heme-binding mode of the C-terminal domain differs from that in other members of this protein family.
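
To put the reported dissociation constant in more familiar terms, the short sketch below converts a Kd into a standard binding free energy using the textbook relation ΔG° = RT ln(Kd / 1 M). The abstract does not state the measurement temperature, so the 298.15 K default is an assumption, and the function name is hypothetical.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol*K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy in kJ/mol from a dissociation constant.

    kd_molar is expressed relative to a 1 M standard state. The default
    temperature is an assumption, not a value taken from the abstract.
    """
    return R * temperature_k * math.log(kd_molar) / 1000.0


# Kd from the abstract: 1.6e-7 M
print(round(binding_free_energy(1.6e-7), 1))
```

Under the assumed temperature this corresponds to roughly -38.8 kJ/mol, the range usually described as tight binding.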

8.
Brain Res ; 1641(Pt A): 64-78, 2016 06 15.
Article in English | MEDLINE | ID: mdl-26367445

ABSTRACT

2',3'-cyclic nucleotide 3'-phosphodiesterase (CNPase) is an abundant membrane-associated enzyme within the vertebrate myelin sheath. While the physiological function of CNPase still remains to be characterized in detail, it is known - in addition to its in vitro enzymatic activity - to interact with other proteins, small molecules, and membrane surfaces. From an evolutionary point of view, it can be deduced that CNPase is not restricted to myelin-forming cells or vertebrate tissues. Its evolution has involved gene fusion, addition of other small segments with distinct functions, such as membrane attachment, and possibly loss of function at the polynucleotide kinase-like domain. Currently, it is unclear whether the enzymatic function of the conserved phosphodiesterase domain in vertebrate myelin has a physiological role, or if CNPase could actually function - like many other classical myelin proteins - in a more structural role. This article is part of a Special Issue entitled SI: Myelin Evolution.


Subject(s)
2',3'-Cyclic Nucleotide 3'-Phosphodiesterase/genetics , 2',3'-Cyclic Nucleotide 3'-Phosphodiesterase/metabolism , Biological Evolution , Animals , Humans
10.
PLoS Comput Biol ; 11(9): e1004511, 2015.
Article in English | MEDLINE | ID: mdl-26393792

ABSTRACT

A lipidome is the set of lipids in a given organism, cell or cell compartment and this set reflects the organism's synthetic pathways and interactions with its environment. Recently, lipidomes of biological model organisms and cell lines were published and the number of functional studies of lipids is increasing. In this study we propose a homology metric that can quantify systematic differences in the composition of a lipidome. Algorithms were developed to 1. consistently convert lipid structures into SMILES, 2. determine structural similarity between molecular species and 3. describe a lipidome in a chemical space model. We tested lipid structure conversion and structure similarity metrics, in detail, using sets of isomeric ceramide molecules and chemically related phosphatidylinositols. Template-based SMILES showed the best properties for representing lipid-specific structural diversity. We also show that sequence analysis algorithms are best suited to calculate distances between such template-based SMILES and we adjudged the Levenshtein distance as the best choice for quantifying structural changes. When all lipid molecules of the LIPIDMAPS structure database were mapped in chemical space, they automatically formed clusters corresponding to conventional chemical families. Accordingly, we mapped a pair of lipidomes into the same chemical space and determined the degree of overlap by calculating the Hausdorff distance. We named this metric the 'Lipidome jUXtaposition (LUX) score'. First, we tested this approach for estimating the lipidome similarity on four yeast strains with known genetic alterations in fatty acid synthesis. We show that the LUX score reflects the genetic relationship and growth temperature better than conventional methods although the score is based solely on lipid structures. Next, we applied this metric to high-throughput data of larval tissue lipidomes of Drosophila. This showed that the LUX score is sufficient to cluster tissues and determine the impact of nutritional changes in an unbiased manner, despite the limited information on the underlying structural diversity of each lipidome. This study is the first effort to define a lipidome homology metric based on structures that will enrich functional association of lipids in a similar manner to measures used in genetics. Finally, we discuss the significance of the LUX score to perform comparative lipidome studies across species borders.
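
The two distance measures named in the abstract can be sketched in a few lines. The example below is a simplification made for illustration: it applies the Hausdorff distance directly to sets of SMILES strings with the Levenshtein distance as the underlying metric, whereas the abstract describes mapping lipids into a chemical space model first. The fatty acid SMILES are ordinary (not template-based) strings picked as toy data, and the function names are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def hausdorff(set_a, set_b, dist):
    """Symmetric Hausdorff distance between two non-empty collections."""
    def directed(xs, ys):
        # Largest distance from any x to its nearest neighbour in ys.
        return max(min(dist(x, y) for y in ys) for x in xs)
    return max(directed(set_a, set_b), directed(set_b, set_a))


# Toy "lipidomes": saturated fatty acids with 8, 10 and 12 carbons.
def acid(n_carbons):
    return "C" * n_carbons + "(=O)O"


lipidome_a = [acid(8), acid(10)]
lipidome_b = [acid(8), acid(12)]
print(hausdorff(lipidome_a, lipidome_b, levenshtein))
```

A small value means every lipid in one set has a structurally close counterpart in the other; identical sets give 0.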


Subject(s)
Computational Biology/methods , Lipids/chemistry , Models, Biological , Models, Molecular , Algorithms , Animals , Drosophila , Isomerism , Lipids/analysis , Organ Specificity
11.
Cell Mol Life Sci ; 72(21): 4193-203, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26012696

ABSTRACT

Gene fusion is a common mechanism of protein evolution that has mainly been discussed in the context of multidomain or symmetric proteins. Less is known about fusion of ancestral genes to produce small single-domain proteins. Here, we show with a domain-swapped mutant Plasmodium profilin that this small, globular, apparently single-domain protein consists of two foldons. The separation of binding sites for different protein ligands in the two halves suggests evolution via an ancient gene fusion event, analogous to the formation of multidomain proteins. Finally, the two fragments can be assembled together after expression as two separate gene products. The possibility to engineer both domain-swapped dimers and half-profilins that can be assembled back to a full profilin provides perspectives for engineering of novel protein folds, e.g., with different scaffolding functions.


Subject(s)
Evolution, Molecular , Gene Fusion , Plasmodium falciparum/chemistry , Profilins/chemistry , Profilins/genetics , Circular Dichroism , Crystallography, X-Ray , Exons , Introns , Models, Molecular , Mutation , Protein Folding , Protein Multimerization , Protein Subunits/chemistry , Protein Subunits/genetics , Protozoan Proteins/chemistry , Protozoan Proteins/genetics , Scattering, Small Angle , X-Ray Diffraction
12.
J Biol Chem ; 289(49): 34214-28, 2014 Dec 05.
Article in English | MEDLINE | ID: mdl-25342754

ABSTRACT

The extracellular protein HbpS from Streptomyces reticuli interacts with iron ions and heme. It also acts in concert with the two-component sensing system SenS-SenR in response to oxidative stress. Sequence comparisons suggested that the protein may bind a cobalamin. UV-visible spectroscopy confirmed binding (Kd = 34 µM) to aquo-cobalamin (H2OCbl(+)) but not to other cobalamins. Competition experiments with the H2OCbl(+)-coordinating ligand CN(-) and comparison of mutants identified a histidine residue (His-156) that coordinates the cobalt ion of H2OCbl(+) and substitutes for water. HbpS·Cobalamin lacks the Asp-X-His-X-X-Gly motif seen in some cobalamin binding enzymes. Preliminary tests showed that a related HbpS protein from a different species also binds H2OCbl(+). Furthermore, analyses of HbpS-heme binding kinetics are consistent with the role of HbpS as a heme-sensor and suggested a role in heme transport. Given the high occurrence of HbpS-like sequences among Gram-positive and Gram-negative bacteria, our findings suggest a great functional versatility among these proteins.


Subject(s)
Bacterial Proteins/chemistry , Carrier Proteins/chemistry , Heme/chemistry , Hemeproteins/chemistry , Soil Microbiology , Streptomyces/chemistry , Vitamin B 12/analogs & derivatives , Amino Acid Sequence , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Binding Sites , Binding, Competitive , Biological Transport , Carrier Proteins/genetics , Carrier Proteins/metabolism , Escherichia coli/genetics , Escherichia coli/metabolism , Evolution, Molecular , Gene Expression , Heme/metabolism , Heme-Binding Proteins , Hemeproteins/genetics , Hemeproteins/metabolism , Histidine/chemistry , Histidine/metabolism , Iron/metabolism , Kinetics , Models, Molecular , Molecular Sequence Data , Mutation , Protein Binding , Recombinant Proteins/chemistry , Recombinant Proteins/genetics , Recombinant Proteins/metabolism , Sequence Alignment , Streptomyces/genetics , Streptomyces/metabolism , Structural Homology, Protein , Vitamin B 12/chemistry , Vitamin B 12/metabolism
13.
Algorithms Mol Biol ; 9: 18, 2014.
Article in English | MEDLINE | ID: mdl-25053971

ABSTRACT

One can search for messages in the digits of π or a Kazakhstan telephone book, but there may be hidden messages closer to home. A recent publication in this journal purportedly compared a set of multiple sequence alignment programs. The real purpose of the article may have been to remind readers how to present scientific data.

14.
Bioinformatics ; 29(22): 2941-2, 2013 Nov 15.
Article in English | MEDLINE | ID: mdl-23975766

ABSTRACT

SUMMARY: There are many programs that can read the secondary structure of an RNA molecule and draw a diagram, but hardly any that can cope with 10^3 bases. RNAfdl is slow but capable of producing intersection-free diagrams for ribosome-sized structures, has a graphical user interface for adjustments and produces output in common formats. AVAILABILITY AND IMPLEMENTATION: Source code is available under the GNU General Public License v3.0 at http://sourceforge.net/projects/rnafdl for Linux and similar systems or Windows using MinGW. RNAfdl is implemented in C, uses the Cairo 2D graphics library and offers both command line and graphical user interfaces. CONTACT: hecker@rth.dk


Subject(s)
RNA/chemistry , Software , Nucleic Acid Conformation , RNA, Ribosomal, 23S/chemistry
15.
FEMS Microbiol Lett ; 342(2): 106-12, 2013 May.
Article in English | MEDLINE | ID: mdl-23373615

ABSTRACT

The extracellular haem-binding protein from Streptomyces reticuli (HbpS) has been shown to be involved in redox sensing and to bind haem. However, the residues involved in haem coordination are unknown. Structural alignments to distantly related haem-binding proteins from Mycobacterium tuberculosis were used to identify a candidate haem-coordinating residue, and site-directed mutagenesis with UV/Vis spectroscopy was used to assess haem binding in vivo and in vitro. We present strong evidence that HbpS belongs to the small set of proteins that do not use histidine to coordinate the metal in the haem group. Further spectroscopic evidence strongly indicates that threonine 113 is actively involved in coordination of haem. Subsequent protein/haem titration experiments show a 1:2 protein/haem stoichiometry. We also present data showing the degradation of haem by HbpS in vivo. Because HbpS is conserved in many Actinobacteria, the presented results are applicable to related species.


Subject(s)
Hemeproteins/chemistry , Hemeproteins/genetics , Streptomyces/chemistry , Streptomyces/genetics , Amino Acid Sequence , Binding Sites , Heme/metabolism , Hemeproteins/metabolism , Models, Molecular , Mutagenesis, Site-Directed , Mycobacterium tuberculosis/chemistry , Mycobacterium tuberculosis/genetics , Protein Binding , Protein Conformation , Sequence Alignment , Spectrophotometry
16.
Bioinformatics ; 29(5): 588-96, 2013 Mar 01.
Article in English | MEDLINE | ID: mdl-23314325

ABSTRACT

MOTIVATION: To recognize remote relationships between RNA molecules, one must be able to align structures without regard to sequence similarity. We have implemented a method that is swift [O(n^2)], sensitive and tolerant of large gaps and insertions. Molecules are broken into overlapping fragments, which are characterized by their memberships in a probabilistic classification based on local geometry and H-bonding descriptors. This leads to a probabilistic similarity measure that is used in a conventional dynamic programming method. RESULTS: Examples are given of database searching, the detection of structural similarities that would not be found using sequence-based methods, and comparisons with a previously published approach. AVAILABILITY AND IMPLEMENTATION: Source code (C and Perl) and binaries for Linux are freely available at www.zbh.uni-hamburg.de/fries.
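
The general idea of feeding a probabilistic fragment similarity into a standard O(n^2) dynamic programming alignment can be sketched as follows. This is not the published implementation: the use of a dot product between class-membership vectors, the local (Smith-Waterman style) recursion, the linear gap penalty and the score shift are all assumptions made for illustration, as are the function names.

```python
def fragment_similarity(p, q):
    """Dot product of two class-membership probability vectors."""
    return sum(x * y for x, y in zip(p, q))


def local_alignment_score(frags_a, frags_b, gap=0.5, shift=0.5):
    """Best local alignment score between two lists of probability vectors.

    gap is a linear gap penalty; shift is subtracted from every similarity
    so that dissimilar fragment pairs score below zero (both are assumptions).
    """
    rows, cols = len(frags_a), len(frags_b)
    score = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    best = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            match = fragment_similarity(frags_a[i - 1], frags_b[j - 1]) - shift
            score[i][j] = max(
                0.0,                          # start a new local alignment
                score[i - 1][j - 1] + match,  # align fragment i with fragment j
                score[i - 1][j] - gap,        # gap in the second molecule
                score[i][j - 1] - gap,        # gap in the first molecule
            )
            best = max(best, score[i][j])
    return best


# Toy example with two fragment classes and three fragments per molecule.
mol_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
mol_b = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
print(local_alignment_score(mol_a, mol_b))
```

Because the score depends only on the class memberships of the fragments, two molecules can align well even when their nucleotide sequences are unrelated.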


Subject(s)
Algorithms , RNA/chemistry , Databases, Protein , Models, Molecular , Nucleic Acid Conformation , Sequence Alignment , Sequence Analysis, RNA
17.
RNA Biol ; 10(2): 216-27, 2013 Feb.
Article in English | MEDLINE | ID: mdl-23235494

ABSTRACT

Aptamers are oligonucleotides that bind targets with high specificity and affinity. They have become important tools for biosensing, target detection, drug delivery and therapy. We selected the quadruplex-forming 16-mer DNA aptamer AID-1 [d(GGGT)4] with affinity for the interleukin-6 receptor (IL-6R) and identified single nucleotide variants that showed no significant loss of binding ability. The RNA counterpart of AID-1 [r(GGGU)4] also bound IL-6R as quadruplex structure. AID-1 is identical to the well-known HIV inhibitor T30923, which inhibits both HIV infection and HIV-1 integrase. We also demonstrated that IL-6R specific RNA aptamers not only bind HIV-1 integrase and inhibit its 3' processing activity in vitro, but also are capable of preventing HIV de novo infection with the same efficacy as the established inhibitor T30175. All these aptamer-target interactions are highly dependent on formation of quadruplex structure.


Subject(s)
Aptamers, Nucleotide/pharmacology , HIV Integrase Inhibitors/pharmacology , HIV-1/drug effects , Receptors, Interleukin-6/metabolism , Circular Dichroism , Drug Evaluation, Preclinical , G-Quadruplexes/drug effects , HIV Envelope Protein gp120/genetics , HIV Envelope Protein gp120/metabolism , HIV Infections/pathology , HIV Infections/virology , HIV Integrase/genetics , HIV Integrase/metabolism , HIV-1/enzymology , HIV-1/pathogenicity , HeLa Cells , Humans , Oligonucleotides/pharmacology , Virus Attachment/drug effects
18.
J Chem Theory Comput ; 8(10): 3663-70, 2012 Oct 09.
Article in English | MEDLINE | ID: mdl-26593011

ABSTRACT

We have implemented a method for the design of RNA sequences that should fold to arbitrary secondary structures. A popular energy model allows one to take the derivative with respect to composition, which can then be interpreted as a force and used for Newtonian dynamics in sequence space. Combined with a negative design term, one can rapidly sample sequences which are compatible with a desired secondary structure via simulated annealing. Results for 360 structures were compared with those from another nucleic acid design program using measures such as the probability of the target structure and an ensemble-weighted distance to the target structure.

19.
J Comput Chem ; 31(6): 1135-42, 2010 Apr 30.
Article in English | MEDLINE | ID: mdl-19899145

ABSTRACT

We propose a method for predicting RNA base pairing which imposes no restrictions on the order of base pairs, allows for pseudoknots and runs in O(mN^2) time for N base pairs and m iterations. It employs a self-consistent mean field method in which all base pairs are possible, but with each iteration, the most energetically favored base pairs become more likely as long as they are consistent with their neighbors. Performance was compared against three other programs using three test sets. Sensitivity varied from 20% to 74% and specificity from 44% to 77%. In general, the method predicts too many base pairs, leading to good sensitivity and worse specificity. The predicted structures have excellent energies, suggesting that, algorithmically, the method performs well, but the classic literature energy models may not be appropriate when pseudoknots are permitted. Website and source code for the simulations are available at http://cardigan.zbh.uni-hamburg.de/~rnascmf.


Subject(s)
Models, Chemical , Nucleic Acid Conformation , RNA/chemistry , Base Pairing , Thermodynamics
20.
Phys Rev E Stat Nonlin Soft Matter Phys ; 79(6 Pt 1): 061911, 2009 Jun.
Article in English | MEDLINE | ID: mdl-19658528

ABSTRACT

We propose an order index, phi, which gives a quantitative measure of randomness and order of complete genomic sequences. It maps genomes to a number from 0 (random and of infinite length) to 1 (fully ordered) and applies regardless of sequence length. The 786 complete genomic sequences in GenBank were found to have phi values in a very narrow range, phi_g = 0.031 (+0.028/-0.015). We show this implies that genomes are halfway toward being completely random, or, at the "edge of chaos." We further show that artificial "genomes" converted from literary classics have phi's that almost exactly coincide with phi_g, but sequences of low information content do not. We infer that phi_g represents a high information-capacity "fixed point" in sequence space, and that genomes are driven to it by the dynamics of a robust growth and evolution process. We show that a growth process characterized by random segmental duplication can robustly drive genomes to the fixed point.


Subject(s)
Genome/genetics , Models, Genetic , Models, Statistical , Sequence Analysis, DNA/methods , Base Sequence , Computer Simulation , Data Interpretation, Statistical , Molecular Sequence Data , Mutation/genetics