ABSTRACT
In this work, we pose the following question: How did the distribution of aminoacyl-tRNA synthetases (aaRSs) go from an ancestral bidirectional gene (mirror symmetry) to the symmetrical distribution of aaRSs in a six-dimensional hypercube of the Standard Genetic Code (SGC)? We assume a primeval RNY code, two Extended Genetic RNA codes (types 1 and 2), and the SGC. We outline the types of symmetries of the distribution of aaRSs in each code and describe the corresponding symmetry groups, up to the SGC, whose symmetries display a mirror symmetry. Considering both Extended RNA codes, the 20 aaRSs were already present before the Last Universal Ancestor. These findings reveal intricacies in the diversification of aaRSs that accompanied the evolution of the genetic code.
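The six-dimensional hypercube mentioned above can be sketched by giving each base two bits, so that a codon becomes a vertex of the 6-cube and neighbouring vertices differ in a single bit. The bit assignment below (C=00, U=01, G=10, A=11) and the function names are hypothetical choices for illustration; the abstract does not state which encoding the authors use.

```python
from itertools import product

# Hypothetical 2-bit encoding of the four RNA bases (an assumption, not the authors' scheme).
BITS = {"C": (0, 0), "U": (0, 1), "G": (1, 0), "A": (1, 1)}
BASE = {bits: base for base, bits in BITS.items()}


def to_vertex(codon):
    """Map a codon such as 'AUG' to a 6-bit tuple, i.e. a vertex of the 6-cube."""
    return tuple(bit for base in codon for bit in BITS[base])


def to_codon(vertex):
    """Inverse of to_vertex: read the 6 bits back as three bases."""
    return "".join(BASE[vertex[i:i + 2]] for i in (0, 2, 4))


def neighbors(codon):
    """Return the six codons whose vertices differ from this one in exactly one bit."""
    v = to_vertex(codon)
    return [to_codon(v[:i] + (1 - v[i],) + v[i + 1:]) for i in range(6)]


ALL_CODONS = ["".join(c) for c in product("ACGU", repeat=3)]
# Every vertex has 6 neighbours, and each edge is counted from both ends.
EDGE_COUNT = sum(len(neighbors(c)) for c in ALL_CODONS) // 2
```

With this encoding every codon has exactly six neighbours and the cube has 64 vertices. A different bit assignment would change which single-base substitutions count as edges.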
Subject(s)
Amino Acyl-tRNA Synthetases , Evolution, Molecular , Genetic Code , Amino Acyl-tRNA Synthetases/genetics , RNA, Transfer/genetics , RNA
ABSTRACT
One of the major evolutionary transitions, the replacement of RNA by DNA as the primary informational molecule in biological systems, is still the subject of intense debate in the scientific community. DNA polymerases are currently split into various families, of which families A, B, and C are the most significant. In bacteria and some types of viruses, enzymes from families A and C predominate, whereas family B enzymes are more common in Archaea, Eukarya, and some types of viruses. A phylogenetic analysis of these three families of DNA polymerase was carried out. We assumed that reverse transcriptase was the ancestor of DNA polymerases. Our findings suggest that families A and C emerged and organized themselves when the earliest bacterial lineages had diverged, and that these earliest lineages had RNA genomes that were in transition; that is, the information was temporarily stored in DNA molecules that were continuously being produced by reverse transcription. The origin of DNA and the apparatus for its replication in the mitochondrial ancestors may have occurred independently of DNA and the replication machinery of other bacterial lineages, according to these two alternate modes of genetic material replication. The family C enzymes emerged in a particular bacterial lineage before being passed to viral lineages, which must have functioned by disseminating this machinery to the other lineages of bacteria. Bacterial DNA viruses must have evolved at least twice independently, in addition to the requirement that DNA have arisen twice in bacterial lineages. We offer two possible scenarios based on what we know about bacterial DNA polymerases. One hypothesis contends that family A was initially produced and spread to the other lineages through viral lineages before being supplanted by the emergence of family C and its acquisition of the position of principal replicative polymerase. 
The evidence points to the independence of these events and suggests that the viral lineage's acquisition of cellular replicative machinery was crucial for the establishment of a DNA genome in the other bacterial lineages, since these viral lineages may have served as a conduit for the machinery's delivery to other bacterial lineages that diverged with the RNA genome. Our data suggest that family B initially established itself in viral lineages and was transferred to ancestral archaeal lineages before the group diversified; thus, the DNA genome must have emerged first in this cellular lineage. Our data point to multiple evolutionary steps in the origins of DNA polymerase, which started off at least twice in the bacterial lineage and once in the archaeal lineage. Given that viral lineages are implicated in a significant portion of the distribution of DNA replication equipment in both bacterial (families A and C) and archaeal lineages (family B), our data point to a complex scenario.
Subject(s)
Bacteriophages , Viruses , Phylogeny , Evolution, Molecular , DNA-Directed DNA Polymerase/genetics , Viruses/genetics , Bacteria/genetics , DNA , Archaea/genetics , Bacteriophages/genetics , RNA
ABSTRACT
The origin of life was a cosmic event that happened on the primitive Earth. A critical problem in understanding the origins of life on Earth is the search for chemical scenarios in which the basic building blocks of biological molecules could be produced. Classic works in prebiotic chemistry frequently treated the early Earth as having a homogeneous atmosphere composed of chemical compounds such as methane (CH4), ammonia (NH3), water (H2O), hydrogen (H2) and hydrogen sulfide (H2S). Under that scenario, Stanley Miller was able to produce amino acids and solved the question of the abiotic origin of proteins. Conversely, the origin of nucleic acids has puzzled scientists for decades, since nucleotides are complex, though necessary, molecules for the existence of life. Here we review possible chemical scenarios that allowed the formation not only of nucleotides but also of other significant biomolecules. We aim to provide a theoretical solution for the origin of biomolecules at specific sites named "Prebiotic Chemical Refugia." A prebiotic chemical refugium should therefore be understood as a geographic site on the prebiotic Earth at which certain chemical elements accumulated in higher proportion than expected, facilitating the production of the basic building blocks of biomolecules. This higher proportion should not be understood as static but as dynamic, since the physicochemical conditions of our planet changed periodically. These differing concentrations of elements, together with geochemical and astronomical changes over days, synodic months and years, provided somewhat periodic changes in temperature, pressure, electromagnetic fields, and humidity, among other features. Recent and classic works that suggest the most likely prebiotic refugia, in which the main building blocks of biological molecules might have accumulated, are reviewed and discussed.
Subject(s)
Origin of Life , Refugium , Earth, Planet , Atmosphere/chemistry , Nucleotides , Evolution, Chemical
ABSTRACT
The evolutionary history of Class I aminoacyl-tRNA synthetases (aaRS) is presented through the reconstruction of ancestral sequences. Using structural molecular modeling, we sought to understand their relationship with the acceptor arm and the anticodon loop of tRNA, how this relationship was established, and its possible implications for determining the genetic code and the translation system. The molecular docking results showed that in 7 out of 9 aaRS, the acceptor arm and the anticodon loop bind in practically the same region. The domain accretion process in aaRS and the repositioning of interactions between tRNAs and aaRS are illustrated. Based on these results, we propose that the operational code and the anticodon code coexisted, competing for the aaRS catalytic region, which consequently contributed to the stabilization of these proteins.
Subject(s)
Amino Acyl-tRNA Synthetases , Genetic Code , Amino Acyl-tRNA Synthetases/genetics , Anticodon/genetics , Molecular Docking Simulation , RNA, Transfer/genetics
ABSTRACT
Although knowledge about biological systems has advanced exponentially in recent decades, it is surprising to realize that the very definition of Life keeps presenting theoretical challenges. Even though several lines of reasoning seek to identify the essence of the phenomenon of life, most of them contain fundamental problems in their basic conceptual structure. Most concepts fail to identify either necessary or sufficient features to define life. Here, we analyze the main conceptual frameworks that have supported the most accepted concepts of life, namely (i) the physical, (ii) the cellular and (iii) the molecular approaches. Based on an ontological analysis, we propose that Life should not be positioned under the ontological category of Matter; rather, life is better understood under the top-level ontology of "Process". Taking an epistemological approach, we propose that the essential characteristic that pervades each and every living being is the presence of organic codes. Therefore, we explore theories in biosemiotics and code biology in order to propose a clear concept of life as a macrocode composed of multiple inter-related coding layers. In this way, since life is a sort of metaphysical process of encoding, living beings become the molecular materialization of that process. From the proposed concept, we show that the evolutionary process is a fundamental characteristic for life's maintenance but is not necessary to define life, as many organisms are clearly alive but do not participate in the evolutionary process (such as infertile hybrids). The current proposition opens a fertile field of debate in astrobiology, epistemology, biosemiotics, code biology and robotics.
Subject(s)
Biological Evolution
ABSTRACT
We tested the hypothesis that concatemers of ancestral tRNAs gave rise to the 16S ribosomal RNA. We built an ancestral sequence of proto-tRNAs that showed a significant identity of 51.69% and a structural identity of 0.941 with the 3' upper domain of the 16S ribosomal RNA molecule. We also propose a hypothesis in which the small ribosomal subunit emerged by proto-tRNA fusion and worked as a point to bind RNAs in an open structure configuration. In this context, the two ribosomal subunits initially worked independently, and the subunit junction, with the consequent formation of the primitive ribosome, was mediated by interactions with tRNA molecules during the formation of the primordial genetic code.
Subject(s)
Evolution, Molecular , RNA, Transfer , Genetic Code , Nucleic Acid Conformation , RNA, Ribosomal , RNA, Ribosomal, 16S/genetics , RNA, Transfer/genetics , Ribosomes/genetics
ABSTRACT
The theory of chemical symbiosis (TCS) suggests that biological systems started with the collaboration of two polymeric molecules existing on the early Earth: nucleic acids and peptides. Chemical symbiosis emerged when RNA-like nucleic acid polymers happened to fold into 3D structures capable of binding amino acids together, forming a proto peptidyl-transferase center. This folded structure catalyzed the formation of quasi-random small peptides, some of which were capable of binding back to this ribozyme structure and starting to form an initial layer that would produce the larger subunit of the ribosome by accretion. TCS suggests that there is no chicken-and-egg problem in the emergence of biological systems, as RNAs and peptides were of equal importance to the origin of life. Life initially emerged when these two macromolecules started to interact in molecular symbiosis. Further, we suggest that life evolved into progenotes and cells due to the emergence of new layers of symbiosis. Mutualism is the strongest force in biology, capable of creating novelties by emergent principles, in which the whole is greater than the sum of its parts. TCS aims to apply the Margulian view of biology to the origins-of-life field.
Subject(s)
Evolution, Molecular , Models, Theoretical , Origin of Life , Peptide Fragments/metabolism , Proteins/metabolism , RNA/metabolism , Symbiosis , Humans , Models, Biological , Peptide Fragments/chemistry , Proteins/chemistry , RNA/chemistry
ABSTRACT
Viruses have generally been thought of as infectious agents. New data on mimivirus, however, suggest a reinterpretation of this view. Earth's biosphere seems to contain many more viruses than previously thought, and they are relevant to the maintenance of ecosystems and biodiversity. Viruses are not considered to be alive because they are not free-living entities and do not have cellular units. Current hypotheses indicate that some viruses may have resulted from the genomic reduction of cellular life forms. However, new studies on the origins of biological systems suggest that viruses could also have originated during the transition from the First to the Last Universal Common Ancestor (from FUCA to LUCA). Within this setting, life has been established as a chemical informational system and can be interpreted as a macrocode of multiple layers. The first entity to acquire these features was the First Universal Common Ancestor (FUCA), which evolved into an intermediate ancestor that could be named T-LUCA (Transitional-LUCA) and equated to Woese's concept of progenotes. T-LUCA may have remained as undifferentiated subsystems with virus-like structures. The net result is that both cellular life forms and viruses shared protein synthesis apparatuses. In short, the virus is a strategy of life reached by two paths: T-LUCA-like entities and the reduction of cellular life forms.
Subject(s)
Biological Evolution , Virus Physiological Phenomena , Viruses , Evolution, Molecular
ABSTRACT
Ureaplasma diversum is a member of the Mollicutes class responsible for urogenital tract infection in cattle and small ruminants. Studies indicate that the process of horizontal gene transfer, the exchange of genetic material among different species, has a crucial role in mollicute evolution, affecting the group's characteristic genomic reduction process and simplification of metabolic pathways. Using bioinformatics tools and the STRING database of known and predicted protein interactions, we constructed the protein-protein interaction network of U. diversum and compared it with the networks of other members of the Mollicutes class. We also investigated horizontal gene transfer events in subnetworks of interest involved in purine and pyrimidine metabolism and urease function, chosen because of their intrinsic importance for host colonization and virulence. We identified horizontal gene transfer events among Mollicutes and from Ureaplasma to Staphylococcus aureus and Corynebacterium, bacterial groups that colonize the urogenital niche. The overall tendency of genome reduction and simplification in the Mollicutes is echoed in their protein interaction networks, which tend to be more generalized and less selective. Our data suggest that the process was permitted (or enabled) by an increase in host dependence and the available gene repertoire in the urogenital tract shared via horizontal gene transfer.
Subject(s)
Bacterial Proteins/metabolism , Gene Transfer, Horizontal , Genome, Bacterial , Protein Interaction Maps , Tenericutes/genetics , Ureaplasma/genetics , Animals , Bacterial Proteins/genetics , Cattle , Corynebacterium/genetics , Evolution, Molecular , Genome Size , Genomics , Metabolic Networks and Pathways , Purines/metabolism , Pyrimidines/metabolism , Staphylococcus aureus/genetics , Tenericutes/classification , Tenericutes/metabolism , Ureaplasma/classification , Ureaplasma/metabolism , Virulence
ABSTRACT
A neutral evolution model that explicitly considers codons, amino acids, and the degeneracy of the genetic code is developed. The model is built from nucleotides up to amino acids, and it represents a refinement of the neutral theory of molecular evolution. The model is based on a stochastic process that leads to a stationary probability distribution of amino acids, which is used as a neutral test of evolution. We provide some examples that assess the neutrality test on a small set of protein sequences. The Jukes-Cantor model is generalized to deal with amino acids and is compared with our neutral model, along with the empirical BLOSUM62 substitution model. The neutral test provides a baseline against which the evolution of any protein can be analyzed, and it helps in discerning amino acids with unexpected frequencies that might be under positive or negative selection. Our model and neutral test are as universal as the standard genetic code.
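To make the idea of a degeneracy-driven neutral baseline concrete, here is a minimal sketch. It is not the authors' stochastic model: it assumes the simplest possible case, in which all 61 sense codons of the standard genetic code are equally likely, so the expected frequency of an amino acid is just its number of codons divided by 61. The function names are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code in TCAG order; '*' marks the three stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def neutral_frequencies():
    """Expected amino acid frequencies if all sense codons are equiprobable (assumption)."""
    degeneracy = Counter(aa for aa in CODON_TABLE.values() if aa != "*")
    total = sum(degeneracy.values())  # number of sense codons
    return {aa: n / total for aa, n in degeneracy.items()}


def excess_over_neutral(protein):
    """Observed minus expected frequency for each of the 20 amino acids."""
    expected = neutral_frequencies()
    observed = Counter(protein.upper())
    length = sum(observed[aa] for aa in expected)
    if length == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: observed[aa] / length - expected[aa] for aa in expected}
```

Under this simplification, leucine, serine and arginine (six codons each) are expected most often and methionine and tryptophan (one codon each) least often. Large positive or negative values from `excess_over_neutral` flag residues whose frequency departs from that baseline.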
Subject(s)
Amino Acid Substitution , Genetic Drift , Models, Genetic , Amino Acid Sequence , Amino Acid Substitution/genetics , Evolution, Molecular , Proteins
ABSTRACT
Three-dimensional algebraic models, also called Genetic Hotels, are developed to represent the Standard Genetic Code, the Standard tRNA Code (S-tRNA-C), and the Human tRNA code (H-tRNA-C). New algebraic concepts are introduced in order to describe these models, namely the generalization of the 2^n-Klein Group and the concept of a subgroup coset with a tail. We found that the H-tRNA-C displays broken symmetries with respect to the S-tRNA-C, which is highly symmetric. We also show that there are only 12 ways to represent each of the corresponding phenotypic graphs of amino acids. Averages of statistical centrality measures over the 12 graphs are computed for each of the three codes and compared statistically. The phenotypic graphs of the S-tRNA-C display a common triangular prism of amino acids in 10 out of the 12 graphs, whilst the corresponding graphs for the H-tRNA-C display only two triangular prisms. The graphs exhibit disjoint clusters of amino acids when their polar requirement values are used. We contend that the S-tRNA-C is in a frozen-like state, whereas the H-tRNA-C may be in an evolving state.
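As a rough illustration of the algebra involved, the sketch below takes one common reading of a generalized Klein group: the group of n-bit vectors under bitwise XOR, which has 2^n elements and reduces to the Klein four-group for n = 2. Whether this matches the authors' definition exactly is an assumption, and the "coset with a tail" concept is not modelled here. The function names are hypothetical.

```python
def generated_subgroup(generators):
    """Subgroup of the XOR group spanned by the given generators (integers as bit vectors)."""
    subgroup = {0}
    for g in generators:
        # Adding a generator either leaves the span unchanged or doubles it.
        subgroup |= {h ^ g for h in subgroup}
    return subgroup


def cosets(n, generators):
    """Partition the 2**n group elements into cosets of the generated subgroup."""
    subgroup = generated_subgroup(generators)
    seen, result = set(), []
    for x in range(2 ** n):
        if x not in seen:
            coset = {x ^ h for h in subgroup}
            result.append(sorted(coset))
            seen |= coset
    return result


def is_self_inverse(n):
    """Check that every element is its own inverse, as in the Klein four-group."""
    return all(x ^ x == 0 for x in range(2 ** n))
```

For n = 6 the group has 64 elements, one per codon under a two-bits-per-base encoding. A subgroup generated by two independent elements then splits the codons into 16 cosets of four.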
ABSTRACT
The origin and evolution of life on the planet is one of the most intriguing challenges in the life sciences and, for some researchers, it is centered on the origin of the genetic code. Many hypotheses about the origin and evolution of tRNA have been proposed, and in this work a new one is put forward based on the reconstruction of ancestral tRNA sequences. Ancestral sequences of 22 types of tRNA molecules were built by maximum likelihood from 9758 sequences currently reported from different organisms. Phylogenetic analysis showed that the main force for evolutionary diversification of tRNA molecules was a change in the second base of the anticodon. The data revealed that diversification is not correlated with the characteristics of the specified amino acid, indicating that the correlation between tRNA and amino acid was established indirectly and was possibly mediated by proto-aminoacyl-tRNA synthetases.
Subject(s)
Anticodon/genetics , Evolution, Molecular , Models, Genetic , Phylogeny , Amino Acyl-tRNA Synthetases/genetics
ABSTRACT
A model for the formation of the genetic code is presented where protein synthesis is directed initially by tRNA dimers. Proteins that are resistant to degradation and efficient RNA-binders protect the RNAs. Replication becomes elongational producing poly-tRNAs from which the mRNAs and ribosomes are derived. Attributions are successively fixed to tRNAs paired through the perfect palindromic anticodons, with the same bases at the extremities (5'ANA: UNU 3'; GNG: CNC; principal dinucleotides, pDiN). The 5' degeneracy is then developed. The first pairs to be encoded correspond to the hydropathy correlation outliers (Gly-CC: Pro-GG and Ser-GA: Ser-CU) and to the sector of homogeneous pDiN, composed by two pyrimidines or two purines. These amino acids are preferred in the N-ends of proteins, stabilizers of proteins against catabolism and strong RNA-binders. The next pairs complete the sector of homogeneous pDiN (Asp, Glu-UC: Leu-AG and Asn, Lys-UU: Phe-AA). This set of nine amino acids forms the protein cores with the predominant aperiodic conformation. Next enter the pairs with mixed pDiN (one purine and one pyrimidine), the RY attributions composing the protein N-ends and the YR attributions the C-ends. The last pair contains the main punctuation signs (Ile, Met, iMet-AU: Tyr, Stop-UA). The model indicates that genetic information emerged during the process of formation of the coding/decoding system and that genes were defined by the proteins. Stable proteins constructed the nucleoprotein system by binding to the RNAs that produced them. In this circular rationale, genes are memories in a metabolic system for production of proteins that stabilize it. The simplicity and the highly deterministic character of the process suggest that the Last Universal Common Ancestor populations could be composed, in early stages, of lineages bearing similar genetic codes.
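The pairing rule described above (5'ANA with UNU, GNG with CNC) can be sketched by enumerating anticodons with identical first and third bases and pairing each with its reverse complement. The sketch assumes that the principal dinucleotide (pDiN) is the central plus 3' base of the anticodon, which is consistent with the Gly-CC: Pro-GG and Ser-GA: Ser-CU examples in the abstract. The function names are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
PURINES = set("AG")


def reverse_complement(triplet):
    """Antiparallel partner of an RNA triplet, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(triplet))


def palindromic_pairs():
    """Pairs (ANA, UNU) and (GNG, CNC) for every central base N."""
    pairs = []
    for edge in "AG":
        for middle in "ACGU":
            anticodon = edge + middle + edge
            pairs.append((anticodon, reverse_complement(anticodon)))
    return pairs


def pdin(anticodon):
    """Principal dinucleotide: central and 3' bases (an assumption, see lead-in)."""
    return anticodon[1:]


def is_homogeneous(dinucleotide):
    """True when both bases are purines or both are pyrimidines."""
    return (dinucleotide[0] in PURINES) == (dinucleotide[1] in PURINES)
```

The enumeration gives eight pairs. Four of them have homogeneous pDiN on both sides (GG:CC, GA:CU, AG:UC and AA:UU), matching the sector the abstract describes as encoded first, and the other four have mixed pDiN.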
Subject(s)
Amino Acids/chemistry , Genetic Code , Amino Acid Sequence , Anticodon , Codon , Dimerization , Models, Biological , Models, Genetic , Models, Theoretical , Nucleotides/chemistry , Purines/chemistry , RNA/chemistry , RNA, Transfer/chemistry , Ribosomes/chemistry
ABSTRACT
Knowledge of the evolution of aminoacyl-tRNA synthetases is crucial to studies on the origins of life. The relationships between the different aminoacyl-tRNA synthetase specificities in prokaryotic organisms are studied in this work. We reconstructed the ancestor sequences and the phylogenetic relationships using the Maximum Likelihood method. The results suggest that in class I the evolution of the N-terminal segment was strongly influenced by amino acid hydropathy in both domains of prokaryotes. The results for the C-terminal segments of class I differed between the two domains, indicating that their evolution was strongly influenced by the specific types of tRNA modification in each domain. The class II groups in Archaea were more heterogeneous with respect to the hydropathy of amino acids, indicating the interference of other influences. In Bacteria, the configuration was also complex, but the overall consensual division into two groups was maintained, with group IIa forming a single branch with the five hydroapathetic amino acid specificities and group IIb containing the specificities for the moderately hydrophobic together with the hydrophilic amino acids. The results indicate that the aminoacyl-tRNA synthetases in both domains were subjected to different selective forces in different parts of the proteins, resulting in complex phylogenetic patterns.
Subject(s)
Amino Acyl-tRNA Synthetases/genetics , Prokaryotic Cells/enzymology , Animals , Archaea/enzymology , Bacteria/enzymology , Evolution, Molecular , Phylogeny
ABSTRACT
Variations in arginine codon usage between organisms may have important implications for thermostability. The preferential usage of AGR codons for arginine in thermophiles and hyperthermophiles implies positive error minimization, helping to avoid mutations that could harm protein thermostability. This bias is not a mere consequence of increased G + C content, as has been previously suggested, and may represent a new mechanism of adaptation for protein thermostability.
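The AGR versus CGN preference can be quantified with a short sketch that counts arginine codons in an in-frame coding sequence. The function name and the choice to report a simple fraction are illustrative assumptions, not the measure used in the study.

```python
def agr_fraction(coding_sequence):
    """Fraction of arginine codons that are AGA or AGG (AGR) rather than CGN.

    The sequence is assumed to be in frame; trailing bases that do not
    complete a codon are ignored. RNA input (U) is accepted as well as DNA (T).
    """
    seq = coding_sequence.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    agr = sum(codon in ("AGA", "AGG") for codon in codons)
    cgn = sum(codon.startswith("CG") for codon in codons)
    if agr + cgn == 0:
        raise ValueError("sequence contains no arginine codons")
    return agr / (agr + cgn)
```

A value near 1 means arginine is encoded almost entirely by AGR codons, the pattern the abstract attributes to thermophiles and hyperthermophiles. Comparing it against genomic G + C content would be needed to separate the two effects.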
Subject(s)
Arginine/genetics , Codon/genetics , Proteins/chemistry , Proteins/genetics , Base Composition , Drug Stability , Hot Temperature
ABSTRACT
BACKGROUND: Most organisms grow at temperatures from 20 to 50 degrees C, but some prokaryotes, including Archaea and Bacteria, are capable of withstanding higher temperatures, from 60 to >100 degrees C. What makes these cells so resistant to heat? Their biomolecules, especially proteins, must be sufficiently stable to work under these extreme conditions, but the bases for thermostability remain elusive. RESULTS: The preferential usage of certain pairs of amino acids and codons in thermal adaptation was investigated by comparative proteome analysis, using 28 complete genomes from 18 mesophiles, 4 thermophiles, and 6 hyperthermophiles. In the hyperthermophile proteomes, whenever the percentage of Glu (E) and Lys (K) increased, the percentage of Gln (Q) and His (H) decreased, so that the (E+K)/(Q+H) ratio was > 4.5; in the mesophile proteomes it was < 2.5, and in the thermophiles an intermediate value was observed. The (E+K)/(Q+H) ratios for chaperonins, potentially thermostable proteins, were higher than their proteome ratios, whereas for DNA ligases, not necessarily thermostable, they followed the proteome ratios. Analysis of codon usage revealed that hyperthermophiles preferred AGR codons for Arg over CGN codons, which were preferred by mesophiles. CONCLUSIONS: The results suggest that the (E+K)/(Q+H) ratio may provide a useful marker for distinguishing hyperthermophilic, thermophilic and mesophilic prokaryotes, and that the high percentage of the amino acid pair E+K, consistently associated with the low percentage of the pair Q+H, could contribute to protein thermostability. Second, the preference for AGR codons for Arg was a signature of all hyperthermophiles analyzed so far.
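The (E+K)/(Q+H) ratio described above is easy to compute from a set of protein sequences. The sketch below pools residue counts across all sequences before taking the ratio; whether the study pooled counts in this way or averaged per-protein ratios is not stated here, so that choice and the function name are assumptions.

```python
from collections import Counter


def ek_qh_ratio(protein_sequences):
    """Return (E+K)/(Q+H) computed over all residues in the given sequences."""
    counts = Counter()
    for sequence in protein_sequences:
        counts.update(sequence.upper())
    numerator = counts["E"] + counts["K"]
    denominator = counts["Q"] + counts["H"]
    if denominator == 0:
        raise ValueError("no Gln or His residues; the ratio is undefined")
    return numerator / denominator
```

Applied to a whole proteome, the result can be compared with the thresholds reported in the abstract: above 4.5 for hyperthermophiles and below 2.5 for mesophiles.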