1.
Proc Natl Acad Sci U S A ; 120(49): e2310752120, 2023 Dec 05.
Article in English | MEDLINE | ID: mdl-38019864

ABSTRACT

The mechanisms generating novel genes and genetic information are poorly known, even for microRNA (miRNA) genes with an extremely constrained design. All miRNA primary transcripts need to fold into a stem-loop structure to yield short gene products (~22 nt) that bind and repress their mRNA targets. While a substantial number of miRNA genes are ancient and highly conserved, short secondary structures coding for entirely novel miRNA genes have been shown to emerge in a lineage-specific manner. Template switching is a DNA-replication-related mutation mechanism that can introduce complex changes and generate perfect base pairing for entire hairpin structures in a single event. Here, we show that the template-switching mutations (TSMs) have participated in the emergence of over 6,000 suitable hairpin structures in the primate lineage to yield at least 18 new human miRNA genes, that is 26% of the miRNAs inferred to have arisen since the origin of primates. While the mechanism appears random, the TSM-generated miRNAs are enriched in introns where they can be expressed with their host genes. The high frequency of TSM events provides raw material for evolution. Being orders of magnitude faster than other mechanisms proposed for de novo creation of genes, TSM-generated miRNAs enable near-instant rewiring of genetic information and rapid adaptation to changing environments.


Subject(s)
MicroRNAs , Animals , Humans , MicroRNAs/metabolism , Primates/genetics , Introns , DNA Replication/genetics
2.
Proc Natl Acad Sci U S A ; 119(4)2022 01 25.
Article in English | MEDLINE | ID: mdl-35046021

ABSTRACT

The evolutionary origin of RNA stem structures and the preservation of their base pairing under a spontaneous and random mutation process have puzzled theoretical evolutionary biologists. DNA replication-related template switching is a mutation mechanism that creates reverse-complement copies of sequence regions within a genome by replicating briefly along either the complementary or nascent DNA strand. Depending on the relative positions and context of the four switch points, this process may produce a reverse-complement repeat capable of forming the stem of a perfect DNA hairpin or fix the base pairing of an existing stem. Template switching is typically thought to trigger large structural changes, and its possible role in the origin and evolution of RNA genes has not been studied. Here, we show that the reconstructed ancestral histories of RNA genes contain mutation patterns consistent with DNA replication-related template switching. In addition to multibase compensatory mutations, the mechanism can explain complex sequence changes, although mutations breaking the structure rarely get fixed in evolution. Our results suggest a solution for the long-standing dilemma of RNA gene evolution and demonstrate how template switching can both create perfect stems with a single mutation event and help maintain the stem structure over time. Interestingly, template switching also provides an elegant explanation for the asymmetric base pair frequencies within RNA stems.


Subject(s)
DNA Replication , DNA/chemistry , DNA/genetics , Inverted Repeat Sequences , Nucleic Acid Conformation , RNA/chemistry , Templates, Genetic , Base Pairing , Base Sequence , Mutation , RNA/genetics
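
The abstract above describes template switching as creating a reverse-complement copy of a nearby region, which can form a perfectly paired hairpin stem in one event. The following is a minimal toy sketch of that idea in Python, not the authors' method: it collapses the four switch points into a single "copy the reverse complement of one region over another" step, and the function names, coordinates, and example sequence are all hypothetical.

```python
# Toy model only: real template switching involves four switch points and
# replication along the complementary or nascent strand. Here the event is
# reduced to overwriting a target region with the reverse complement of a
# source region. All names and the example sequence are hypothetical.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def template_switch(seq: str, src_start: int, src_end: int, dst_start: int) -> str:
    """Overwrite the region starting at dst_start with the reverse
    complement of seq[src_start:src_end], keeping the total length."""
    patch = reverse_complement(seq[src_start:src_end])
    return seq[:dst_start] + patch + seq[dst_start + len(patch):]


def is_perfect_stem(left_arm: str, right_arm: str) -> bool:
    """True if the two arms can pair base-for-base as a hairpin stem."""
    return left_arm == reverse_complement(right_arm)


# Ancestral sequence: 9-nt left arm, 4-nt loop, 9-nt unrelated right arm.
ancestral = "GATTACAGG" + "TTTT" + "ACGTACGTA"
mutated = template_switch(ancestral, src_start=0, src_end=9, dst_start=13)

before = is_perfect_stem(ancestral[:9], ancestral[13:])  # arms do not pair
after = is_perfect_stem(mutated[:9], mutated[13:])       # arms pair perfectly
```

In this sketch a single simulated event turns two unrelated arms into a perfectly complementary stem, which is the property the abstract attributes to the mechanism.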
3.
Microb Genom ; 7(9)2021 09.
Article in English | MEDLINE | ID: mdl-34542398

ABSTRACT

The nucleocytoplasmic large DNA viruses (NCLDVs) are a diverse group that currently contains the largest known virions and genomes, also called giant viruses. The first giant virus was isolated and described nearly 20 years ago. Its genome was larger than that of any other known virus at the time, and it contained a number of genes that had not been previously described in any virus. The origin and evolution of these unusually complex viruses have been puzzling, and various mechanisms have been put forward to explain how some NCLDVs could have reached genome sizes and coding capacity overlapping with those of cellular microbes. Here we critically discuss the evidence and arguments on this topic. We have also updated and systematically reanalysed protein families of the NCLDVs to further study their origin and evolution. Our analyses further highlight the small number of widely shared genes and extreme genomic plasticity among NCLDVs that are shaped via combinations of gene duplications, deletions, lateral gene transfers and de novo creation of protein-coding genes. The dramatic expansions of the genome size and protein-coding gene capacity characteristic of some NCLDVs are now increasingly understood to be driven by environmental factors rather than reflecting relationships to an ancient common ancestor among a hypothetical cellular lineage. Thus, the evolution of NCLDVs is writ large viral, and their origin, like all other viral lineages, remains unknown.


Subject(s)
Biological Evolution , DNA Viruses/genetics , Genome, Viral , DNA Viruses/classification , DNA Viruses/physiology , Eukaryota/genetics , Eukaryota/virology , Genome Size , Host Microbial Interactions , Phylogeny , Viral Proteins/genetics
4.
Viruses ; 13(2)2021 02 17.
Article in English | MEDLINE | ID: mdl-33671332

ABSTRACT

RNA viruses are the fastest evolving known biological entities. Consequently, the sequence similarity between homologous viral proteins disappears quickly, limiting the usability of traditional sequence-based phylogenetic methods in the reconstruction of relationships and evolutionary history among RNA viruses. Protein structures, however, typically evolve more slowly than sequences, and structural similarity can still be evident, when no sequence similarity can be detected. Here, we used an automated structural comparison method, homologous structure finder, for comprehensive comparisons of viral RNA-dependent RNA polymerases (RdRps). We identified a common structural core of 231 residues for all the structurally characterized viral RdRps, covering segmented and non-segmented negative-sense, positive-sense, and double-stranded RNA viruses infecting both prokaryotic and eukaryotic hosts. The grouping and branching of the viral RdRps in the structure-based phylogenetic tree follow their functional differentiation. The RdRps using protein primer, RNA primer, or self-priming mechanisms have evolved independently of each other, and the RdRps cluster into two large branches based on the used transcription mechanism. The structure-based distance tree presented here follows the recently established RdRp-based RNA virus classification at genus, subfamily, family, order, class and subphylum ranks. However, the topology of our phylogenetic tree suggests an alternative phylum level organization.


Subject(s)
RNA Viruses/enzymology , RNA-Dependent RNA Polymerase/chemistry , Viral Proteins/chemistry , Models, Molecular , Phylogeny , Protein Conformation, alpha-Helical , Protein Domains , RNA Viruses/chemistry , RNA Viruses/classification , RNA Viruses/genetics , RNA-Dependent RNA Polymerase/genetics , RNA-Dependent RNA Polymerase/metabolism , Viral Proteins/genetics , Viral Proteins/metabolism
5.
PLoS One ; 14(5): e0216659, 2019.
Article in English | MEDLINE | ID: mdl-31100077

ABSTRACT

Specific cleavage of proteins by proteases is essential for several cellular, physiological, and viral processes. Chymotrypsin-related proteases, which form the PA clan in the MEROPS classification of proteases, are one of the largest and most diverse groups of proteases. The PA clan comprises serine proteases from bacteria, eukaryotes, archaea, and viruses and chymotrypsin-related cysteine proteases from positive-strand RNA viruses. Despite low amino acid sequence identity, all PA clan proteases share a conserved double β-barrel structure. Using an automated structure-based hierarchical clustering method, we identified a common structural core of 72 amino acid residues for 143 PA clan proteases that represent 12 protein families and 11 subfamilies. The identified core is located around the catalytic site between the two β-barrels and resembles the structures of the smallest PA clan proteases. We constructed a structure-based distance tree derived from the properties of the identified common core. Our structure-based analyses support the current classification of these proteases at the subfamily level and largely at the family level. Structural alignment and structure-based distance trees could thus be used for directing objective classification of PA clan proteases and to strengthen their higher order classification. Our results also indicate that the PA clan proteases of positive-strand RNA viruses are related to cellular heat-shock proteases, which suggests that the exchange of protease genes between viruses and cells might have occurred more than once.


Subject(s)
Chymotrypsin/classification , Chymotrypsin/genetics , Chymotrypsin/ultrastructure , Amino Acid Sequence/genetics , Binding Sites , Catalytic Domain , Peptide Hydrolases/classification , Peptide Hydrolases/ultrastructure , Sequence Homology, Amino Acid , Structure-Activity Relationship
6.
Mol Biol Evol ; 33(7): 1697-710, 2016 07.
Article in English | MEDLINE | ID: mdl-26931141

ABSTRACT

Identification of relationships among protein families or superfamilies is a challenge. However, functionally essential protein regions typically retain structural integrity, even when the corresponding protein sequences evolve. Consequently, comparison of protein structures enables deeper phylogenetic analyses than achievable through the use of sequence information only. Here, we focus on a group of distantly related viral and cellular enzymes involved in nucleic acid or nucleotide processing and synthesis. All these enzymes share an apparently similar protein fold at their active site, which resembles the palm subdomain of the right-hand-shaped polymerases. Using a structure-based hierarchical clustering method, we identified a common structural core of 36 equivalent residues for this functionally diverse group of enzymes, representing five protein superfamilies. Based on the properties of these 36 residues, we deduced a structural distance-based tree in which the proteins were accurately clustered according to the established family classification. Within this tree, the enzymes catalyzing genomic nucleic acid replication or transcription were separated from those performing supplementary nucleic acid or nucleotide processing functions. In addition, we found that the family Y DNA polymerases are structurally more closely related to the nucleotide cyclase superfamily members than to the other members of the DNA/RNA polymerase superfamily, and these enzymes share 88 equivalent residues comprising a β1-α1-α2-β2-β3-α3-β4-α4-β5 fold. The results highlight the power of structure-based hierarchical clustering in identifying remote evolutionary relationships. Furthermore, our study implies that a protein substructure of only three-dozen residues can contain a substantial amount of information on the evolutionary history of proteins.


Subject(s)
Proteins/chemistry , Proteins/genetics , Sequence Analysis, Protein/methods , Structural Homology, Protein , Amino Acid Sequence , Catalytic Domain , Cluster Analysis , Evolution, Molecular , Genomics , Models, Molecular , Phylogeny , Sequence Alignment/methods , Structure-Activity Relationship
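
The abstract above builds a tree from pairwise structural distances by hierarchical clustering but does not specify the algorithm. As a generic illustration only, the sketch below applies plain average-linkage agglomerative clustering to a small distance table. Average linkage is a swapped-in textbook technique, not the authors' method, and the protein labels and distance values are invented for the example.

```python
from itertools import combinations

# Generic average-linkage clustering on a pairwise distance table.
# This is NOT the method used in the study; labels and distances below
# are made up purely to show how a distance tree can be assembled.


def average_linkage_tree(names, dist):
    """Return a nested-tuple tree built by repeatedly merging the two
    clusters with the smallest mean pairwise distance.

    dist maps frozenset({a, b}) to the distance between leaves a and b.
    """
    # Each cluster is (subtree, list_of_leaf_names).
    clusters = [(name, [name]) for name in names]

    def mean_distance(a, b):
        pairs = [(x, y) for x in a[1] for y in b[1]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: mean_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        a, b = clusters[i], clusters[j]
        merged = ((a[0], b[0]), a[1] + b[1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


# Hypothetical leaves: two polymerase-like and two cyclase-like proteins.
names = ["polA", "polB", "cycA", "cycB"]
dist = {frozenset(pair): 0.9 for pair in combinations(names, 2)}
dist[frozenset({"polA", "polB"})] = 0.2
dist[frozenset({"cycA", "cycB"})] = 0.3

tree = average_linkage_tree(names, dist)
```

With these invented distances the two close pairs are joined first and then merged at the root, which mirrors in miniature how proteins with similar cores end up on neighbouring branches of a distance tree.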
7.
Mol Biol Evol ; 31(10): 2741-52, 2014 Oct.
Article in English | MEDLINE | ID: mdl-25063440

ABSTRACT

Polymerases are essential for life, being responsible for replication, transcription, and the repair of nucleic acid molecules. Those that share a right-hand-shaped fold and catalytic site structurally similar to the DNA polymerase I of Escherichia coli may catalyze RNA- or DNA-dependent RNA polymerization, reverse transcription, or DNA replication in eukarya, archaea, bacteria, and their viruses. We have applied novel computational methods for structure-based clustering and phylogenetic analyses of this functionally diverse polymerase superfamily, which currently comprises six families. We identified a structural core common to all right-handed polymerases, composed of 57 amino acid residues, harboring two positionally and chemically conserved residues, the catalytic aspartates. The structural conservation within each of the six families is considerable, for example, the structural core shared by family Y DNA polymerases covers over 90% of the polymerase domain of the Sulfolobus solfataricus Dpo4. Our phylogenetic analyses propose an early separation of RNA-dependent polymerases that use primers from those that are primer-independent. Furthermore, the exchange of polymerase genes between viruses and their hosts is evident. Because of this horizontal gene transfer, the phylogeny of polymerases does not always reflect the evolutionary history of the corresponding organisms.


Subject(s)
Aspartic Acid/genetics , Bacteria/enzymology , Bacterial Proteins/chemistry , Computational Biology/methods , DNA-Directed DNA Polymerase/chemistry , Amino Acid Sequence , Automation, Laboratory/methods , Bacteria/genetics , Bacterial Proteins/genetics , Catalytic Domain , Conserved Sequence , DNA-Directed DNA Polymerase/genetics , Evolution, Molecular , Gene Transfer, Horizontal , Models, Molecular , Phylogeny , Viral Proteins/chemistry , Viral Proteins/metabolism , Viruses/enzymology
8.
PLoS One ; 7(7): e40581, 2012.
Article in English | MEDLINE | ID: mdl-22792374

ABSTRACT

A high-affinity divalent cation-binding site located proximal to the catalytic center has been identified in several RNA-dependent RNA polymerases (RdRps), but the characteristics of such a site have not been systematically studied. Here, all available polymerase structures that follow the hand-like structural motif were screened for the presence of a divalent cation close to the catalytic site but distinct from catalytic metal ions. Such non-catalytic ions were found in all RNA virus families for which there were high-resolution RdRp structures available. Bound ions were always located in structurally similar locations at an approximate 6-Å distance from the catalytic site. Furthermore, the second aspartate residue in the highly conserved GDD sequence was found to be involved in the coordination of the bound ion in all viral RdRps studied. These results suggest that a non-catalytic ion-binding site is conserved across positive-sense, single-stranded, and double-stranded RNA viruses. Interestingly, a non-catalytic ion was also observed in a similar position in the reverse transcriptase of the human immunodeficiency virus. Moreover, two members of the DNA-dependent DNA polymerase B family displayed an ion at a comparable distance from the catalytic site, but the position was clearly distinct from the non-catalytic ion-binding sites of RdRps.


Subject(s)
Ions/chemistry , RNA-Dependent RNA Polymerase/chemistry , Amino Acid Sequence , Binding Sites , Catalysis , Catalytic Domain , Cations/chemistry , Cations/metabolism , Conserved Sequence , DNA-Directed DNA Polymerase/chemistry , DNA-Directed DNA Polymerase/metabolism , Humans , Ions/metabolism , Molecular Docking Simulation , Molecular Sequence Data , Protein Conformation , RNA Viruses/enzymology , RNA-Dependent RNA Polymerase/metabolism , Sequence Alignment
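
The abstract above describes screening polymerase structures for a divalent cation that lies close to the catalytic site but is distinct from the catalytic metal ions. A minimal sketch of such a distance filter might look as follows. The element set, the 8 Å cutoff, the function name, and every coordinate are assumptions made for illustration; none of them come from the study.

```python
import math

# Assumed set of divalent cation element labels; not taken from the study.
DIVALENT = {"MG", "MN", "CA", "ZN"}


def noncatalytic_ions(ions, catalytic_site, catalytic_ion_ids, cutoff=8.0):
    """Return (ion_id, distance) for divalent ions within `cutoff` angstroms
    of the catalytic site, skipping ions already labelled as catalytic.

    ions: iterable of (ion_id, element, (x, y, z)) tuples.
    cutoff: hypothetical search radius in angstroms.
    """
    hits = []
    for ion_id, element, xyz in ions:
        if element not in DIVALENT or ion_id in catalytic_ion_ids:
            continue
        distance = math.dist(xyz, catalytic_site)
        if distance <= cutoff:
            hits.append((ion_id, round(distance, 1)))
    return hits


# Invented coordinates (angstroms) with the catalytic site at the origin.
site = (0.0, 0.0, 0.0)
ions = [
    ("A", "MG", (1.0, 0.0, 0.0)),   # labelled catalytic, so excluded
    ("B", "MN", (6.0, 0.0, 0.0)),   # divalent and within the cutoff
    ("C", "CA", (20.0, 0.0, 0.0)),  # divalent but too far away
    ("D", "NA", (5.0, 0.0, 0.0)),   # monovalent, so excluded
]
hits = noncatalytic_ions(ions, site, catalytic_ion_ids={"A"})
```

In practice the coordinates would come from a structure file parser, and the cutoff would be chosen to bracket the roughly 6 Å separation the abstract reports.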