1.
Protein Sci ; 33(3): e4844, 2023 Nov 27.
Article in English | MEDLINE | ID: mdl-38009704

ABSTRACT

Aminoacyl-tRNA synthetases (aaRSs) establish the genetic code. Each aaRS covalently links a given canonical amino acid to a cognate set of tRNA isoacceptors. Glycyl tRNA aminoacylation is unusual in that it is catalyzed by different aaRSs in different lineages of the Tree of Life. We have investigated the phylogenetic distribution and evolutionary history of bacterial glycyl tRNA synthetase (bacGlyRS). This enzyme is found in early diverging bacterial phyla such as Firmicutes, Acidobacteria, and Proteobacteria, but not in archaea or eukarya. We observe relationships between each of six domains of bacGlyRS and six domains of four different RNA-modifying proteins. Component domains of bacGlyRS show common ancestry with (i) the catalytic domain of class II tRNA synthetases; (ii) the HD domain of the bacterial RNase Y; (iii) the body and tail domains of the archaeal CCA-adding enzyme; (iv) the anti-codon binding domain of the arginyl tRNA synthetase; and (v) a previously unrecognized domain that we call ATL (Ancient tRNA latch). The ATL domain has been found thus far only in bacGlyRS and in the universal alanyl tRNA synthetase (uniAlaRS). Further, the catalytic domain of bacGlyRS is more closely related to the catalytic domain of uniAlaRS than to any other aminoacyl tRNA synthetase. The combined results suggest that the ATL and catalytic domains of these two enzymes are ancestral to bacGlyRS and uniAlaRS, which emerged from common protein ancestors by bricolage, stepwise accumulation of protein domains, before the last universal common ancestor of life.

2.
Proc Natl Acad Sci U S A ; 119(52): e2207897119, 2022 Dec 27.
Article in English | MEDLINE | ID: mdl-36534803

ABSTRACT

Mechanisms of emergence and divergence of protein folds pose central questions in biological sciences. Incremental mutation and stepwise adaptation explain relationships between topologically similar protein folds. However, the universe of folds is diverse and riotous, suggesting more potent and creative forces are at play. Sequence and structure similarity are observed between distinct folds, indicating that proteins with distinct folds may share common ancestry. We found evidence of common ancestry between three distinct β-barrel folds: Src kinase family homology (SH3), oligonucleotide/oligosaccharide-binding (OB), and cradle loop barrel (CLB). The data suggest a mechanism of fold evolution that interconverts SH3, OB, and CLB. This mechanism, which we call creative destruction, can be generalized to explain many examples of fold evolution including circular permutation. In creative destruction, an open reading frame duplicates or otherwise merges with another to produce a fused polypeptide. A merger forces two ancestral domains into a new sequence and spatial context. The fused polypeptide can explore folding landscapes that are inaccessible to either of the independent ancestral domains. However, the folding landscapes of the fused polypeptide are not fully independent of those of the ancestral domains. Creative destruction is thus partially conservative; a daughter fold inherits some motifs from ancestral folds. After merger and refolding, adaptive processes such as mutation and loss of extraneous segments optimize the new daughter fold. This model has application in disease states characterized by genetic instability. Fused proteins observed in cancer cells are likely to experience remodeled folding landscapes and realize altered folds, conferring new or altered functions.


Subject(s)
Protein Folding , Proteins , Proteins/chemistry , Oligonucleotides/metabolism , Biophysical Phenomena , Mutation
3.
PLoS Comput Biol ; 17(10): e1009541, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34714829

ABSTRACT

We have developed the program TwinCons to detect noisy signals of deep ancestry of proteins or nucleic acids. As input, the program uses a composite alignment containing pre-defined groups, and mathematically determines a 'cost' of transforming one group to the other at each position of the alignment. The output distinguishes conserved, variable and signature positions. A signature is conserved within groups but differs between groups. The method automatically detects continuous characteristic stretches (segments) within alignments. TwinCons provides a convenient representation of conserved, variable and signature positions as a single score, enabling the structural mapping and visualization of these characteristics. Structure is more conserved than sequence. TwinCons highlights alternative sequences of conserved structures. Using TwinCons, we detected highly similar segments between proteins from the translation and transcription systems. TwinCons detects conserved residues within regions of high functional importance for the ribosomal RNA (rRNA) and demonstrates that signatures are not confined to specific regions but are distributed across the rRNA structure. The ability to evaluate both nucleic acid and protein alignments allows TwinCons to be used in combined sequence and structural analysis of signatures and conservation in rRNA and in ribosomal proteins (rProteins). TwinCons detects a strong sequence conservation signal between bacterial and archaeal rProteins related by circular permutation. This conserved sequence is structurally colocalized with conserved rRNA, indicated by TwinCons scores of rRNA alignments of bacterial and archaeal groups. This combined analysis revealed deep co-evolution of rRNA and rProtein buried within the deepest branching points in the tree of life.
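
The distinction between conserved, variable and signature positions described in this abstract can be illustrated with a minimal sketch. It is not the TwinCons scoring method, which the abstract describes only as a per-position transformation 'cost'; here a simple dominant-residue rule stands in for it. The function names, the 0.8 threshold, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter


def dominant(column, threshold):
    """Return the most common residue if its frequency meets the threshold, else None."""
    residue, count = Counter(column).most_common(1)[0]
    return residue if count / len(column) >= threshold else None


def classify_positions(group_a, group_b, threshold=0.8):
    """Label each alignment column as 'conserved', 'signature' or 'variable'.

    group_a and group_b are lists of equal-length aligned sequences.
    This is a hypothetical stand-in rule, not the TwinCons algorithm.
    """
    labels = []
    for i in range(len(group_a[0])):
        dom_a = dominant([seq[i] for seq in group_a], threshold)
        dom_b = dominant([seq[i] for seq in group_b], threshold)
        if dom_a is None or dom_b is None:
            labels.append("variable")    # at least one group is not conserved
        elif dom_a == dom_b:
            labels.append("conserved")   # same residue dominates both groups
        else:
            labels.append("signature")   # conserved within groups, differs between
    return labels


# Toy alignment: three sequences per group, three columns.
labels = classify_positions(["ACD", "ACE", "ACF"], ["AGD", "AGK", "AGL"])
```

In this toy input the first column is shared by both groups, the second is fixed within each group but differs between them, and the third varies within each group.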


Subject(s)
Conserved Sequence/genetics , Deep Learning , RNA, Ribosomal/genetics , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Archaeal Proteins/chemistry , Archaeal Proteins/genetics , Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Evolution, Molecular , Metagenomics
4.
Mol Biol Evol ; 38(11): 5134-5143, 2021 Oct 27.
Article in English | MEDLINE | ID: mdl-34383917

ABSTRACT

SH3 and OB are the simplest, oldest, and most common protein domains within the translation system. SH3 and OB domains are β-barrels that are structurally similar but are topologically distinct. To transform an OB domain to an SH3 domain, β-strands must be permuted in a multistep and evolutionarily implausible mechanism. Here, we explored relationships between SH3 and OB domains of ribosomal proteins, initiation factors, and elongation factors using a combined sequence- and structure-based approach. We detect a common core of SH3 and OB domains, as a region of significant structure and sequence similarity. The common core contains four β-strands and a loop, but omits the fifth β-strand, which is variable and is absent from some OB and SH3 domain proteins. The structure of the common core immediately suggests a simple permutation mechanism for interconversion between SH3 and OB domains, which appear to share an ancestor. The OB domain was formed by duplication and adaptation of the SH3 domain core, or vice versa, in a simple and probable transformation. By employing the folding algorithm AlphaFold2, we demonstrated that an ancestral reconstruction of a permuted SH3 sequence folds into an OB structure, and an ancestral reconstruction of a permuted OB sequence folds into an SH3 structure. The tandem SH3 and OB domains in the universal ribosomal protein uL2 share a common ancestor, suggesting that the divergence of these two domains occurred before the last universal common ancestor.


Subject(s)
Ribosomal Proteins , src Homology Domains , Amino Acid Sequence , Models, Molecular , Ribosomal Proteins/genetics , Sequence Alignment , src Homology Domains/genetics
5.
Nucleic Acids Res ; 49(W1): W578-W588, 2021 Jul 02.
Article in English | MEDLINE | ID: mdl-33999189

ABSTRACT

ProteoVision is a web server designed to explore protein structure and evolution through simultaneous visualization of multiple sequence alignments, topology diagrams and 3D structures. Starting with a multiple sequence alignment, ProteoVision computes conservation scores and a variety of physicochemical properties and simultaneously maps and visualizes alignments and other data on multiple levels of representation. The web server calculates and displays frequencies of amino acids. ProteoVision is optimized for ribosomal proteins but is applicable to analysis of any protein. ProteoVision handles internally generated and user uploaded alignments and connects them with a selected structure, found in the PDB or uploaded by the user. It can generate de novo topology diagrams from three-dimensional structures. All displayed data is interactive and can be saved in various formats as publication quality images or external datasets or PyMol Scripts. ProteoVision enables detailed study of protein fragments defined by Evolutionary Classification of protein Domains (ECOD) classification. ProteoVision is available at http://proteovision.chemistry.gatech.edu/.
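
The abstract mentions per-column amino acid frequencies and conservation scores without naming the scoring scheme. As a rough sketch of what such a computation can look like, the example below uses Shannon entropy as a stand-in conservation measure; that choice, the function names, and the toy alignment are assumptions and do not describe ProteoVision's implementation.

```python
import math
from collections import Counter


def column_frequencies(alignment, index):
    """Return residue frequencies for one alignment column, ignoring gaps ('-')."""
    column = [seq[index] for seq in alignment if seq[index] != "-"]
    counts = Counter(column)
    total = len(column)
    return {residue: n / total for residue, n in counts.items()}


def shannon_entropy(freqs):
    """Entropy in bits: 0 for a fully conserved column, higher for variable ones."""
    return sum(p * math.log2(1 / p) for p in freqs.values())


# Toy alignment of three sequences and two columns (hypothetical data).
alignment = ["AC", "AD", "A-"]
freqs_first = column_frequencies(alignment, 0)
freqs_second = column_frequencies(alignment, 1)
entropy_first = shannon_entropy(freqs_first)
entropy_second = shannon_entropy(freqs_second)
```

Lower entropy means higher conservation, so a tool built on this measure would typically map scores onto a color scale before projecting them onto a structure.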


Subject(s)
Ribosomal Proteins/chemistry , Software , Acetolactate Synthase/chemistry , Bacterial Proteins/chemistry , Internet , Models, Molecular , Peptide Elongation Factor Tu/chemistry , Protein Conformation , Sequence Alignment
6.
Proteins ; 88(9): 1169-1179, 2020 Sep.
Article in English | MEDLINE | ID: mdl-32112578

ABSTRACT

Internal structure similarity in proteins can be observed at the domain and subdomain levels. From an evolutionary perspective, structurally similar elements may arise divergently by gene duplication and fusion events but may also be the product of convergent evolution under physicochemical constraints. The characterization of proteins that contain repeated structural elements has implications for many fields of protein science including protein domain evolution, structure classification, structure prediction, and protein engineering. FiRES (Find Repeated Elements in Structure) is an algorithm that relies on a topology-independent structure alignment method to identify repeating elements in protein structure. FiRES was tested against two hand curated databases of protein repeats: MALIDUP, for very divergent duplicated domains; and RepeatsDB for short tandem repeats. The performance of FiRES was compared to that of lalign, RADAR, HHrepID, CE-symm, ReUPred, and Swelfe. FiRES was the method that most accurately detected proteins either with duplicated domains (accuracy = 0.86) or with multiple repeated units (accuracy = 0.92). FiRES is a new methodology for the discovery of proteins containing structurally similar elements. The FiRES web server is publicly available at http://fires.ifc.unam.mx. The scripts, results, and benchmarks from this study can be downloaded from https://github.com/Claualvarez/fires.


Subject(s)
Algorithms , Proteins/chemistry , Software , Structural Homology, Protein , Amino Acid Sequence , Benchmarking , Databases, Protein , Evolution, Molecular , Gene Duplication , Protein Structure, Secondary
7.
Protein Sci ; 27(4): 848-860, 2018 Apr.
Article in English | MEDLINE | ID: mdl-29330894

ABSTRACT

Hemerythrin-like proteins have generally been studied for their ability to reversibly bind oxygen through their binuclear nonheme iron centers. However, in recent years, it has become increasingly evident that some members of the hemerythrin-like superfamily also participate in many other biological processes. For instance, the binuclear nonheme iron site of YtfE, a hemerythrin-like protein involved in the repair of iron centers in Escherichia coli, catalyzes the reduction of nitric oxide to nitrous oxide, and the human F-box/LRR-repeat protein 5, which contains a hemerythrin-like domain, is involved in intracellular iron homeostasis. Furthermore, structural data on hemerythrin-like domains from two proteins of unknown function, PF0695 from Pyrococcus furiosus and NMB1532 from Neisseria meningitidis, show that the cation-binding sites, typical of hemerythrin, can be absent or be occupied by metal ions other than iron. To systematically investigate this functional and structural diversity of the hemerythrin-like superfamily, we have collected hemerythrin-like sequences from a database comprising fully sequenced proteomes and generated a cluster map based on their all-against-all pairwise sequence similarity. Our results show that the hemerythrin-like superfamily comprises a large number of protein families which can be classified into three broad groups on the basis of their cation-coordinating residues: (a) signal-transduction and oxygen-carrier hemerythrins (H-HxxxE-HxxxH-HxxxxD); (b) hemerythrin-like (H-HxxxE-H-HxxxE); and, (c) metazoan F-box proteins (H-HExxE-H-HxxxE). Interestingly, all but two hemerythrin-like families exhibit internal sequence and structural symmetry, suggesting that a duplication event may have led to the origin of the hemerythrin domain.
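
The three cation-coordinating residue patterns listed in this abstract can be turned into a simple sequence screen. The sketch below assumes that `x` stands for any single residue and that each hyphen stands for a spacer of arbitrary length; the abstract states neither convention, so both are assumptions, as are the function names and the synthetic test sequence.

```python
import re

# Patterns copied from the abstract; the group names are shortened labels.
MOTIFS = {
    "oxygen-carrier/signal-transduction": "H-HxxxE-HxxxH-HxxxxD",
    "hemerythrin-like": "H-HxxxE-H-HxxxE",
    "metazoan F-box": "H-HExxE-H-HxxxE",
}


def motif_to_regex(motif):
    """Convert a motif string to a regex (assumed: x = any residue, '-' = any spacer)."""
    parts = [part.replace("x", ".") for part in motif.split("-")]
    return re.compile(".*?".join(parts))


def matching_groups(sequence):
    """Return the names of all motif groups whose pattern occurs in the sequence."""
    return [
        name
        for name, motif in MOTIFS.items()
        if motif_to_regex(motif).search(sequence)
    ]


# Synthetic sequence built to contain H, HxxxE, HxxxH and HxxxxD in order.
hits = matching_groups("AAHAAAHGGGEAAAHGGGHAAAHGGGGDAA")
```

A pattern match of this kind only flags candidates; classifying real sequences would still require alignment or clustering, as the study itself did.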


Subject(s)
Evolution, Molecular , Hemerythrin/chemistry , Nonheme Iron Proteins/chemistry , Nonheme Iron Proteins/metabolism , Amino Acid Motifs , Cluster Analysis , Hemerythrin/metabolism , Oxygen/metabolism , Phylogeny , Protein Domains , Structural Homology, Protein
8.
PLoS One ; 11(6): e0157904, 2016.
Article in English | MEDLINE | ID: mdl-27336621

ABSTRACT

BACKGROUND: The evolution of oxygenic photosynthesis during Precambrian times entailed the diversification of strategies minimizing reactive oxygen species-associated damage. Four families of oxygen-carrier proteins (hemoglobin, hemerythrin and the two non-homologous families of arthropodan and molluscan hemocyanins) are known to have evolved independently the capacity to bind oxygen reversibly, providing cells with strategies to cope with the evolutionary pressure of oxygen accumulation. Oxygen-binding hemerythrin was first studied in marine invertebrates but further research has made it clear that it is present in the three domains of life, strongly suggesting that its origin predated the emergence of eukaryotes. RESULTS: Oxygen-binding hemerythrins are a monophyletic sub-group of the hemerythrin/HHE (histidine, histidine, glutamic acid) cation-binding domain. Oxygen-binding hemerythrin homologues were unambiguously identified in 367/2236 bacterial, 21/150 archaeal and 4/135 eukaryotic genomes. Overall, oxygen-binding hemerythrin homologues were found in the same proportion as single-domain and as long protein sequences. The associated functions of protein domains in long hemerythrin sequences can be classified in three major groups: signal transduction, phosphorelay response regulation, and protein binding. This suggests that in many organisms the reversible oxygen-binding capacity was incorporated in signaling pathways. A maximum-likelihood tree of oxygen-binding hemerythrin homologues revealed a complex evolutionary history in which lateral gene transfer, duplications and gene losses appear to have played an important role. CONCLUSIONS: Hemerythrin is an ancient protein domain with a complex evolutionary history. The distinctive iron-binding coordination site of oxygen-binding hemerythrins evolved first in prokaryotes, very likely prior to the divergence of Firmicutes and Proteobacteria, and spread into many bacterial, archaeal and eukaryotic species. The later evolution of the oxygen-binding hemerythrin domain in both prokaryotes and eukaryotes led to a wide variety of functions, ranging from protection against oxidative damage in anaerobic and microaerophilic organisms, to oxygen supply to particular enzymes and pathways in aerobic and facultative species.


Subject(s)
Evolution, Molecular , Hemerythrin/genetics , Hemerythrin/metabolism , Oxygen/metabolism , Protein Interaction Domains and Motifs , Amino Acid Sequence , Cluster Analysis , Gene Dosage , Genome, Bacterial , Hemerythrin/chemistry , Hemerythrin/classification , Phylogeny , Protein Binding
9.
Orig Life Evol Biosph ; 43(4-5): 363-75, 2013 Oct.
Article in English | MEDLINE | ID: mdl-24013929

ABSTRACT

The absence of the hydrophobic norvaline and norleucine in the inventory of protein amino acids is readdressed. The well-documented intracellular accumulation of these two amino acids results from the low substrate specificity of the branched-chain amino acid biosynthetic enzymes that act on a number of related α-ketoacids. The lack of absolute substrate specificity of leucyl-tRNA synthetase leads to a mischarged norvalyl-tRNA(Leu) that evades the translational proofreading activities and produces norvaline-containing proteins (cf. Apostol et al. J Biol Chem 272:28980-28988, 1997). A similar situation explains the presence of minute but detectable amounts of norleucine in place of methionine. Since with few exceptions both leucine and methionine are rarely found in the catalytic sites of most enzymes, their substitution by norvaline and norleucine, respectively, would not have been strongly hindered in small structurally simple catalytic polypeptides during the early stages of biological evolution. The report that down-shifts of free oxygen lead to high levels of intracellular accumulation of pyruvate and the subsequent biosynthesis of norvaline (Soini et al. Microb Cell Factories 7:30, 2008) demonstrates the biochemical and metabolic consequences of the development of a highly oxidizing environment. The results discussed here also suggest that a broader definition of biomarkers in the search for extraterrestrial life may be required.


Subject(s)
Evolution, Chemical , Norleucine/chemistry , Valine/analogs & derivatives , Origin of Life , Valine/chemistry