1.
BMC Evol Biol ; 1: 8, 2001 Oct 20.
Article in English | MEDLINE | ID: mdl-11734060

ABSTRACT

BACKGROUND: The availability of multiple complete genome sequences from diverse taxa prompts the development of new phylogenetic approaches, which attempt to incorporate information derived from comparative analysis of complete gene sets or large subsets thereof. Such attempts are particularly relevant because of the major role of horizontal gene transfer and lineage-specific gene loss, at least in the evolution of prokaryotes. RESULTS: Five largely independent approaches were employed to construct trees for completely sequenced bacterial and archaeal genomes: i) presence-absence of genomes in clusters of orthologous genes; ii) conservation of local gene order (gene pairs) among prokaryotic genomes; iii) parameters of identity distribution for probable orthologs; iv) analysis of concatenated alignments of ribosomal proteins; v) comparison of trees constructed for multiple protein families. All constructed trees support the separation of the two primary prokaryotic domains, bacteria and archaea, as well as some terminal bifurcations within the bacterial and archaeal domains. Beyond these obvious groupings, the trees made with different methods appeared to differ substantially in terms of the relative contributions of phylogenetic relationships and similarities in gene repertoires caused by similar life styles and horizontal gene transfer to the tree topology. The trees based on presence-absence of genomes in orthologous clusters and the trees based on conserved gene pairs appear to be strongly affected by gene loss and horizontal gene transfer. The trees based on identity distributions for orthologs and particularly the tree made of concatenated ribosomal protein sequences seemed to carry a stronger phylogenetic signal. The latter tree supported three potential high-level bacterial clades: i) Chlamydia-Spirochetes, ii) Thermotogales-Aquificales (bacterial hyperthermophiles), and iii) Actinomycetes-Deinococcales-Cyanobacteria.
The latter group also appeared to join the low-GC Gram-positive bacteria at a deeper tree node. These new groupings of bacteria were supported by the analysis of alternative topologies in the concatenated ribosomal protein tree using the Kishino-Hasegawa test and by a census of the topologies of 132 individual groups of orthologous proteins. Additionally, the results of this analysis put into question the sister-group relationship between the two major archaeal groups, Euryarchaeota and Crenarchaeota, and suggest instead that Euryarchaeota might be a paraphyletic group with respect to Crenarchaeota. CONCLUSIONS: We conclude that, the extensive horizontal gene flow and lineage-specific gene loss notwithstanding, extension of phylogenetic analysis to the genome scale has the potential of uncovering deep evolutionary relationships between prokaryotic lineages.
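
As an illustration of approach (i) in the abstract above, the sketch below turns presence-absence profiles of genomes in orthologous clusters into pairwise distances that a tree-building method could then consume. The Jaccard distance, the genome labels, and the cluster identifiers are assumptions made for this example; the abstract does not state which distance measure the authors used.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Distance between two genomes given the sets of clusters each one occupies."""
    union = a | b
    if not union:
        # Two empty profiles are treated as identical (assumption).
        return 0.0
    return 1.0 - len(a & b) / len(union)


def distance_matrix(profiles):
    """Map every ordered pair of genome names to a presence-absence distance."""
    names = sorted(profiles)
    dist = {(name, name): 0.0 for name in names}
    for x, y in combinations(names, 2):
        d = jaccard_distance(profiles[x], profiles[y])
        dist[(x, y)] = d
        dist[(y, x)] = d
    return dist


if __name__ == "__main__":
    # Hypothetical genomes and cluster identifiers, for illustration only.
    toy_profiles = {
        "genome_A": {"cluster1", "cluster2", "cluster3"},
        "genome_B": {"cluster2", "cluster3", "cluster4"},
        "genome_C": {"cluster5"},
    }
    for pair, d in sorted(distance_matrix(toy_profiles).items()):
        print(pair, round(d, 3))
```

A distance of this kind reflects shared gene content only, which is consistent with the abstract's observation that such trees are strongly affected by gene loss and horizontal gene transfer.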


Subject(s)
Bacteria/classification , Bacteria/genetics , Molecular Evolution , Bacterial Genome , Genomics/methods , Phylogeny , Conserved Sequence/genetics , Gene Order/genetics , Horizontal Gene Transfer , Archaeal Genes/genetics , Bacterial Genes/genetics , Archaeal Genome , Likelihood Functions , Prokaryotic Cells/metabolism , Ribosomal Proteins/genetics , Sequence Alignment , Species Specificity
2.
J Struct Biol ; 134(2-3): 167-85, 2001.
Article in English | MEDLINE | ID: mdl-11551177

ABSTRACT

Typically, protein spatial structures are more conserved in evolution than amino acid sequences. However, the recent explosion of sequence and structure information accompanied by the development of powerful computational methods led to the accumulation of examples of homologous proteins with globally distinct structures. Significant sequence conservation, local structural resemblance, and functional similarity strongly indicate evolutionary relationships between these proteins despite pronounced structural differences at the fold level. Several mechanisms such as insertions/deletions/substitutions, circular permutations, and rearrangements in beta-sheet topologies account for the majority of detected structural irregularities. The existence of evolutionarily related proteins that possess different folds brings new challenges to the homology modeling techniques and the structure classification strategies and offers new opportunities for protein design in experimental studies.


Subject(s)
Molecular Evolution , Protein Folding , Proteins/chemistry , Amino Acid Sequence , Animals , Humans , Molecular Sequence Data , Protein Conformation , Tertiary Protein Structure/genetics , Amino Acid Sequence Homology
3.
Bioinformatics ; 17(8): 700-12, 2001 Aug.
Article in English | MEDLINE | ID: mdl-11524371

ABSTRACT

MOTIVATION: Amino acid sequence alignments are widely used in the analysis of protein structure, function and evolutionary relationships. Proteins within a superfamily usually share the same fold and possess related functions. These structural and functional constraints are reflected in the alignment conservation patterns. Positions of functional and/or structural importance tend to be more conserved. Conserved positions are usually clustered in distinct motifs surrounded by sequence segments of low conservation. Poorly conserved regions might also arise from the imperfections in multiple alignment algorithms and thus indicate possible alignment errors. Quantification of conservation by attributing a conservation index to each aligned position makes motif detection more convenient. Mapping these conservation indices onto a protein spatial structure helps to visualize spatial conservation features of the molecule and to predict functionally and/or structurally important sites. Analysis of conservation indices could be a useful tool in detection of potentially misaligned regions and will aid in improvement of multiple alignments. RESULTS: We developed a program to calculate a conservation index at each position in a multiple sequence alignment using several methods. Namely, amino acid frequencies at each position are estimated and the conservation index is calculated from these frequencies. We utilize both unweighted frequencies and frequencies weighted using two different strategies. Three conceptually different approaches (entropy-based, variance-based and matrix score-based) are implemented in the algorithm to define the conservation index. Calculating conservation indices for 35522 positions in 284 alignments from SMART database we demonstrate that different methods result in highly correlated (correlation coefficient more than 0.85) conservation indices. 
Conservation indices show statistically significant correlation between sequentially adjacent positions i and i + j, where j < 13, and averaging of the indices over the window of three positions is optimal for motif detection. Positions with gaps display substantially lower conservation properties. We compare conservation properties of the SMART alignments or FSSP structural alignments to those of the ClustalW alignments. The results suggest that conservation indices should be a valuable tool of alignment quality assessment and might be used as an objective function for refinement of multiple alignments. AVAILABILITY: The C code of the AL2CO program and its pre-compiled versions for several platforms as well as the details of the analysis are freely available at ftp://iole.swmed.edu/pub/al2co/.
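
The entropy-based variant of the conservation index and the three-position window averaging described above can be sketched as follows. This is a minimal illustration using unweighted frequencies, not the AL2CO implementation: the rescaling to a 0-1 range, the treatment of gaps, and the function names are assumptions.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def entropy_conservation(alignment):
    """Return one score per alignment column: 1.0 is invariant, lower is more variable.

    `alignment` is a list of equal-length aligned sequences. Gap characters and
    any other non-standard symbols are skipped when estimating frequencies
    (an assumption of this sketch).
    """
    max_entropy = math.log(len(AMINO_ACIDS))
    scores = []
    for col in range(len(alignment[0])):
        residues = [seq[col] for seq in alignment if seq[col] in AMINO_ACIDS]
        if not residues:
            scores.append(0.0)  # all-gap column: treated as unconserved (assumption)
            continue
        total = len(residues)
        counts = Counter(residues)
        entropy = -sum((n / total) * math.log(n / total) for n in counts.values())
        scores.append(1.0 - entropy / max_entropy)
    return scores


def window_average(scores, window=3):
    """Average each score with its neighbours; the window is truncated at the ends."""
    half = window // 2
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - half)
        hi = min(len(scores), i + half + 1)
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    return smoothed


if __name__ == "__main__":
    # Toy alignment, for illustration only.
    toy = ["ACD-", "ACEK", "ACFK", "AGGR"]
    raw = entropy_conservation(toy)
    print([round(x, 3) for x in raw])
    print([round(x, 3) for x in window_average(raw)])
```

The default window of three positions follows the value the abstract reports as optimal for motif detection; the variance-based and matrix score-based indices, and the two weighting strategies, are not covered here.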


Subject(s)
Proteins/chemistry , Proteins/genetics , Sequence Alignment/statistics & numerical data , Software , Algorithms , Amino Acid Sequence , Computational Biology , Conserved Sequence , Entropy , Molecular Models , Protein Conformation , Quality Control , Sequence Alignment/standards
4.
J Biol Chem ; 276(45): 42099-107, 2001 Nov 09.
Article in English | MEDLINE | ID: mdl-11527962

ABSTRACT

gamma-Glutamylcysteine synthetase (gamma-GCS) catalyzes the first step in the de novo biosynthesis of glutathione. In trypanosomes, glutathione is conjugated to spermidine to form a unique cofactor termed trypanothione, an essential cofactor for the maintenance of redox balance in the cell. Using extensive similarity searches and sequence motif analysis we detected homology between gamma-GCS and glutamine synthetase (GS), allowing these proteins to be unified into a superfamily of carboxylate-amine/ammonia ligases. The structure of gamma-GCS, which was previously poorly understood, was modeled using the known structure of GS. Two metal-binding sites, each ligated by three conserved active site residues (n1: Glu-55, Glu-93, Glu-100; and n2: Glu-53, Gln-321, and Glu-489), are predicted to form the catalytic center of the active site, where the n1 site is expected to bind free metal and the n2 site to interact with MgATP. To elucidate the roles of the metals and their ligands in catalysis, these six residues were mutated to alanine in the Trypanosoma brucei enzyme. All mutations caused a substantial loss of activity. Most notably, E93A was able to catalyze the L-Glu-dependent ATP hydrolysis but not the peptide bond ligation, suggesting that the n1 metal plays an important role in positioning L-Glu for the reaction chemistry. The apparent K(m) values for ATP were increased for both the E489A and Q321A mutant enzymes, consistent with a role for the n2 metal in ATP binding and phosphoryl transfer. Furthermore, the apparent K(d) values for activation of E489A and Q321A by free Mg(2+) increased. Finally, substitution of Mn(2+) for Mg(2+) in the reaction rescued the catalytic deficits caused by both mutations, demonstrating that the nature of the metal ligands plays an important role in metal specificity.


Subject(s)
Glutamate-Cysteine Ligase/chemistry , Magnesium/pharmacology , Manganese/pharmacology , Amino Acid Sequence , Binding Sites , Glutamate-Ammonia Ligase/chemistry , Kinetics , Molecular Sequence Data
5.
Trends Biochem Sci ; 26(5): 275-7, 2001 May.
Article in English | MEDLINE | ID: mdl-11343912

ABSTRACT

In this article, a novel, large and diverse superfamily of putative membrane-bound proteins that includes the type II CAAX prenyl endopeptidases is described. The majority of the members of this superfamily are hypothetical proteins from bacteria and plants. Analysis of the conserved motifs, combined with available experimental data, suggests that these proteins are putative metal-dependent proteases that are potentially involved in protein and/or peptide modification and secretion.


Subject(s)
Cell Membrane/enzymology , Endopeptidases/chemistry , Endopeptidases/classification , Metalloendopeptidases/chemistry , Metalloendopeptidases/classification , Amino Acid Motifs , Amino Acid Sequence , Animals , Humans , Molecular Sequence Data , Multigene Family , Tertiary Protein Structure , Amino Acid Sequence Homology
6.
Nucleic Acids Res ; 29(8): 1703-14, 2001 Apr 15.
Article in English | MEDLINE | ID: mdl-11292843

ABSTRACT

Detection of similarity is particularly difficult for small proteins and thus connections between many of them remain unnoticed. Structure and sequence analysis of several metal-binding proteins reveals unexpected similarities in structural domains classified as different protein folds in SCOP and suggests unification of seven folds that belong to two protein classes. The common motif, termed treble clef finger in this study, forms the protein structural core and is 25-45 residues long. The treble clef motif is assembled around the central zinc ion and consists of a zinc knuckle, loop, beta-hairpin and an alpha-helix. The knuckle and the first turn of the helix each incorporate two zinc ligands. Treble clef domains constitute the core of many structures such as ribosomal proteins L24E and S14, RING fingers, protein kinase cysteine-rich domains, nuclear receptor-like fingers, LIM domains, phosphatidylinositol-3-phosphate-binding domains and His-Me finger endonucleases. The treble clef finger is a uniquely versatile motif adaptable for various functions. This small domain with a 25 residue structural core can accommodate eight different metal-binding sites and can have many types of functions from binding of nucleic acids, proteins and small molecules, to catalysis of phosphodiester bond hydrolysis. Treble clef motifs are frequently incorporated in larger structures or occur in doublets. Present analysis suggests that the treble clef motif defines a distinct structural fold found in proteins with diverse functional properties and forms one of the major zinc finger groups.


Subject(s)
Proteins/chemistry , Proteins/metabolism , Zinc Fingers , Zinc/metabolism , ADP-Ribosylation Factors/chemistry , ADP-Ribosylation Factors/metabolism , Amino Acid Sequence , Binding Sites , Cysteine/metabolism , DNA-Binding Proteins/chemistry , DNA-Binding Proteins/metabolism , Endonucleases/chemistry , Endonucleases/metabolism , GTPase-Activating Proteins/chemistry , GTPase-Activating Proteins/metabolism , Molecular Models , Molecular Sequence Data , Phosphatidylinositol Phosphates/metabolism , Protein Folding , Protein Kinases/chemistry , Protein Kinases/metabolism , Secondary Protein Structure , Tertiary Protein Structure , Cytoplasmic and Nuclear Receptors/chemistry , Ribosomal Proteins/chemistry , Ribosomal Proteins/metabolism , Sequence Alignment , Smad Proteins , Structure-Activity Relationship , Trans-Activators/chemistry , Trans-Activators/metabolism
7.
J Mol Biol ; 307(1): 31-7, 2001 Mar 16.
Article in English | MEDLINE | ID: mdl-11243801

ABSTRACT

Smad proteins are eukaryotic transcription regulators in the TGF-beta signaling cascade. Using a combination of sequence and structure-based analyses, we argue that the MH1 domain of Smad is homologous to the diverse His-Me finger endonuclease family enzymes. The similarity is particularly extensive with the I-PpoI endonuclease. In addition to the global fold similarities, both proteins possess a conserved motif of three cysteine residues and one histidine residue which form a zinc-binding site in I-PpoI. Sequence and structure conservation in the motif region strongly suggest that the MH1 domain may also incorporate a metal ion in its structural core. MH1 of Smad3 and I-PpoI exhibit a similar nucleic acid binding mode and interact with the DNA major groove through an antiparallel beta-sheet. MH1 is an example of a transcription regulator derived from an ancient enzymatic domain that lost its catalytic activity but retained DNA-binding sites.


Subject(s)
DNA-Binding Proteins/chemistry , Endodeoxyribonucleases/chemistry , Trans-Activators/chemistry , Amino Acid Sequence , Catalytic Domain , DNA-Binding Proteins/metabolism , Endodeoxyribonucleases/metabolism , Humans , Molecular Models , Protein Folding , Tertiary Protein Structure , Amino Acid Sequence Homology , Smad Proteins , Smad3 Protein , Trans-Activators/metabolism , Zinc/metabolism
8.
Nucleic Acids Res ; 29(3): 638-43, 2001 Feb 01.
Article in English | MEDLINE | ID: mdl-11160884

ABSTRACT

The K homology (KH) module is a widespread RNA-binding motif that has been detected by sequence similarity searches in such proteins as heterogeneous nuclear ribonucleoprotein K (hnRNP K) and ribosomal protein S3. Analysis of spatial structures of KH domains in hnRNP K and S3 reveals that they are topologically dissimilar and thus belong to different protein folds. Thus KH motif proteins provide a rare example of protein domains that share significant sequence similarity in the motif regions but possess globally distinct structures. The two distinct topologies might have arisen from an ancestral KH motif protein by N- and C-terminal extensions, or one of the existing topologies may have evolved from the other by extension, displacement and deletion. C-terminal extension (deletion) requires beta-sheet rearrangement through the insertion (removal) of a beta-strand in a manner similar to that observed in the serine protease inhibitors (serpins). The current analysis offers a new look at how proteins can change fold in the course of evolution.


Subject(s)
Carrier Proteins , Tertiary Protein Structure , Ribonucleoproteins/genetics , Ribosomal Proteins/genetics , Amino Acid Motifs , Amino Acid Sequence , Animals , Factual Databases , Molecular Evolution , Heterogeneous-Nuclear Ribonucleoprotein K , Humans , Molecular Sequence Data , Protein Conformation , Protein Folding , RNA-Binding Proteins/chemistry , RNA-Binding Proteins/genetics , Ribonucleoproteins/chemistry , Ribosomal Proteins/chemistry , Sequence Alignment , Amino Acid Sequence Homology
9.
Proteins ; 42(2): 210-6, 2001 Feb 01.
Article in English | MEDLINE | ID: mdl-11119645

ABSTRACT

The GGDEF domain is detected in many prokaryotic proteins, most of which are of unknown function. Several bacteria carry 12-22 different GGDEF homologues in their genomes. Conducting extensive profile-based searches, we detect statistically supported sequence similarity between GGDEF domain and adenylyl cyclase catalytic domain. From this homology, we deduce that the prokaryotic GGDEF domain is a regulatory enzyme involved in nucleotide cyclization, with the fold similar to that of the eukaryotic cyclase catalytic domain. This prediction correlates with the functional information available on two GGDEF-containing proteins, namely diguanylate cyclase and phosphodiesterase A of Acetobacter xylinum, both of which regulate the turnover of cyclic diguanosine monophosphate. Domain architecture analysis shows that GGDEF is typically present in multidomain proteins containing regulatory domains of signaling pathways or protein-protein interaction modules. Evolutionary tree analysis indicates that GGDEF/cyclase superfamily forms a large diversified cluster of orthologous proteins present in bacteria, archaea, and eukaryotes. Proteins 2001;42:210-216.


Subject(s)
Acetobacter/enzymology , Adenylyl Cyclases/chemistry , Amino Acid Motifs , Amino Acid Sequence , Animals , Escherichia coli Proteins , Molecular Models , Molecular Sequence Data , Phosphorus-Oxygen Lyases/chemistry , Protein Conformation , Tertiary Protein Structure , Saccharomyces cerevisiae/enzymology , Amino Acid Sequence Homology , Signal Transduction
10.
Proteins ; 42(2): 230-6, 2001 Feb 01.
Article in English | MEDLINE | ID: mdl-11119647

ABSTRACT

Discovering distant evolutionary relationships between proteins requires detecting subtle similarities. Here we use a combination of sequence and structure analysis to show that the C-terminal domain of Escherichia coli HPII catalase with available spatial structure is a divergent member of the type I glutamine amidotransferase (GAT) superfamily. GAT-containing proteins include many biosynthetic enzymes such as E. coli carbamoyl phosphate synthetase and anthranilate synthase. Typical GAT domains have Rossmann fold-like topology and possess a catalytic triad similar to that of proteases. The C-terminal domain of HPII catalase has the GAT Rossmann fold but lacks the triad and therefore loses enzymatic activity. In addition, we detect significant sequence similarity between thiJ domains, some of which are known to have protease activity, and typical GAT proteins. Evolutionary tree analysis of the entire GAT superfamily indicates that the HPII catalase is more closely related to thiJ domains than to classical GAT domains and is likely to have evolved from a thiJ-like protein. This work illustrates the strength of sequence-based profile analysis techniques coupled with structural superpositions in developing an evolutionarily relevant classification of protein structures. Proteins 2001;42:230-236.


Subject(s)
Anthranilate Synthase , Catalase/chemistry , Amino Acid Motifs , Amino Acid Sequence , Escherichia coli/enzymology , Molecular Sequence Data , Nitrogenous Group Transferases/chemistry , Phylogeny , Protein Conformation , Amino Acid Sequence Homology
11.
J Mol Biol ; 314(3): 365-74, 2001 Nov 30.
Article in English | MEDLINE | ID: mdl-11846551

ABSTRACT

The O-linked GlcNAc transferases (OGTs) are a recently characterized group of largely eukaryotic enzymes that add a single beta-N-acetylglucosamine moiety to specific serine or threonine hydroxyls. In humans, this process may be part of a sugar regulation mechanism or cellular signaling pathway that is involved in many important diseases, such as diabetes, cancer, and neurodegeneration. However, no structural information about the human OGT exists, except for the identification of tetratricopeptide repeats (TPR) at the N terminus. The locations of substrate binding sites are unknown and the structural basis for this enzyme's function is not clear. Here, remote homology is reported between the OGTs and a large group of diverse sugar processing enzymes, including proteins with known structure such as glycogen phosphorylase, UDP-GlcNAc 2-epimerase, and the glycosyl transferase MurG. This relationship, in conjunction with amino acid similarity spanning the entire length of the sequence, implies that the fold of the human OGT consists of two Rossmann-like domains C-terminal to the TPR region. A conserved motif in the second Rossmann domain points to the UDP-GlcNAc donor binding site. This conclusion is supported by a combination of statistically significant PSI-BLAST hits, consensus secondary structure predictions, and a fold recognition hit to MurG. Additionally, iterative PSI-BLAST database searches reveal that proteins homologous to the OGTs form a large and diverse superfamily that is termed GPGTF (glycogen phosphorylase/glycosyl transferase). Up to one-third of the 51 functional families in the CAZY database, a glycosyl transferase classification scheme based on catalytic residue and sequence homology considerations, can be unified through this common predicted fold. 
GPGTF homologs constitute a substantial fraction of known proteins: 0.4% of all non-redundant sequences and about 1% of proteins in the Escherichia coli genome are found to belong to the GPGTF superfamily.


Subject(s)
Bacterial Outer Membrane Proteins , Escherichia coli Proteins , Glycogen Phosphorylase/chemistry , N-Acetylglucosaminyltransferases/chemistry , Amino Acid Sequence Homology , Amino Acid Sequence , Bacterial Proteins/chemistry , Bacterial Proteins/metabolism , Binding Sites , Carbohydrate Epimerases/chemistry , Carbohydrate Epimerases/metabolism , Computational Biology , Conserved Sequence , Protein Databases , Glycogen Phosphorylase/metabolism , Humans , Molecular Models , Molecular Sequence Data , Multigene Family , N-Acetylglucosaminyltransferases/metabolism , Protein Conformation , Protein Folding , Saccharomyces cerevisiae Proteins/chemistry , Saccharomyces cerevisiae Proteins/metabolism , Sequence Alignment
12.
Science ; 290(5497): 1771-5, 2000 Dec 01.
Article in English | MEDLINE | ID: mdl-11099417

ABSTRACT

In healthy individuals, acute changes in cholesterol intake produce modest changes in plasma cholesterol levels. A striking exception occurs in sitosterolemia, an autosomal recessive disorder characterized by increased intestinal absorption and decreased biliary excretion of dietary sterols, hypercholesterolemia, and premature coronary atherosclerosis. We identified seven different mutations in two adjacent, oppositely oriented genes that encode new members of the adenosine triphosphate (ATP)-binding cassette (ABC) transporter family (six mutations in ABCG8 and one in ABCG5) in nine patients with sitosterolemia. The two genes are expressed at highest levels in liver and intestine and, in mice, cholesterol feeding up-regulates expressions of both genes. These data suggest that ABCG5 and ABCG8 normally cooperate to limit intestinal absorption and to promote biliary excretion of sterols, and that mutated forms of these transporters predispose to sterol accumulation and atherosclerosis.


Subject(s)
ATP-Binding Cassette Transporters/genetics , Dietary Cholesterol/metabolism , Intestinal Absorption , Inborn Errors of Lipid Metabolism/genetics , Lipoproteins/genetics , Sitosterols/blood , ATP Binding Cassette Transporter, Subfamily G, Member 5 , ATP-Binding Cassette Transporters/chemistry , ATP-Binding Cassette Transporters/metabolism , Amino Acid Sequence , Animals , Bile/metabolism , Cholesterol/blood , Dietary Cholesterol/administration & dosage , Chromosome Mapping , Human Chromosome Pair 2 , Codon , DNA-Binding Proteins , Expressed Sequence Tags , Gene Expression Regulation , Humans , Intestinal Mucosa/metabolism , Inborn Errors of Lipid Metabolism/metabolism , Lipoproteins/chemistry , Lipoproteins/metabolism , Liver/metabolism , Liver X Receptors , Mice , Inbred C57BL Mice , Molecular Sequence Data , Mutation , Orphan Nuclear Receptors , Messenger RNA/genetics , Messenger RNA/metabolism , Cytoplasmic and Nuclear Receptors/metabolism , Sitosterols/metabolism
13.
Proteins ; 41(2): 238-47, 2000 Nov 01.
Article in English | MEDLINE | ID: mdl-10966576

ABSTRACT

Phosphotransacetylases of Escherichia coli and several other bacteria contain an additional 350-aa N-terminal fragment that is not required for phosphotransacetylase activity. Sequence analysis of this fragment revealed that it is closely related to a family of ATP-dependent enzymes that also includes dethiobiotin synthetase and the synthetase domains of two amidotransferases involved in cobalamin biosynthesis, cobyrinic acid a,c-diamide synthase (CobB) and cobyric acid synthase (CobQ). Further database searches showed that this enzyme family is also related to the MinD family of ATPases involved in regulation of cell division in bacteria and archaea. Analysis of sequence conservation in the members of this enzyme family using the structure of dethiobiotin synthetase active site as a guide allowed us to suggest a model for the interaction of CobB and CobQ with their respective substrates. CobB and CobQ were also found to contain unusual Triad family (class I) glutamine amidotransferase domains with conserved Cys and His residues, but lacking the Glu residue of the catalytic triad. These results should help in understanding the enzymology of cobalamin biosynthesis and in resolving the role of phosphotransacetylase in regulation of the carbon flow to and from acetate.


Subject(s)
Carbon-Nitrogen Ligases/chemistry , Escherichia coli Proteins , Phosphate Acetyltransferase/chemistry , Transaminases/chemistry , Vitamin B 12/biosynthesis , Adenosine Triphosphatases/chemistry , Amino Acid Sequence , Archaea/chemistry , Bacteria/chemistry , Conserved Sequence , Molecular Models , Molecular Sequence Data , Protein Binding , Protein Conformation , Tertiary Protein Structure , Sequence Alignment
14.
Genome Res ; 10(7): 991-1000, 2000 Jul.
Article in English | MEDLINE | ID: mdl-10899148

ABSTRACT

Accumulation of complete genome sequences of diverse organisms creates new possibilities for evolutionary inferences from whole-genome comparisons. In the present study, we analyze the distributions of substitution rates among proteins encoded in 19 complete genomes (the interprotein rate distribution). To estimate these rates, it is necessary to employ another fundamental distribution, that of the substitution rates among sites in proteins (the intraprotein distribution). Using two independent approaches, we show that intraprotein substitution rate variability appears to be significantly greater than generally accepted. This yields more realistic estimates of evolutionary distances from amino-acid sequences, which is critical for evolutionary-tree construction. We demonstrate that the interprotein rate distributions inferred from the genome-to-genome comparisons are similar to each other and can be approximated by a single distribution with a long exponential shoulder. This suggests that a generalized version of the molecular clock hypothesis may be valid on genome scale. We also use the scaling parameter of the obtained interprotein rate distribution to construct a rooted whole-genome phylogeny. The topology of the resulting tree is largely compatible with those of global rRNA-based trees and trees produced by other approaches to genome-wide comparison.


Subject(s)
Amino Acid Substitution/genetics , Genetic Variation/genetics , Genome , Proteins/genetics , Computational Biology , Molecular Evolution , Genetic Models , Statistical Models , Phylogeny , Sequence Alignment/methods , Protein Sequence Analysis/methods
15.
Nucleic Acids Res ; 28(14): 2643-50, 2000 Jul 15.
Article in English | MEDLINE | ID: mdl-10908318

ABSTRACT

Helix-hairpin-helix (HhH) is a widespread motif involved in non-sequence-specific DNA binding. The majority of HhH motifs function as DNA-binding modules; however, some of them are used to mediate protein-protein interactions or have acquired enzymatic activity by incorporating catalytic residues (DNA glycosylases). From sequence and structural analysis of HhH-containing proteins we conclude that most HhH motifs are integrated as a part of a five-helical domain, termed (HhH)(2) domain here. It typically consists of two consecutive HhH motifs that are linked by a connector helix and displays pseudo-2-fold symmetry. (HhH)(2) domains show clear structural integrity and a conserved hydrophobic core composed of seven residues, one residue from each alpha-helix and each hairpin, and deserve recognition as a distinct protein fold. In addition to known HhH in the structures of RuvA, RadA, MutY and DNA-polymerases, we have detected new HhH motifs in sterile alpha motif and barrier-to-autointegration factor domains, the alpha-subunit of Escherichia coli RNA-polymerase, DNA-helicase PcrA and DNA glycosylases. Statistically significant sequence similarity of HhH motifs and pronounced structural conservation argue for homology between (HhH)(2) domains in different protein families. Our analysis helps to clarify how non-symmetric protein motifs bind to the double helix of DNA through the formation of a pseudo-2-fold symmetric (HhH)(2) functional unit.


Subject(s)
DNA-Binding Proteins/genetics , Helix-Turn-Helix Motifs/genetics , Amino Acid Sequence , Preschool Child , DNA Glycosylases , DNA Helicases/chemistry , DNA Helicases/genetics , DNA Polymerase beta/chemistry , DNA Polymerase beta/genetics , DNA-Binding Proteins/chemistry , DNA-Directed RNA Polymerases/chemistry , DNA-Directed RNA Polymerases/genetics , Escherichia coli Proteins , Exonucleases/chemistry , Exonucleases/genetics , Humans , Molecular Sequence Data , N-Glycosyl Hydrolases/chemistry , N-Glycosyl Hydrolases/genetics , Tertiary Protein Structure , Rad51 Recombinase , Sequence Alignment , Amino Acid Sequence Homology
16.
J Mol Biol ; 299(5): 1165-77, 2000 Jun 23.
Article in English | MEDLINE | ID: mdl-10873443

ABSTRACT

Detection of remote evolutionary connections is increasingly difficult with sequence and structural divergence. A combination of sequence and structural analysis, in which statistically supported sequence similarity had a crucial impact, revealed that Escherichia coli topoisomerase I C-terminal fragment is evolutionarily related to the three tetracysteine zinc-binding domains of the enzyme. Spatial structure analysis of this C-terminal fragment indicates that it consists of two structurally similar domains and suggests homology between them. Sequence similarity between the zinc-binding domains of type Ia topoisomerases and transcription regulators of known spatial structure helps to conclude that E. coli topo I contains five copies of a zinc ribbon domain at the C terminus. Two of these domains, corresponding to the C-terminal fragment, lost their cysteine residues and are probably not able to bind zinc. Present analyses lead to the classification of the C-terminal fragment of E. coli topoisomerase I as a member of zinc ribbon superfamily, despite the absence of zinc-binding sites.


Subject(s)
DNA Topoisomerases, Type I/chemistry , DNA Topoisomerases, Type I/classification , Escherichia coli/enzymology , Zinc/metabolism , Amino Acid Sequence , Binding Sites , Cysteine/metabolism , DNA Topoisomerases, Type I/metabolism , Evolution, Molecular , Models, Molecular , Molecular Sequence Data , Peptide Fragments/chemistry , Peptide Fragments/classification , Peptide Fragments/metabolism , Protein Structure, Secondary , Protein Structure, Tertiary , Sequence Alignment , Sequence Homology, Amino Acid , Transcription Factors/chemistry , Transcription Factors/metabolism
17.
Nucleic Acids Res ; 28(11): 2229-33, 2000 Jun 01.
Article in English | MEDLINE | ID: mdl-10871343

ABSTRACT

Many examples of enzymes that have lost their catalytic activity and perform other biological functions are known. The opposite situation is rare. A previously unnoticed structural similarity between the lambda integrase family (Int) proteins and the AraC family of transcriptional activators implies that the Int family evolved by duplication of an ancient DNA-binding homeodomain-like module, which acquired enzymatic activity. The two helix-turn-helix (HTH) motifs in Int proteins incorporate catalytic residues and participate in DNA binding. The active site of Int proteins, which include the type IB topoisomerases, is formed at the domain interface, and the catalytic tyrosine residue is located in the second helix of the C-terminal HTH motif. Structural analysis of other 'tyrosine' DNA-breaking/rejoining enzymes with similar enzyme mechanisms, namely prokaryotic topoisomerase I, topoisomerase II and archaeal topoisomerase VI, reveals that the catalytic tyrosine is placed in an HTH domain as well. Surprisingly, the location of this tyrosine residue in the structure is not conserved, suggesting independent, parallel evolution leading to the same catalytic function by homologous HTH domains. The 'tyrosine' recombinases give a rare example of enzymes that evolved from ancient DNA-binding modules and present a unique case for homologous enzymatic domains with similar catalytic mechanisms but different locations of catalytic residues, which are placed at non-homologous sites.


Subject(s)
DNA-Binding Proteins/genetics , Helix-Turn-Helix Motifs/genetics , Integrases/genetics , Amino Acid Sequence , Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Bacteriophages , Binding Sites , DNA Nucleotidyltransferases/chemistry , DNA Nucleotidyltransferases/genetics , DNA Topoisomerases, Type I/chemistry , DNA Topoisomerases, Type I/genetics , DNA-Binding Proteins/chemistry , Escherichia coli , Evolution, Molecular , Fungal Proteins/chemistry , Fungal Proteins/genetics , Integrases/chemistry , Models, Molecular , Molecular Sequence Data , Recombinases , Trans-Activators/chemistry , Trans-Activators/genetics , Tyrosine/genetics , Viral Proteins/chemistry , Viral Proteins/genetics
18.
J Mol Biol ; 299(4): 897-905, 2000 Jun 16.
Article in English | MEDLINE | ID: mdl-10843846

ABSTRACT

Using the data on proteins encoded in complete genomes, combined with a rigorous theory of the sampling process, we estimate the total number of protein folds and families, as well as the number of folds and families in each genome. The total number of folds in globular, water-soluble proteins is estimated at about 1000, with structural information currently available for about one-third of that number. The sequenced genomes of unicellular organisms encode from approximately 25% of the total number of folds, for the minimal genomes of the Mycoplasmas, to 70-80% for larger genomes, such as those of Escherichia coli and yeast. The number of protein families with significant sequence conservation was estimated to be between 4000 and 7000, with structures available for about 20% of these.


Subject(s)
Conserved Sequence , Genome , Protein Folding , Proteins/chemistry , Proteins/classification , Databases, Factual , Genome, Archaeal , Genome, Bacterial , Genome, Fungal , Protein Structure, Tertiary , Proteins/metabolism , Sampling Studies , Solubility , Statistical Distributions , Water/metabolism
19.
Proteins ; 40(1): 86-97, 2000 Jul 01.
Article in English | MEDLINE | ID: mdl-10813833

ABSTRACT

Structural genomics of proteins of unknown function most straightforwardly assists with assignment of biochemical activity when the new structure resembles that of proteins whose functions are known. When a new fold is revealed, the universe of known folds is enriched, and once the function is determined by other means, novel structure-function relationships are established. The previously unannotated protein HI1434 from H. influenzae provides a hybrid example of these two paradigms. It is a member of a microbial protein family, labeled in SwissProt as YbaK and ebsC. The crystal structure at 1.8 Å resolution reported here reveals a fold that is only remotely related to the C-lectin fold, in particular to endostatin, and thus is not sufficiently similar to imply that YbaK proteins are saccharide-binding proteins. However, a crevice that may accommodate a small ligand is evident. The putative binding site contains only one invariant residue, Lys46, which carries a functional group that could play a role in catalysis, indicating that YbaK is probably not an enzyme. Detailed sequence analysis, including a number of newly sequenced microbial organisms, highlights sequence homology to an insertion domain in prolyl-tRNA synthetases (proRS) from prokaryotes, a domain whose function is unknown. A HI1434-based model of the insertion domain shows that it should also contain the putative binding site. Being part of a tRNA synthetase, the insertion domain is likely to be involved in oligonucleotide binding, with possible roles in recognition/discrimination or editing of prolyl-tRNA. By analogy, YbaK may also play a role in nucleotide or oligonucleotide binding, the nature of which is yet to be determined.


Subject(s)
Bacterial Proteins , Carrier Proteins/chemistry , Haemophilus influenzae/chemistry , Amino Acid Sequence , Amino Acyl-tRNA Synthetases/chemistry , Carbon-Oxygen Lyases , Carrier Proteins/isolation & purification , Crystallography, X-Ray , Lectins/chemistry , Models, Molecular , Molecular Sequence Data , Protein Folding , Protein Structure, Tertiary , Sequence Homology, Amino Acid
20.
Proc Natl Acad Sci U S A ; 97(10): 5123-8, 2000 May 09.
Article in English | MEDLINE | ID: mdl-10805775

ABSTRACT

The NH(2)-terminal domains of membrane-bound sterol regulatory element-binding proteins (SREBPs) are released into the cytosol by regulated intramembrane proteolysis, after which they enter the nucleus to activate genes encoding lipid biosynthetic enzymes. Intramembrane proteolysis is catalyzed by Site-2 protease (S2P), a hydrophobic zinc metalloprotease that cleaves SREBPs at a membrane-embedded leucine-cysteine bond. In the current study, we use domain-swapping methods to localize the residues within the SREBP-2 membrane-spanning segment that are required for cleavage by S2P. The studies reveal a requirement for an asparagine-proline sequence in the middle third of the transmembrane segment. We propose a model in which the asparagine-proline sequence serves as an NH(2)-terminal cap for a portion of the transmembrane alpha-helix of SREBP, allowing the remainder of the alpha-helix to unwind partially to expose the peptide bond for cleavage by S2P.


Subject(s)
CCAAT-Enhancer-Binding Proteins , DNA-Binding Proteins/chemistry , DNA-Binding Proteins/metabolism , Dipeptides , Endopeptidases/metabolism , Nuclear Proteins/chemistry , Nuclear Proteins/metabolism , Transcription Factors , Amino Acid Sequence , Animals , Binding Sites , Cell Line , Cell Membrane/metabolism , Conserved Sequence , Cricetinae , Helix-Loop-Helix Motifs , Humans , Models, Molecular , Molecular Sequence Data , Protein Structure, Secondary , Recombinant Fusion Proteins/chemistry , Recombinant Fusion Proteins/metabolism , Sequence Alignment , Sequence Homology, Amino Acid , Sterol Regulatory Element Binding Protein 1 , Transfection