1.
Adv Protein Chem Struct Biol ; 138: 135-178, 2024.
Article in English | MEDLINE | ID: mdl-38220423

ABSTRACT

The immunoglobulin fold (Ig fold) domain is a super-secondary structural motif consisting of a sandwich with two layers of β-sheets that is present in many proteins with very diverse biological functions covering a wide range of physiological processes. This domain presents a modular architecture built with β-strands connected by variable length loops that has a highly conserved structural core of four β-strands and quite variable β-sheet extensions in the two sandwich layers that enable both divergent and convergent evolutionary mechanisms in the known Ig fold proteome. The central role of this Ig fold's structural plasticity in the evolutionary success of antibodies in our immune system is well established. Nature has also utilized this Ig fold in all domains of life in many different physiological contexts that go way beyond the immune system. Here we will present a structural and functional overview of the utilization of the Ig fold in different biological processes and in different cellular contexts to highlight some of the innumerable ways that this structural motif can interact in multidomain proteins to enable their diversity of functions. This includes shareable specific protein structure visualizations behind those functions that serve as starting points for further explorations of the biomolecular interactions spanning the Ig fold proteome. This overview also highlights how this Ig fold is being utilized through natural adaptation, engineering, and even building from scratch for a range of biotechnological applications.


Subject(s)
Protein Folding , Proteome , Antibodies
2.
Front Mol Biosci ; 9: 831740, 2022.
Article in English | MEDLINE | ID: mdl-35252351

ABSTRACT

iCn3D was initially developed as a web-based 3D molecular viewer. It then evolved from visualization into a full-featured interactive structural analysis software. It became a collaborative research instrument through the sharing of permanent, shortened URLs that encapsulate not only annotated visual molecular scenes, but also all underlying data and analysis scripts in a FAIR manner. More recently, with the growth of structural databases, the need to analyze large structural datasets systematically led us to use Python scripts and convert the code to be used in Node.js scripts. We showed a few examples of Python scripts at https://github.com/ncbi/icn3d/tree/master/icn3dpython to export secondary structures or PNG images from iCn3D. Users just need to replace the URL in the Python scripts to export other annotations from iCn3D. Furthermore, any interactive iCn3D feature can be converted into a Node.js script to be run in batch mode, enabling an interactive analysis performed on one or a handful of protein complexes to be scaled up to analysis features of large ensembles of structures. Example Node.js analysis scripts are currently available at https://github.com/ncbi/icn3d/tree/master/icn3dnode. This development will enable ensemble analyses on growing structural databases such as AlphaFold or RoseTTAFold on one hand and Electron Microscopy on the other. In this paper, we also review new features such as DelPhi electrostatic potential, 3D view of mutations, alignment of multiple chains, assembly of multiple structures by realignment, dynamic symmetry calculation, 2D cartoons at different levels, interactive contact maps, and use of iCn3D in Jupyter Notebook as described at https://pypi.org/project/icn3dpy.

3.
Methods Mol Biol ; 2112: 175-186, 2020.
Article in English | MEDLINE | ID: mdl-32006286

ABSTRACT

The VAST+ algorithm is an efficient, simple, and elegant solution to the problem of comparing the atomic structures of biological assemblies. Given two protein assemblies, it takes as input all the pairwise structural alignments of the component proteins. It then clusters the rotation matrices from the pairwise superpositions, with the clusters corresponding to subsets of the two assemblies that may be aligned and well superposed. It uses the Vector Alignment Search Tool (VAST) protein-protein comparison method for the input structural alignments, but other methods could be used as well. From a chosen cluster, an "original" alignment for the assembly may be defined by simply combining the relevant input alignments. However, it is often useful to reduce/trim the original alignment using a Monte Carlo refinement algorithm, which allows biologically relevant conformational differences to be more readily detected and observed. The method is easily extended to include RNA or DNA molecules. VAST+ results may be accessed at https://www.ncbi.nlm.nih.gov/Structure by entering a PDB accession or terms in the search box and then using the [VAST+] link in the upper right corner of the Structure Summary page.
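
The clustering step described in this abstract can be illustrated with a toy example. The Python sketch below is not the VAST+ implementation: it assumes each pairwise superposition is summarized by a 3x3 rotation matrix (translations are ignored), measures how far apart two rotations are by the angle of their relative rotation, and groups them with a simple greedy threshold. The function names, the 10-degree cutoff, and the greedy scheme are illustrative assumptions; the abstract does not specify the clustering method.

```python
import math


def rot_z(deg):
    """Rotation matrix for a rotation of `deg` degrees about the z axis."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rotation_angle(r1, r2):
    """Angle in degrees of the relative rotation r1^T r2."""
    # trace(r1^T r2) equals the sum of element-wise products of r1 and r2.
    trace = sum(r1[i][j] * r2[i][j] for i in range(3) for j in range(3))
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))


def cluster_rotations(rotations, cutoff_deg=10.0):
    """Greedy clustering: join the first cluster whose seed is within cutoff."""
    clusters = []  # each cluster is a list of indices into `rotations`
    for idx, rot in enumerate(rotations):
        for members in clusters:
            if rotation_angle(rotations[members[0]], rot) <= cutoff_deg:
                members.append(idx)
                break
        else:
            clusters.append([idx])
    return clusters


# Toy input (made up): three superpositions near 0 degrees, two near 90.
rots = [rot_z(0), rot_z(3), rot_z(91), rot_z(-2), rot_z(88)]
print(cluster_rotations(rots))
```

In this toy setting, each resulting cluster collects the pairwise alignments that share roughly the same rotation, which is the property the abstract associates with subsets of the two assemblies that can be superposed together.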


Subject(s)
Proteins/chemistry , Sequence Alignment/methods , Algorithms , Databases, Protein , Monte Carlo Method , Protein Conformation , Search Engine/methods , Software
4.
Bioinformatics ; 36(1): 131-135, 2020 Jan 01.
Article in English | MEDLINE | ID: mdl-31218344

ABSTRACT

MOTIVATION: Build a web-based 3D molecular structure viewer focusing on interactive structural analysis. RESULTS: iCn3D (I-see-in-3D) can simultaneously show 3D structure, 2D molecular contacts and 1D protein and nucleotide sequences through an integrated sequence/annotation browser. Pre-defined and arbitrary molecular features can be selected in any of the 1D/2D/3D windows as sets of residues and these selections are synchronized dynamically in all displays. Biological annotations such as protein domains, single nucleotide variations, etc. can be shown as tracks in the 1D sequence/annotation browser. These customized displays can be shared with colleagues or publishers via a simple URL. iCn3D can display structure-structure alignments obtained from NCBI's VAST+ service. It can also display the alignment of a sequence with a structure as identified by BLAST, and thus relate 3D structure to a large fraction of all known proteins. iCn3D can also display electron density maps or electron microscopy (EM) density maps, and export files for 3D printing. The following URL exemplifies some of the 1D/2D/3D representations: https://www.ncbi.nlm.nih.gov/Structure/icn3d/full.html?mmdbid=1TUP&showanno=1&show2d=1&showsets=1. AVAILABILITY AND IMPLEMENTATION: iCn3D is freely available to the public. Its source code is available at https://github.com/ncbi/icn3d. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
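
The URL quoted in this abstract is an ordinary query string on the iCn3D `full.html` page, so a link of the same shape can be assembled programmatically. The Python sketch below uses only the base address and the four parameters that appear in that URL (`mmdbid`, `showanno`, `show2d`, `showsets`). The helper name `icn3d_url` is hypothetical, and whether other identifiers or parameter values are accepted is not stated in the abstract.

```python
from urllib.parse import urlencode

# Base address taken from the URL quoted in the abstract.
ICN3D_BASE = "https://www.ncbi.nlm.nih.gov/Structure/icn3d/full.html"


def icn3d_url(mmdbid, showanno=True, show2d=True, showsets=True):
    """Build an iCn3D link (hypothetical helper, not part of iCn3D)."""
    params = {
        "mmdbid": mmdbid,
        "showanno": int(showanno),
        "show2d": int(show2d),
        "showsets": int(showsets),
    }
    return ICN3D_BASE + "?" + urlencode(params)


# Reproduces the URL from the abstract for structure 1TUP.
print(icn3d_url("1TUP"))
```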


Subject(s)
Base Sequence , Computational Biology , Internet , Models, Molecular , Proteins , Software , Computational Biology/methods , Databases, Genetic , Molecular Conformation , Proteins/chemistry
5.
Nucleic Acids Res ; 42(Database issue): D297-303, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24319143

ABSTRACT

The computational detection of similarities between protein 3D structures has become an indispensable tool for the detection of homologous relationships, the classification of protein families and functional inference. Consequently, numerous algorithms have been developed that facilitate structure comparison, including rapid searches against a steadily growing collection of protein structures. To this end, NCBI's Molecular Modeling Database (MMDB), which is based on the Protein Data Bank (PDB), maintains a comprehensive and up-to-date archive of protein structure similarities computed with the Vector Alignment Search Tool (VAST). These similarities have been recorded on the level of single proteins and protein domains, comprising in excess of 1.5 billion pairwise alignments. Here we present VAST+, an extension to the existing VAST service, which summarizes and presents structural similarity on the level of biological assemblies or macromolecular complexes. VAST+ simplifies structure neighboring results and shows, for macromolecular complexes tracked in MMDB, lists of similar complexes ranked by the extent of similarity. VAST+ replaces the previous VAST service as the default presentation of structure neighboring data in NCBI's Entrez query and retrieval system. MMDB and VAST+ can be accessed via http://www.ncbi.nlm.nih.gov/Structure.


Subject(s)
Databases, Protein , Structural Homology, Protein , Computer Graphics , Internet , Macromolecular Substances/chemistry , Models, Molecular , Software
6.
Prog Mol Biol Transl Sci ; 117: 3-24, 2013.
Article in English | MEDLINE | ID: mdl-23663963

ABSTRACT

Protein homooligomers afford several important benefits for the cell; they mediate and regulate gene expression, activity of many enzymes, ion channels, receptors, and cell-cell adhesion processes. The evolutionary and physical mechanisms of oligomer formation are very diverse and are not well understood. Certain homooligomeric states may be conserved within protein subfamilies and between different subfamilies, therefore providing the specificity to particular substrates while minimizing interactions with unwanted partners. In addition, transitions between different oligomeric states may regulate protein activity and support the switch between different pathways. In this chapter, we summarize the biological importance of homooligomeric assemblies, physicochemical properties of their interfaces, experimental methods for their identification, their evolution, and role in human diseases.


Subject(s)
Evolution, Molecular , Protein Multimerization , Animals , Computational Biology , Disease , Humans , Models, Molecular , Protein Structure, Quaternary
7.
PLoS One ; 7(1): e28896, 2012.
Article in English | MEDLINE | ID: mdl-22303436

ABSTRACT

The coverage and reliability of protein-protein interactions determined by high-throughput experiments still need to be improved, especially for higher organisms. The question therefore persists of how interactions can be verified and predicted by computational approaches using available data on protein structural complexes. Recently, we developed an approach called IBIS (Inferred Biomolecular Interaction Server) to predict and annotate protein-protein binding sites and interaction partners; it is based on the assumption that the structural location and sequence patterns of protein-protein binding sites are conserved between close homologs. In this study, we first confirmed the high accuracy of our method and found that this accuracy depends critically on using all available data on structures of homologous complexes, in contrast to approaches that employ only a non-redundant set of complexes. Second, we showed that there is a trade-off between specificity and sensitivity if the prediction uses only evolutionarily conserved binding site clusters or clusters supported by only one observation (singletons). Finally, we addressed the question of identifying the biologically relevant interactions using the homology inference approach and demonstrated that a large majority of crystal packing interactions can be correctly identified and filtered by our algorithm. At the same time, about half of the biological interfaces that are not present in the protein crystallographic asymmetric unit can be reconstructed by IBIS from homologous complexes without prior knowledge of the crystal parameters of the query protein.


Subject(s)
Conserved Sequence , Protein Interaction Mapping , Sequence Homology, Amino Acid , Algorithms , Amino Acid Sequence , Binding Sites , Clostridium/enzymology , Cluster Analysis , Crystallography, X-Ray , Databases, Protein , Molecular Sequence Data , Molybdoferredoxin/chemistry , Molybdoferredoxin/metabolism , Nitrogenase/metabolism , Protein Binding , Protein Structure, Secondary , Proteins/chemistry , Proteins/metabolism , Reproducibility of Results , Software
8.
J Mol Biol ; 415(2): 443-53, 2012 Jan 13.
Article in English | MEDLINE | ID: mdl-22198293

ABSTRACT

The modulation of protein-protein interactions (PPIs) by small drug-like molecules is a relatively new area of research and has opened up new opportunities in drug discovery. However, the progress made in this area is limited to a handful of known cases of small molecules that target specific diseases. With the increasing availability of protein structure complexes, it is highly important to devise strategies exploiting homologous structure space on a large scale for discovering putative PPIs that could be attractive drug targets. Here, we propose a scheme that allows performing large-scale screening of all protein complexes and finding putative small-molecule and/or peptide binding sites overlapping with protein-protein binding sites (so-called "multibinding sites"). We find more than 600 nonredundant proteins from 60 protein families with multibinding sites. Moreover, we show that the multibinding sites are mostly observed in transient complexes, largely overlap with the binding hotspots and are more evolutionarily conserved than other interface sites. We investigate possible mechanisms of how small molecules may modulate protein-protein binding and discuss examples of new candidates for drug design.


Subject(s)
Protein Interaction Domains and Motifs , Proteins/chemistry , Proteins/metabolism , Binding Sites , Computer Simulation , Drug Discovery/methods , Drug Evaluation, Preclinical/methods , Models, Molecular , Protein Binding
9.
Nucleic Acids Res ; 40(Database issue): D461-4, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22135289

ABSTRACT

Close to 60% of protein sequences tracked in comprehensive databases can be mapped to a known three-dimensional (3D) structure by standard sequence similarity searches. Potentially, a great deal can be learned about proteins or protein families of interest from considering 3D structure, and to this day 3D structure data may remain an underutilized resource. Here we present enhancements in the Molecular Modeling Database (MMDB) and its data presentation, specifically pertaining to biologically relevant complexes and molecular interactions. MMDB is tightly integrated with NCBI's Entrez search and retrieval system, and mirrors the contents of the Protein Data Bank. It links protein 3D structure data with sequence data, sequence classification resources and PubChem, a repository of small-molecule chemical structures and their biological activities, facilitating access to 3D structure data not only for structural biologists, but also for molecular biologists and chemists. MMDB provides a complete set of detailed and pre-computed structural alignments obtained with the VAST algorithm, and provides visualization tools for 3D structure and structure/sequence alignment via the molecular graphics viewer Cn3D. MMDB can be accessed at http://www.ncbi.nlm.nih.gov/structure.


Subject(s)
Databases, Protein , Models, Molecular , Protein Conformation , Sequence Analysis, Protein
10.
Nucleic Acids Res ; 40(Database issue): D834-40, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22102591

ABSTRACT

We have recently developed the Inferred Biomolecular Interaction Server (IBIS) and database, which reports, predicts and integrates different types of interaction partners and locations of binding sites in proteins based on the analysis of homologous structural complexes. Here, we highlight several new IBIS features and options. The server's webpage is now redesigned to allow users easier access to data for different interaction types. An entry page is added to give a quick summary of available results and to now accept protein sequence accessions. To elucidate the formation of protein complexes, not just binary interactions, IBIS currently presents an expandable interaction network. Previously, IBIS provided annotations for four different types of binding partners: proteins, small molecules, nucleic acids and peptides; in the current version a new protein-ion interaction type has been added. Several options provide easy downloads of IBIS data for all Protein Data Bank (PDB) protein chains and the results for each query. In this study, we show that about one-third of all RefSeq sequences can be annotated with IBIS interaction partners and binding sites. The IBIS server is available at http://www.ncbi.nlm.nih.gov/Structure/ibis/ibis.cgi and updated biweekly.


Subject(s)
Databases, Protein , Protein Interaction Mapping , Proteins/chemistry , Binding Sites , Computer Graphics , Ions/chemistry , Molecular Sequence Annotation , Multiprotein Complexes/chemistry , Nucleic Acids/chemistry , Peptides/chemistry , Sequence Analysis, Protein , Systems Integration , User-Computer Interface
11.
BMC Bioinformatics ; 11: 365, 2010 Jul 01.
Article in English | MEDLINE | ID: mdl-20594344

ABSTRACT

BACKGROUND: The study of protein-small molecule interactions is vital for understanding protein function and for practical applications in drug discovery. To benefit from the rapidly increasing structural data, it is essential to improve the tools that enable large scale binding site prediction with greater emphasis on their biological validity. RESULTS: We have developed a new method for the annotation of protein-small molecule binding sites, using inference by homology, which allows us to extend annotation onto protein sequences without experimental data available. To ensure biological relevance of binding sites, our method clusters similar binding sites found in homologous protein structures based on their sequence and structure conservation. Binding sites which appear evolutionarily conserved among non-redundant sets of homologous proteins are given higher priority. After binding sites are clustered, position specific score matrices (PSSMs) are constructed from the corresponding binding site alignments. Together with other measures, the PSSMs are subsequently used to rank binding sites to assess how well they match the query and to better gauge their biological relevance. The method also facilitates a succinct and informative representation of observed and inferred binding sites from homologs with known three-dimensional structures, thereby providing the means to analyze conservation and diversity of binding modes. Furthermore, the chemical properties of small molecules bound to the inferred binding sites can be used as a starting point in small molecule virtual screening. The method was validated by comparison to other binding site prediction methods and to a collection of manually curated binding site annotations. We show that our method achieves a sensitivity of 72% at predicting biologically relevant binding sites and can accurately discriminate those sites that bind biological small molecules from non-biological ones. 
CONCLUSIONS: A new algorithm has been developed to predict binding sites with high accuracy in terms of their biological validity. It also provides a common platform for function prediction, knowledge-based docking and for small molecule virtual screening. The method can be applied even for a query sequence without structure. The method is available at http://www.ncbi.nlm.nih.gov/Structure/ibis/ibis.cgi.
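
The PSSM step mentioned in this abstract can be sketched generically. The Python example below builds a log-odds position-specific score matrix from a small gapless alignment of binding-site residues and scores a query segment against it. It is a textbook construction, not the method's actual code: the uniform background, the pseudocount of 1, the toy alignment, and the function names are all assumptions.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0):
    """Log-odds scores per column, assuming a uniform background."""
    background = 1.0 / len(AMINO_ACIDS)
    n_seqs = len(alignment)
    denom = n_seqs + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for col in zip(*alignment):  # iterate over alignment columns
        scores = {}
        for aa in AMINO_ACIDS:
            freq = (col.count(aa) + pseudocount) / denom
            scores[aa] = math.log2(freq / background)
        pssm.append(scores)
    return pssm


def score(pssm, segment):
    """Sum of per-position scores for a segment of matching length."""
    return sum(column[aa] for column, aa in zip(pssm, segment))


# Toy binding-site alignment (made up), one string per homolog.
site_alignment = ["GKST", "GKSS", "GKTT", "AKST"]
pssm = build_pssm(site_alignment)

for query in ("GKST", "WWWW"):
    print(query, round(score(pssm, query), 2))
```

A query that resembles the aligned sites scores higher than an unrelated one, which is the sense in which such a matrix can help rank candidate binding sites against a query.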


Subject(s)
Algorithms , Binding Sites , Proteins/chemistry , Proteins/metabolism , Amino Acid Sequence , Cluster Analysis , Knowledge Bases , Protein Binding , Sequence Analysis, Protein , Structural Homology, Protein
12.
J Mol Biol ; 399(1): 196-206, 2010 May 28.
Article in English | MEDLINE | ID: mdl-20381499

ABSTRACT

Glycosylation is an important aspect of epigenetic regulation. Glycosyltransferase is a key enzyme in the biosynthesis of glycans, which glycosylates more than half of all proteins in eukaryotes and is involved in a wide range of biological processes. It has been suggested previously that homooligomerization in glycosyltransferases and other proteins might be crucial for their function. In this study, we explore functional homooligomeric states of glycosyltransferases in various organisms, trace their evolution, and perform comparative analyses to find structural features that can mediate or disrupt the formation of different homooligomers. First, we make a structure-based classification of the diverse superfamily of glycosyltransferases and confirm that the majority of the structures are indeed clustered into the GT-A or GT-B folds. We find that homooligomeric glycosyltransferases appear to be as ancient as monomeric glycosyltransferases and go back in evolution to the last universal common ancestor (LUCA). Moreover, we show that interface residues have significant bias to be gapped out or unaligned in the monomers, implying that they might represent features crucial for oligomer formation. Structural analysis of these features reveals that the majority of them represent loops, terminal regions, and helices, indicating that these secondary-structure elements mediate the formation of glycosyltransferases' homooligomers and directly contribute to the specific binding. We also observe relatively short protein regions that disrupt the homodimer interactions, although such cases are rare. These results suggest that relatively small structural changes in the nonconserved regions may contribute to the formation of different functional oligomeric states and might be important in regulation of enzyme activity through homooligomerization.


Subject(s)
Evolution, Molecular , Glycosyltransferases/chemistry , Amino Acid Sequence , Databases, Protein , Glycosyltransferases/genetics , Glycosyltransferases/metabolism , Molecular Sequence Data , Phylogeny , Sequence Alignment
13.
Nucleic Acids Res ; 38(Database issue): D518-24, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19843613

ABSTRACT

IBIS is the NCBI Inferred Biomolecular Interaction Server. This server organizes, analyzes and predicts interaction partners and locations of binding sites in proteins. IBIS provides annotations for different types of binding partners (protein, chemical, nucleic acid and peptides), and facilitates the mapping of a comprehensive biomolecular interaction network for a given protein query. IBIS reports interactions observed in experimentally determined structural complexes of a given protein, and at the same time IBIS infers binding sites/interacting partners by inspecting protein complexes formed by homologous proteins. Similar binding sites are clustered together based on their sequence and structure conservation. To emphasize biologically relevant binding sites, several algorithms are used for verification in terms of evolutionary conservation, biological importance of binding partners, size and stability of interfaces, as well as evidence from the published literature. IBIS is updated regularly and is freely accessible via http://www.ncbi.nlm.nih.gov/Structure/ibis/ibis.html.


Subject(s)
Computational Biology/methods , Databases, Genetic , Databases, Protein , Protein Interaction Mapping/methods , Protein Structure, Tertiary , Algorithms , Animals , Binding Sites , Catalytic Domain , Cluster Analysis , Computational Biology/trends , Humans , Information Storage and Retrieval/methods , Internet , Protein-Tyrosine Kinases/chemistry , Software
14.
Biophys J ; 96(6): 2178-88, 2009 Mar 18.
Article in English | MEDLINE | ID: mdl-19289044

ABSTRACT

A large set of three-dimensional structures of 264 protein-protein complexes with known nonsynonymous single nucleotide polymorphisms (nsSNPs) at the interface was built using homology-based methods. The nsSNPs were mapped on the proteins' structures and their effect on the binding energy was investigated with CHARMM force field and continuum electrostatic calculations. Two sets of nsSNPs were studied: disease annotated Online Mendelian Inheritance in Man (OMIM) and nonannotated (non-OMIM). It was demonstrated that OMIM nsSNPs tend to destabilize the electrostatic component of the binding energy, in contrast with the effect of non-OMIM nsSNPs. In addition, it was shown that the change of the binding energy upon amino acid substitutions is not related to the conservation of the net charge, hydrophobicity, or hydrogen bond network at the interface. The results indicate that, generally, the effect of nsSNPs on protein-protein interactions cannot be predicted from amino acids' physico-chemical properties alone, since in many cases a substitution of a particular residue with another amino acid having completely different polarity or hydrophobicity had little effect on the binding energy. Analysis of sequence conservation showed that nsSNPs at highly conserved positions resulted in a large variance of the binding energy changes. In contrast, amino acid substitutions corresponding to nsSNPs at nonconserved positions, on average, were not found to have a large effect on binding affinity. pKa calculations showed that amino acid substitutions could change the wild-type proton uptake/release and thus result in a different pH dependence of the binding energy.


Subject(s)
Models, Molecular , Polymorphism, Single Nucleotide , Proteins/chemistry , Algorithms , Amino Acid Sequence , Amino Acid Substitution , Conserved Sequence , Databases, Genetic , Glutathione Transferase/chemistry , Humans , Hydrophobic and Hydrophilic Interactions , Molecular Sequence Data , Protein Binding , Protein Conformation , Protein Interaction Mapping , Static Electricity , beta-Globins/chemistry
15.
BMC Struct Biol ; 7: 23, 2007 Apr 10.
Article in English | MEDLINE | ID: mdl-17425794

ABSTRACT

BACKGROUND: To discover remote evolutionary relationships and functional similarities between proteins, biologists rely on comparative sequence analysis, and when structures are available, on structural alignments and various measures of structural similarity. The measures/scores that have most commonly been used for this purpose include: alignment length, percent sequence identity, superposition RMSD and their different combinations. More recently, we have introduced the "Homologous core structure overlap score" (HCS) and the "Loop Hausdorff Measure" (LHM). Along with these we also consider the "gapped structural alignment score" (GSAS), which was introduced earlier by other researchers. RESULTS: We analyze the performance of these and other conventional measures at the task of ranking structure neighbors by homology, and we show that the HCS, LHM, and GSAS scores display considerably improved performance over the conventional measures of sequence or structural similarity. CONCLUSION: The HCS, LHM, and GSAS scores are easily computable quantities that allow users of structure-neighbor databases to more easily identify interesting structural similarities between proteins.


Subject(s)
Evolution, Molecular , Phylogeny , Proteins/chemistry , Structural Homology, Protein , Algorithms , Amino Acid Sequence , Conserved Sequence , Databases, Protein , Protein Structure, Secondary , Sequence Alignment
16.
BMC Evol Biol ; 7: 19, 2007 Feb 13.
Article in English | MEDLINE | ID: mdl-17298668

ABSTRACT

BACKGROUND: In this paper we describe an analysis of the size evolution of both protein domains and their indels, as inferred by changing sizes of whole domains or individual unaligned regions or "spacers". We studied relatively early evolutionary events and focused on protein domains which are conserved among various taxonomy groups. RESULTS: We found that more than one third of all domains have a statistically significant tendency to increase/decrease in size in evolution as judged from the overall domain size distribution as well as from the size distribution of individual spacers. Moreover, the fraction of domains and individual spacers increasing in size is almost twofold larger than the fraction decreasing in size. CONCLUSION: We showed that the tolerance to insertion and deletion events depends on the domain's taxonomy span. Eukaryotic domains are depleted in insertions compared to the overall test set, namely, the number of spacers increasing in size is about the same as the number of spacers decreasing in size. On the other hand, ancient domain families show some bias towards insertions or spacers which grow in size in evolution. Domains from several Gene Ontology categories also demonstrate certain tendencies for insertion or deletion events as inferred from the analysis of spacer sizes.


Subject(s)
Databases, Protein , Evolution, Molecular , Protein Structure, Tertiary/genetics , Animals , Computational Biology , Humans , Sequence Alignment , Sequence Deletion
17.
Nucleic Acids Res ; 35(Database issue): D298-300, 2007 Jan.
Article in English | MEDLINE | ID: mdl-17135201

ABSTRACT

Three-dimensional (3D) structure is now known for a large fraction of all protein families. Thus, it has become rather likely that one will find a homolog with known 3D structure when searching a sequence database with an arbitrary query sequence. Depending on the extent of similarity, such neighbor relationships may allow one to infer biological function and to identify functional sites such as binding motifs or catalytic centers. Entrez's 3D-structure database, the Molecular Modeling Database (MMDB), provides easy access to the richness of 3D structure data and its large potential for functional annotation. Entrez's search engine offers several tools to assist biologist users: (i) links between databases, such as between protein sequences and structures, (ii) pre-computed sequence and structure neighbors, (iii) visualization of structure and sequence/structure alignment. Here, we describe an annotation service that combines some of these tools automatically, Entrez's 'Related Structure' links. For all proteins in Entrez, similar sequences with known 3D structure are detected by BLAST and alignments are recorded. The 'Related Structure' service summarizes this information and presents 3D views mapping sequence residues onto all 3D structures available in MMDB (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=structure).


Subject(s)
Databases, Protein , Models, Molecular , Protein Conformation , Sequence Analysis, Protein , Internet , Sequence Alignment , User-Computer Interface
18.
Proteins ; 61(3): 535-44, 2005 Nov 15.
Article in English | MEDLINE | ID: mdl-16184609

ABSTRACT

In this work we examine how protein structural changes are coupled with sequence variation in the course of evolution of a family of homologs. The sequence-structure correlation analysis performed on 81 homologous protein families shows that the majority of them exhibit statistically significant linear correlation between the measures of sequence and structural similarity. We observed, however, that there are cases where structural variability cannot be mainly explained by sequence variation, such as protein families with a number of disulfide bonds. To understand whether structures from different families and/or folds evolve in the same manner, we compared the degrees of structural change per unit of sequence change ("the evolutionary plasticity of structure") between those families with a significant linear correlation. Using rigorous statistical procedures we find that, with a few exceptions, evolutionary plasticity does not show a statistically significant difference between protein families. Similar sequence-structure analysis performed for protein loop regions shows that evolutionary plasticity of loop regions is greater than for the protein core.
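
The analysis in this abstract rests on fitting a straight line to paired measures of sequence and structural similarity within a family, with the slope read as structural change per unit of sequence change. The Python sketch below shows such a least-squares fit and the Pearson correlation coefficient on made-up numbers. The variable names, the data, and the choice of measures are assumptions; the paper's actual similarity measures and statistical procedures are not given in the abstract.

```python
import math


def linear_fit(x, y):
    """Return (slope, intercept, pearson_r) of the least-squares line y ~ x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical pairs within one family: sequence distance vs. RMSD in angstroms.
seq_distance = [0.1, 0.3, 0.5, 0.7, 0.9]
rmsd = [0.6, 1.0, 1.4, 1.8, 2.2]

slope, intercept, r = linear_fit(seq_distance, rmsd)
print(f"slope={slope:.2f} intercept={intercept:.2f} r={r:.2f}")
```

Comparing the fitted slopes across families is one simple way to picture what the abstract calls the evolutionary plasticity of structure.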


Subject(s)
Evolution, Molecular , Proteins/chemistry , Proteins/classification , Structural Homology, Protein , Amino Acids , Disulfides , Protein Structure, Secondary , Regression Analysis , Sequence Alignment
19.
BMC Evol Biol ; 5: 10, 2005 Feb 03.
Article in English | MEDLINE | ID: mdl-15691378

ABSTRACT

BACKGROUND: Protein evolution and protein classification are usually inferred by comparing protein cores in their conserved aligned parts. Structurally aligned protein regions are separated by less conserved loop regions, where sequence and structure locally deviate from each other and do not superimpose well. RESULTS: Our results indicate that even longer protein loops cannot be viewed as "random coils" and for the majority of protein families in our test set there exists a linear correlation between the measures of sequence similarity and loop structural similarity. Results suggest that distance matrices derived from the loop (dis)similarity measure may produce in some cases more reliable cluster trees compared to the distance matrices based on the conventional measures of sequence and structural (dis)similarity. CONCLUSIONS: We show that by considering "dissimilar" loop regions rather than only conserved core regions it is possible to improve our understanding of protein evolution.


Subject(s)
Evolution, Molecular , Algorithms , Animals , Biological Evolution , Cluster Analysis , Gene Deletion , Genetic Linkage , Genomics/methods , Models, Genetic , Models, Molecular , Models, Statistical , Mutation , Phylogeny , Protein Conformation , Protein Structure, Tertiary , Proteins/chemistry , Ribonuclease, Pancreatic/chemistry , Ribonuclease, Pancreatic/genetics , Sequence Alignment
20.
Proteins ; 57(3): 539-47, 2004 Nov 15.
Article in English | MEDLINE | ID: mdl-15382231

ABSTRACT

Two proteins are considered to have a similar fold if sufficiently many of their secondary structure elements are positioned similarly in space and are connected in the same order. Such a common structural scaffold may arise due to either divergent or convergent evolution. The intervening unaligned regions ("loops") between the superimposable helices and strands can exhibit a wide range of similarity and may offer clues to the structural evolution of folds. One might argue that more closely related proteins differ less in their nonconserved loop regions than distantly related proteins and, at the same time, the degree of variability in the loop regions in structurally similar but unrelated proteins is higher than in homologs. Here we introduce a new measure for structural (dis)similarity in loop regions that is based on the concept of the Hausdorff metric. This measure is used to gauge protein relatedness and is tested on a benchmark of homologous and analogous protein structures. It has been shown that the new measure can distinguish homologous from analogous proteins with the same or higher accuracy than the conventional measures that are based on comparing proteins in structurally aligned regions. We argue that this result can be attributed to the higher sensitivity of the Hausdorff (dis)similarity measure in detecting particularly evident dissimilarities in structures and draw some conclusions about evolutionary relatedness of proteins in the most populated protein folds.
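
For orientation, the classical Hausdorff distance between two point sets is the largest distance from any point in one set to its nearest point in the other. The Python sketch below computes that textbook quantity for two small sets of 3D coordinates. It is not the loop (dis)similarity measure of the paper, which the abstract describes only as being based on the concept of the Hausdorff metric; the function names and coordinates are made up for illustration.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Two toy "loops" that agree except for one displaced point.
loop_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
loop_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 3.0, 0.0)]

print(hausdorff(loop_a, loop_b))
```

Because the result is driven by the single worst-matched point, a measure of this kind reacts strongly to one pronounced local deviation, which is in line with the sensitivity to evident dissimilarities that the abstract mentions.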


Subject(s)
Computational Biology/methods , Proteins/chemistry , Structural Homology, Protein , Evolution, Molecular , Methyltransferases/chemistry , Methyltransferases/metabolism , Models, Molecular , Oxidoreductases/chemistry , Protein Folding , Protein Structure, Tertiary , Proteins/metabolism , S-Adenosylmethionine/metabolism , Sensitivity and Specificity