1.
Bioinformatics ; 27(7): 925-32, 2011 Apr 01.
Article in English | MEDLINE | ID: mdl-21296751

ABSTRACT

MOTIVATION: The database of known protein structures (PDB) is increasing rapidly. This results in a growing need for methods that can cope with the vast amount of structural data. To analyze the accumulating data, it is important to have a fast tool for identifying similar structures and clustering them by structural resemblance. Several excellent tools have been developed for the comparison of protein structures. These usually address the task of local structure alignment, an important yet computationally intensive problem due to its complexity. It is difficult to use such tools for comparing a large number of structures to each other in a reasonable time. RESULTS: Here we present GOSSIP, a novel method for a global all-against-all alignment of any set of protein structures. The method detects similarities between structures down to a certain cutoff (a parameter of the program), hence allowing it to detect similar structures at a much higher speed than local structure alignment methods. GOSSIP compares many structures several orders of magnitude faster than well-known available structure alignment servers, and it is also faster than a database scanning method. We evaluate GOSSIP both on a dataset of short structural fragments and on two large sequence-diverse structural benchmarks. Our conclusions are that for a threshold of 0.6 and above, the speed of GOSSIP is obtained with no compromise of the accuracy of the alignments or of the number of detected global similarities. AVAILABILITY: A server, as well as an executable for download, are available at http://bioinfo3d.cs.tau.ac.il/gossip/.


Subject(s)
Software , Structural Homology, Protein , Cluster Analysis , Databases, Protein , Proteins/chemistry , Sequence Alignment
2.
J Comput Biol ; 8(2): 93-121, 2001.
Article in English | MEDLINE | ID: mdl-11454300

ABSTRACT

Here we present an algorithm designed to carry out multiple structure alignment and to detect recurring substructural motifs. So far we have implemented it for comparison of protein structures. However, this general method is applicable to comparisons of RNA structures and to detection of a pharmacophore in a series of drug molecules. Further, its sequence order independence permits its application to detection of motifs on protein surfaces, interfaces, and binding/active sites. While there are many methods designed to carry out pairwise structure comparisons, there are only a handful geared toward the multiple structure alignment task. Most of these tackle multiple structure comparison as a collection of pairwise structure comparison tasks. The multiple structural alignment algorithm presented here automatically finds the largest common substructure (core) of atoms that appears in all the molecules in the ensemble. The detection of the core and the structural alignment are done simultaneously. The algorithm begins by finding small substructures that are common to all the proteins in the ensemble. One of the molecules is considered the reference; the others are the source molecules. The small substructures are stored in special arrays termed combinatorial buckets, which define sets of multistructural alignments from the source molecules that coincide with the same small set of reference atoms (C(alpha)-atoms here). These substructures are initial small fragments that have congruent copies in each of the proteins. The substructures are extended, through the processing of the combinatorial buckets, by clustering the superpositions (transformations). The method is very efficient.


Subject(s)
Algorithms , Proteins/chemistry , Sequence Alignment/methods , Binding Sites , Calcium/metabolism , Globins/chemistry , Protein Conformation , Protein Folding , Proteins/metabolism , Serpins/chemistry
3.
Curr Opin Struct Biol ; 11(3): 364-9, 2001 Jun.
Article in English | MEDLINE | ID: mdl-11406388

ABSTRACT

Recent studies increasingly point to the importance of structural flexibility and plasticity in proteins, highlighting the evolutionary advantage. There are an increasing number of cases in which given, presumably specific, binding sites have been shown to bind a range of ligands with different compositions and shapes. These studies have also revealed that evolution tends to find convergent solutions for stable intermolecular associations, largely via conservation of polar residues as hot spots of binding energy. On the other hand, the ability to bind multiple ligands at a given site is largely derived from hinge-based motions. The consideration of these two factors in functional epitopes allows more realism and robustness in the description of protein binding surfaces and, as such, in applications to mutants, modeled structures and design. Efficient multiple structure comparison and hinge-bending structure comparison tools enable the construction of combinatorial binding epitope libraries.


Subject(s)
Combinatorial Chemistry Techniques , Epitopes , Proteins/chemistry , Proteins/immunology , Evolution, Molecular , Protein Conformation , Proteins/metabolism
4.
Proteins ; 43(3): 235-45, 2001 May 15.
Article in English | MEDLINE | ID: mdl-11288173

ABSTRACT

While a number of approaches have been geared toward multiple sequence alignments, to date there have been very few approaches to multiple structure alignment and detection of a recurring substructural motif. Among these, none performs both multiple structure comparison and motif detection simultaneously. Further, none considers all structures at the same time, rather than initiating from pairwise molecular comparisons. We present such a multiple structural alignment algorithm. Given an ensemble of protein structures, the algorithm automatically finds the largest common substructure (core) of C(alpha) atoms that appears in all the molecules in the ensemble. The detection of the core and the structural alignment are done simultaneously. Additional structural alignments also are obtained and are ranked by the sizes of the substructural motifs, which are present in the entire ensemble. The method is based on the geometric hashing paradigm. As in our previous structural comparison algorithms, it compares the structures in an amino acid sequence order-independent way, and hence the resulting alignment is unaffected by insertions, deletions and protein chain directionality. As such, it can be applied to protein surfaces, protein-protein interfaces and protein cores to find the optimally, and suboptimally spatially recurring substructural motifs. There is no predefinition of the motif. We describe the algorithm, demonstrating its efficiency. In particular, we present a range of results for several protein ensembles, with different folds and belonging to the same, or to different, families. Since the algorithm treats molecules as collections of points in three-dimensional space, it can also be applied to other molecules, such as RNA, or drugs.
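The abstract above rests on the idea of comparing structures as unordered point sets through hashed geometric features. The following is only a minimal sketch of that general idea, not the authors' algorithm: it hashes quantized triangle side lengths, a simplified relative of geometric hashing that skips the reference-frame and voting steps. The names `triangle_key`, `build_table`, `count_matching_triplets` and the bin size `BIN` are hypothetical and chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations
import math

BIN = 0.5  # hypothetical quantization step, in the same units as the coordinates


def triangle_key(p, q, r, bin_size=BIN):
    """Quantized, sorted side lengths of triangle pqr.

    Side lengths do not change under rotation or translation, and sorting
    makes the key independent of the order in which the points are listed.
    """
    sides = sorted((math.dist(p, q), math.dist(q, r), math.dist(p, r)))
    return tuple(round(s / bin_size) for s in sides)


def build_table(points, bin_size=BIN):
    """Hash every point triplet of a reference structure by its key."""
    table = defaultdict(list)
    for i, j, k in combinations(range(len(points)), 3):
        key = triangle_key(points[i], points[j], points[k], bin_size)
        table[key].append((i, j, k))
    return table


def count_matching_triplets(table, query, bin_size=BIN):
    """Count query triplets whose key is present in the reference table."""
    return sum(
        1
        for a, b, c in combinations(query, 3)
        if triangle_key(a, b, c, bin_size) in table
    )


# Toy coordinates standing in for C(alpha) positions (made-up values).
reference = [(0, 0, 0), (3, 0, 0), (0, 4, 0), (0, 0, 12)]
# Same shape, rotated 90 degrees about z, shifted, and listed in reverse order.
moved = [(-y + 10, x - 5, z + 2) for x, y, z in reversed(reference)]
unrelated = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]

table = build_table(reference)
```

Because the keys ignore point order, `count_matching_triplets(table, moved)` finds every triplet of the moved copy even though its points are listed in reverse, which mirrors the sequence-order independence the abstract emphasizes. A fuller implementation would also recover the superimposing transformation and cluster consistent matches.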


Subject(s)
Algorithms , Protein Conformation , Proteins/chemistry , Amino Acid Motifs , Automation , Globins/chemistry , Triose-Phosphate Isomerase/chemistry
6.
Article in English | MEDLINE | ID: mdl-10977094

ABSTRACT

We present two algorithms which align flexible protein structures. Both apply efficient structural pattern detection and graph theoretic techniques. The FlexProt algorithm simultaneously detects the hinge regions and aligns the rigid subparts of the molecules. It does so by efficiently detecting maximal congruent rigid fragments in both molecules and calculating their optimal arrangement which does not violate the protein sequence order. The FlexMol algorithm is sequence order independent, yet requires as input the hypothesized hinge positions. Due to its sequence order independence it can also be applied to protein-protein interface matching and drug molecule alignment. It aligns the rigid parts of the molecule using the Geometric Hashing method and calculates optimal connectivity among these parts by graph-theoretic techniques. Both algorithms are highly efficient even compared with rigid structure alignment algorithms. Typical running times on a standard desktop PC (400 MHz) are about 7 seconds for FlexProt and about 1 minute for FlexMol.


Subject(s)
Algorithms , Proteins , Sequence Alignment/methods , Animals , Humans , Protein Conformation , Proteins/analysis , Proteins/chemistry , Proteins/genetics , Sequence Analysis, Protein/methods
7.
Comb Chem High Throughput Screen ; 2(5): 249-59, 1999 Oct.
Article in English | MEDLINE | ID: mdl-10539986

ABSTRACT

In this, and the next review article (1), we present highly efficient, computer-vision and robotics based algorithms for docking and for the generation and matching of epitopes on molecular surfaces. We start with descriptions of molecular surfaces, and proceed to utilize these in both rigid-body and flexible matching routines. These algorithms originate in the computer vision and robotics disciplines. Frequently used approaches, both in searches for molecular similarity and for docking, i.e., molecular complementarity, strive to obtain highly accurate correspondence of respective molecular surfaces. However, owing to molecular surface variability in solution, to mutational events, and to the need to use modeled structures in addition to high resolution ones, utilization of epitopes might prove to be a judicious approach to follow. Furthermore, through the deployment of libraries of epitopes which represent recurring features, or motifs in a given family of receptors or of enzymes, in principle we a priori focus on the more critical groups of atoms, or amino acids, essential for the binding of the two molecules. Utilization of recurring motifs may prove more robust than single molecule matchings. In addition, via utilization of epitopes one can make use of information derived from evolutionary related molecules. All of the above combine to represent an approach which may be highly advantageous. Combinatorial approaches have proven their immense utility in the wet laboratory. The combination of efficient computational approaches and the utilization of such libraries may well be particularly profitable. Our highly efficient techniques are amenable to such a task. In this review we focus on rigid and flexible docking algorithms. In the second review (1) we address the generation of epitopes in families of molecules. These may be used by the docking algorithms to identify the more likely bound interfaces.


Subject(s)
Algorithms , Combinatorial Chemistry Techniques , Binding Sites , Computer Simulation , Epitopes , Ligands , Molecular Conformation , Pharmaceutical Preparations , Protein Binding , Robotics
8.
Comb Chem High Throughput Screen ; 2(5): 261-9, 1999 Oct.
Article in English | MEDLINE | ID: mdl-10539987

ABSTRACT

This is the second review in a two-part series. In the first review (1) we described the computational complexity involved in the docking of a ligand onto a receptor surface. In particular, we focused on efficient algorithms designed to handle this computational task. Such a procedure results in a large number of potential, geometrically feasible solutions. The difficulty is to pinpoint which of these is the more likely candidate. While there exists a number of approaches to rank these solutions according to different criteria, such as the size of the interface or some approximation of their binding energetics, none of the existing methods has been shown to be consistently successful in this endeavor. If the binding site is unknown a priori, the magnitude of the task is awesome. Here we propose one way of addressing this problem, i.e., via derivation and utilization of binding epitopes. If a library of such epitopes is available, particularly for a large number of protein families, it may be used to predict more likely binding sites for a given ligand. We describe an efficient, computer-vision based method to construct binding epitopes focusing on two ways through which such a library can be generated, (i) molecular surface-based, or (ii) residue-based. Alternatively, the two can be combined. We further describe how such a library may be used efficiently in the matching/docking procedure.


Subject(s)
Algorithms , Epitopes , Protein Conformation , Binding Sites , Computer Simulation , Protein Binding
9.
Comb Chem High Throughput Screen ; 2(4): 223-37, 1999 Aug.
Article in English | MEDLINE | ID: mdl-10469882

ABSTRACT

Here we examine the recognition of small molecules by their protein and DNA receptors. We focus on two questions: first, how well does the solid angle molecular surface representation perform in fitting together the surfaces of small ligands, such as drugs and cofactors, to their corresponding receptors; and second, in particular, to what extent does shape complementarity play a role in the matching (recognition) process of such small molecules. Both questions have been investigated in protein-protein binding: "Critical Points" based on solid angle calculations have been shown to perform well in the matching of large protein molecules. They are robust, may be few in number, and capture satisfactorily the molecular shape. Shape complementarity has been shown to be a critical factor in protein-protein recognition, but has not been examined in drug-receptor recognition. To probe these questions, here we dock 185 receptor-small ligand molecule pairs. We find that such a representation performs adequately for the smaller ligands too, and that shape complementarity is also observed. These issues are important, given the large databases of drugs that routinely have to be scanned to find candidate lead compounds. We have been able to carry out such large scale docking experiments owing to our efficient, computer-vision based docking algorithms. Their fast CPU matching times, on the order of minutes on a PC, allow such large scale docking experiments.


Subject(s)
Drug Design , Ligands , DNA/chemistry , DNA/metabolism , Models, Molecular , Molecular Conformation , Nucleic Acid Conformation , Pharmaceutical Preparations/chemistry , Protein Conformation , Proteins/chemistry , Proteins/metabolism
10.
Proteins ; 36(3): 307-17, 1999 Aug 15.
Article in English | MEDLINE | ID: mdl-10409824

ABSTRACT

Here we carry out an examination of shape complementarity as a criterion in protein-protein docking and binding. Specifically, we examine the quality of shape complementarity as a critical determinant not only in the docking of 26 protein-protein "bound" complexed cases, but in particular, of 19 "unbound" protein-protein cases, where the structures have been determined separately. In all cases, entire molecular surfaces are utilized in the docking, with no consideration of the location of the active site, or of particular residues/atoms in either the receptor or the ligand that participate in the binding. To evaluate the goodness of the strictly geometry-based shape complementarity in the docking process as compared to the main favorable and unfavorable energy components, we study systematically a potential correlation between each of these components and the root mean square deviation (RMSD) of the "unbound" protein-protein cases. Specifically, we examine the non-polar buried surface area, polar buried surface area, buried surface area relating to groups bearing unsatisfied buried charges, and the number of hydrogen bonds in all docked protein-protein interfaces. For these cases, where the two proteins have been crystallized separately, and where entire molecular surfaces are considered without a predefinition of the binding site, no correlation is observed. None of these parameters appears to consistently improve on shape complementarity in the docking of unbound molecules. These findings argue that simplicity in the docking process, utilizing geometrical shape criteria may capture many of the essential features in protein-protein docking. In particular, they further reinforce the long held notion of the importance of molecular surface shape complementarity in the binding, and hence in docking. 
This is particularly interesting in light of the fact that the structures of the docked pairs have been determined separately, allowing side chains on the surface of the proteins to move relatively freely. This study has been enabled by our efficient, computer vision-based docking algorithms. The fast CPU matching times, on the order of minutes on a PC, allow such large-scale docking experiments of large molecules, which may not be feasible by other techniques. Proteins 1999;36:307-317.
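The root mean square deviation (RMSD) used above to judge docked "unbound" cases can be illustrated with a short sketch. This assumes two coordinate lists that are already paired atom by atom and expressed in the same frame; it does not perform any superposition, and the function name `rmsd` and the toy coordinates are illustrative only, not taken from the study.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of
    (x, y, z) points, paired by index. No superposition is performed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


# Made-up coordinates: a reference placement and a copy displaced by (3, 4, 0).
docked = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
shifted = [(x + 3.0, y + 4.0, z) for x, y, z in docked]

deviation = rmsd(docked, shifted)
```

Every atom in `shifted` is displaced by the same vector of length 5, so the RMSD equals that displacement; in a docking evaluation the two lists would instead hold the predicted and the crystallographic ligand positions.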


Subject(s)
Proteins/chemistry , Proteins/metabolism , Algorithms , Binding Sites , Hydrogen Bonding , Ligands , Models, Molecular , Protein Binding , Protein Conformation , Surface Properties , Thermodynamics
11.
Article in English | MEDLINE | ID: mdl-10786299

ABSTRACT

A Multiple Structural Alignment algorithm is presented. The algorithm accepts an ensemble of protein structures and finds the largest substructure (core) of C alpha atoms whose geometric configuration appears in all the molecules of the ensemble. Both the detection of this core and the resulting structural alignment are done simultaneously. Other large enough multistructural superimpositions are detected as well. Our method is based on the Geometric Hashing paradigm and a superimposition clustering technique which represents superimpositions by sets of matching atoms. The algorithm proved to be efficient on real data in a series of experiments. The same method can be applied to any ensemble of molecules (not necessarily proteins) since our basic technique is sequence order independent.


Subject(s)
Models, Theoretical , Molecular Structure , Algorithms , Cluster Analysis , Databases, Factual , Models, Molecular , Sequence Analysis, Protein/methods , Software
12.
Proteins ; 32(2): 159-74, 1998 Aug 01.
Article in English | MEDLINE | ID: mdl-9714156

ABSTRACT

Here we dock a ligand onto a receptor surface allowing hinge-bending domain/substructural movements. Our approach mimics and manifests induced fit in molecular recognition. All angular rotations are allowed on the one hand, while a conformational space search is avoided on the other. Rather than dock each of the molecular parts separately with subsequent reconstruction of the consistently docked molecules, all parts are docked simultaneously while still utilizing the position of the hinge from the start. Like pliers closing on a screw, the receptor automatically closes on its ligand in the best surface-matching way. Movements are allowed either in the ligand or in the larger receptor, hence reproducing induced molecular fit. Hinge bending movements are frequently observed when molecules associate. There are numerous examples of open versus closed conformations taking place upon binding. Such movements are observed when the substrate binds to its respective enzyme. In particular, such movements are of interest in allosteric enzymes. The movements can involve entire domains, subdomains, loops, (other) secondary structure elements, or between any groups of atoms connected by flexible joints. We have implemented the hinges at points and at bonds. By allowing 3-dimensional (3-D) rotation at the hinge, several rotations about (consecutive or nearby) bonds are implicitly taken into account. Alternatively, if required, the point rotation can be restricted to bond rotation. Here we illustrate this hinge-bending docking approach and the insight into flexibility it provides on a complex of the calmodulin with its M13 ligand, positioning the hinges either in the ligand or in the larger receptor. This automated and efficient method is adapted from computer vision and robotics. It enables utilizing entire molecular surfaces rather than focusing a priori on active sites. 
Hence, it allows attaining the overall optimally matching surfaces and the extent and type of motions which are involved. Here we do not treat the conformational flexibility of side-chains or of very small pieces of the molecules. Therefore, currently available methods addressing these issues and the method presented here are complementary to each other, expanding the repertoire of computational docking tools foreseen to aid in studies of recognition, conformational flexibility and drug design.


Subject(s)
Algorithms , Calmodulin/chemistry , Calmodulin/metabolism , Protein Conformation , Allosteric Regulation , Allosteric Site , Computer Simulation , Drug Design , Ligands , Models, Chemical , Models, Molecular , Myosin-Light-Chain Kinase/chemistry , Myosin-Light-Chain Kinase/metabolism , Peptide Fragments/chemistry , Peptide Fragments/metabolism
13.
J Comput Biol ; 5(4): 631-54, 1998.
Article in English | MEDLINE | ID: mdl-10072081

ABSTRACT

In this work, we present an algorithm developed to handle biomolecular structural recognition problems, as part of an interdisciplinary research endeavor of the Computer Vision and Molecular Biology fields. A key problem in rational drug design and in biomolecular structural recognition is the generation of binding modes between two molecules, also known as molecular docking. Geometrical fitness is a necessary condition for molecular interaction. Hence, docking a ligand (e.g., a drug molecule or a protein molecule) to a protein receptor (e.g., enzyme) involves recognition of molecular surfaces. Conformational transitions by "hinge-bending" involve rotational movements of relatively rigid parts with respect to each other. The generation of docked binding modes between two associating molecules depends on their three-dimensional (3-D) structures and their conformational flexibility. In comparison to the particular case of rigid-body docking, the computational difficulty grows considerably when taking into account the additional degrees of freedom intrinsic to the flexible molecular docking problem. Previous docking techniques have enabled hinge movements only within small ligands. Partial flexibility in the receptor molecule is enabled by a few techniques. Hinge-bending motions of protein receptor domains are not addressed by these methods, although these types of transitions are significant, e.g., in enzyme activity. Our approach allows hinge-induced motions to exist in either the receptor or the ligand molecules of diverse sizes. We allow movements of domains, subdomains, or groups of atoms in either of the associating molecules. We achieve this by adapting a technique developed in Computer Vision and Robotics for the efficient recognition of partially occluded articulated objects. These types of objects consist of rigid parts which are connected by rotary joints (hinges). 
Our method is based on an extension and generalization of the Hough transform and the Geometric Hashing paradigms for rigid object recognition. We show experimental results obtained by the successful application of the algorithm to cases of bound and unbound molecular complexes, yielding fast matching times. While the "correct" molecular conformations of the known complexes are obtained with small RMS distances, additional, predictive good-fitting binding modes are generated as well. We conclude by discussing the algorithm's implications and extensions, as well as its application to investigations of protein structures in Molecular Biology and recognition problems in Computer Vision.


Subject(s)
Algorithms , Models, Biological , Models, Molecular , Proteins/chemistry , Carrier Proteins/chemistry , Carrier Proteins/metabolism , HIV Protease/chemistry , HIV Protease/metabolism , HIV Protease Inhibitors/chemistry , HIV Protease Inhibitors/metabolism , Ligands , Maltose/chemistry , Maltose/metabolism , Maltose-Binding Proteins , Methotrexate/chemistry , Methotrexate/metabolism , Oligopeptides/chemistry , Oligopeptides/metabolism , Protein Conformation , Proteins/metabolism , Tetrahydrofolate Dehydrogenase/chemistry , Tetrahydrofolate Dehydrogenase/metabolism
14.
J Mol Biol ; 271(5): 838-45, 1997 Sep 05.
Article in English | MEDLINE | ID: mdl-9299331

ABSTRACT

The structure of the complex of the chorismate mutase from the yeast Saccharomyces cerevisiae with a transition state analog is constructed using a suite of docking tools. The construction finds the best location for the active site in the enzyme, and the best orientation of the analog compound in the active site. The resulting complex shows extensive salt links and hydrogen bonds between the enzyme and the compound, including those mediated by water molecules. A network of polar interactions between amino acid residues is found to solidify the active site of the enzyme. The enzymatic mechanism suggested for a bacterial chorismate mutase, that the active site is by design capable of selecting an active conformer of the substrate, and of stabilizing the transition state, is apparently intact in the yeast enzyme. No direct evidence is found to support an alternative mechanism which involves specific catalytic groups, although the possibility is not eliminated. This finding reinforces the notion of a function being evolutionarily conserved via a common mechanism, rather than via sequential or structural homology.


Subject(s)
Chorismate Mutase/chemistry , Chorismic Acid/analogs & derivatives , Models, Molecular , Saccharomyces cerevisiae/enzymology , Binding Sites , Chorismate Mutase/metabolism , Chorismic Acid/metabolism , Dimerization
15.
Protein Sci ; 6(1): 53-64, 1997 Jan.
Article in English | MEDLINE | ID: mdl-9007976

ABSTRACT

Data sets of 362 structurally nonredundant protein-protein interfaces and of 57 symmetry-related oligomeric interfaces have been used to explore whether the hydrophobic effect that guides protein folding is also the main driving force for protein-protein associations. The buried nonpolar surface area has been used to measure the hydrophobic effect. Our analysis indicates that, although the hydrophobic effect plays a dominant role in protein-protein binding, it is not as strong as that observed in the interior of protein monomers. Comparison of interiors of the monomers with those of the interfaces reveals that, in general, the hydrophobic amino acids are more frequent in the interior of the monomers than in the interior of the protein-protein interfaces. On the other hand, a higher proportion of charged and polar residues are buried at the interfaces, suggesting that hydrogen bonds and ion pairs contribute more to the stability of protein binding than to that of protein folding. Moreover, comparison of the interior of the interfaces to protein surfaces indicates that the interfaces are poorer in polar/charged residues than the surfaces and are richer in hydrophobic residues. The interior of the interfaces appears to constitute a compromise between the stabilization contributed by the hydrophobic effect on the one hand and avoiding patches on the protein surfaces that are too hydrophobic on the other. Such patches would be unfavorable for the unassociated monomers in solution. We conclude that, although the types of interactions are similar between protein-protein interfaces and single-chain proteins overall, the contribution of the hydrophobic effect to protein-protein associations is not as strong as to protein folding. This implies that packing patterns and interatom, or interresidue, pairwise potential functions, derived from monomers, are not ideally suited to predicting and assessing ligand associations or design. 
These would perform adequately only in cases where the hydrophobic effect at the binding site is substantial.


Subject(s)
Proteins/metabolism , Amino Acids/analysis , Protein Binding , Protein Conformation , Protein Folding , Proteins/chemistry
16.
Protein Eng ; 10(10): 1109-22, 1997 Oct.
Article in English | MEDLINE | ID: mdl-9488136

ABSTRACT

The question of whether interchanges of spatially neighboring residues are coupled, or whether they change independently of each other, has been addressed repeatedly over the last few years. Utilizing a residue order-independent structural comparison tool, we investigated interchanges of spatially adjacent residue pairs in conserved 3D environments in globally dissimilar protein structures. We define spatially adjacent pairs to be non-local neighboring residues which are in spatial contact, though separated along the backbone, to exclude backbone effects. A dataset of unrelated structures is extensively compared, constructing a matrix of all 400x400 interchanges of residue pairs. Our study indicates that (i) interchanges of residues which are spatial neighbors are independent of each other. With the exception of a few pairs, the pattern of interchanges of pairs of adjacent residues resembles that expected from interchanges of single residues. However, clustering residues of similar characteristics serves to enhance secondary trends. Hence, (ii) clustering the hydrophobic, aliphatic and, separately, the aromatic, and comparing them with the charged, and the polar, indicates that hydrophobic pairs are favorably replaced by hydrophobic, and charged/polar by charged/polar. The most strongly conserved are the charged. Interestingly, the type of charge (like or opposite) plays no role. Interchanges between the hydrophobic and hydrophilic classes are unfavorable. (iii) Clustering by volume indicates that the most highly conserved are the (Small, Small) pairs. The least favorable are interchanges of the type (Small, Small) <--> (Large, Large). Interchanges of the type (Large, Small) <--> (Large, Large) are less favorable than (Large, Small) <--> (Small, Small). Compensatory interchanges of the type (Large, Small) <--> (Small, Large) are unfavorable. 
(iv) Inspection of the trends in the interchanges of the clustered small residues versus clustered large rigid, and separately versus clustered large flexible, illustrates clear differences. Consistently, within all hydrophobic, large and small, the flexible aliphatic differ from the more rigid aromatic. The flexible aliphatic residue pairs are unfavorably replaced by other residue types. Furthermore, (v) the unique properties of the aromatics, conferred by the electronic configuration of their benzene rings, are transformed into clear trends. Replacements of polar residues by aromatics, while unfavorable, are nevertheless consistently more favorable than into aliphatics. We address these issues and their direct implications to protein design and to fold recognition.
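The 400x400 matrix mentioned above follows from the 20 standard amino acids: 20 x 20 = 400 ordered residue pairs, and each pair can be replaced by any other. A minimal sketch of such an index is given below; the alphabetical one-letter ordering, the function name `pair_index`, and the single tallied example are assumptions made for illustration, not details from the paper.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def pair_index(first, second):
    """Map an ordered residue pair to a row/column index in 0..399."""
    return AMINO_ACIDS.index(first) * len(AMINO_ACIDS) + AMINO_ACIDS.index(second)


n_pairs = len(AMINO_ACIDS) ** 2  # 400 ordered pairs

# counts[i][j] would tally how often pair i is observed replaced by pair j.
counts = [[0] * n_pairs for _ in range(n_pairs)]

# Hypothetical observation: a (Leu, Ala) pair replaced by (Ala, Leu),
# i.e. a compensatory (Large, Small) <--> (Small, Large) interchange.
counts[pair_index("L", "A")][pair_index("A", "L")] += 1
```

Keeping the pairs ordered is what lets interchanges such as (Large, Small) <--> (Small, Large) be counted separately from (Large, Small) <--> (Large, Small).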


Subject(s)
Amino Acids/chemistry , Models, Chemical , Protein Conformation , Cluster Analysis , Surface Properties
17.
Protein Eng ; 9(12): 1103-19, 1996 Dec.
Article in English | MEDLINE | ID: mdl-9010924

ABSTRACT

We present an efficient technique for the comparison of protein structures. The algorithm uses a vector representation of the secondary structure elements and searches for spatial configurations of secondary structure elements in proteins. In such recurring protein folds, the order of the secondary structure elements in the protein chains is disregarded. The method is based on the geometric hashing paradigm and implements approaches originating in computer vision. It represents and matches the secondary structure element vectors in a 3-D translation and rotation invariant manner. The matching of a pair of proteins takes on average under 3 s on a Silicon Graphics Indigo2 workstation, allowing extensive all-against-all comparisons of the data set of non-redundant protein structures. Here we have carried out such a comparison for a data set of over 500 protein molecules. The detection of recurring topological and non-topological, secondary structure element order-independent protein folds may provide further insight into evolution. Moreover, as these recurring folding units are likely to be conformationally favourable, the availability of a data set of such topological motifs can serve as a rich input for threading routines. Below, we describe this rapid technique and the results it has obtained. While some of the obtained matches conserve the order of the secondary structure elements, others are entirely order independent. As an example, we focus on the results obtained for Che Y, a signal transduction protein, and on the profilin-beta-actin complex. The Che Y molecule is composed of a five-stranded, parallel beta-sheet flanked by five helices. Here we show its similarity with the Escherichia coli elongation factor, with L-arabinose binding protein, with haloalkane dehalogenase and with adenylate kinase. The profilin-beta-actin contains an antiparallel beta-pleated sheet with alpha-helical termini. 
Its similarities to lipase, fructose bisphosphatase and beta-lactamase are displayed.
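
The geometric hashing idea named in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it hashes plain 3-D points (which could stand in for, say, midpoints of secondary structure element vectors) rather than the vectors themselves, and the function names `frame_keys`, `build_table` and `vote`, the bin size, and the toy coordinates are all hypothetical. Every ordered triplet of non-collinear points defines a local reference frame; the remaining points are expressed in that frame and quantized, which makes the keys invariant to rotation and translation.

```python
import itertools
import math
from collections import Counter, defaultdict


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _unit(a):
    n = math.sqrt(_dot(a, a))
    return (a[0] / n, a[1] / n, a[2] / n)


def frame_keys(points, basis, bin_size=1.0):
    """Quantized coordinates of the non-basis points in the frame of a basis triplet."""
    p0, p1, p2 = (points[i] for i in basis)
    u = _sub(p1, p0)
    w = _cross(u, _sub(p2, p0))
    if _dot(u, u) < 1e-9 or _dot(w, w) < 1e-9:
        return []  # coincident or collinear basis points define no frame
    x = _unit(u)
    z = _unit(w)
    y = _cross(z, x)  # completes a right-handed orthonormal frame
    keys = []
    for i, p in enumerate(points):
        if i in basis:
            continue
        d = _sub(p, p0)
        keys.append(tuple(round(_dot(d, axis) / bin_size) for axis in (x, y, z)))
    return keys


def build_table(models, bin_size=1.0):
    """Preprocessing: store (model name, basis) under every quantized key."""
    table = defaultdict(list)
    for name, points in models.items():
        for basis in itertools.permutations(range(len(points)), 3):
            for key in frame_keys(points, basis, bin_size):
                table[key].append((name, basis))
    return table


def vote(table, scene, scene_basis, bin_size=1.0):
    """Recognition: count how many scene keys each (model, basis) pair explains."""
    votes = Counter()
    for key in frame_keys(scene, scene_basis, bin_size):
        votes.update(table.get(key, []))
    return votes


# Toy data (hypothetical coordinates, not taken from any protein structure).
model_a = [(0, 0, 0), (4, 0, 0), (0, 3, 0), (1, 2, 5), (6, 5, 2), (-3, 4, 7)]
model_b = [(0, 0, 0), (2, 0, 0), (0, 5, 0), (3, 3, 3), (-4, 1, 6), (5, -2, 8)]
table = build_table({"A": model_a, "B": model_b})

# Scene: model A rotated by 90 degrees about the z axis and then translated.
scene = [(-y + 10, x - 5, z + 2) for x, y, z in model_a]
votes = vote(table, scene, (0, 1, 2))
best_votes = votes[("A", (0, 1, 2))]  # 3: every non-basis point agrees
```

A real application would repeat the voting step over many scene bases and verify the top-scoring candidates by a least-squares superposition. Because the table is indexed only by geometry, the order of the elements along the chain plays no role, which is the property the abstract emphasizes.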


Subject(s)
Algorithms , Computer Simulation , Models, Chemical , Protein Structure, Secondary , Sequence Alignment/methods , Databases, Factual , Models, Molecular
18.
J Mol Biol ; 260(4): 604-20, 1996 Jul 26.
Article in English | MEDLINE | ID: mdl-8759323

ABSTRACT

While there are a number of structurally non-redundant datasets of protein monomers, there is none of protein-protein interfaces. Yet, the availability of such a dataset is expected to provide an added insight into a number of investigations. First and foremost among these is analyzing the interfaces to obtain their prevailing architectures, the forces that account for the protein-protein associations and their packing considerations. Their comparisons with those of the monomers are likely to shed additional light on protein-protein recognition on the one hand and on the folding of the polypeptide chain on the other. Docking simulations are also expected to benefit from the existence of such a dataset. A major stumbling block to the generation of a dataset of interfaces has been that the interface is composed of at least two chains. Furthermore, in the interfaces, each of the chains might be represented by non-contiguous pieces. Their order in the interfaces being compared might be different as well. This discontinuity stems from the definition of an interface. An interface consists of interacting residues between the chains, and those that are in their vicinity in the supporting scaffold, within a certain distance threshold. This necessarily yields unordered fragments, as well as isolated residues. Our novel, efficient, sequence-order-independent structural comparison technique is ideally suited to handle the task of the generation of a library of structurally non-redundant protein-protein interfaces. As it is computer-vision based, it views atoms as collections of points in space, disregarding their chain connectivity. In this work, 351 interface-families are created. Comparisons of the interfaces, and separately, of the chains which contribute to them, yield some interesting cases. In one of the cases, while two interfaces are similar, the structure of only one of the two chains is similar between the two complexes. 
The structure of the second chain of the first complex differs from that of the second chain of the second complex. Here the structure of the cleft in the first chain dictates the specific binding interactions. In another case, while the interfaces in the two complexes are similar, both chains composing them differ between the complexes. Lastly, the chains composing the complexes are similar, but the interfaces are dissimilar, providing a set of data for investigations of the favorable orientations of protein-protein associations.
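
The interface definition given in the abstract (interacting residues between the chains, plus residues in their vicinity within a distance threshold) can be illustrated with a short sketch. It rests on several assumptions that the abstract does not specify: each residue is reduced to a single representative coordinate, both cutoffs are set to a hypothetical 6.0 units, and the function name `interface_residues` is invented for this example.

```python
import math


def interface_residues(chain_a, chain_b, contact_cutoff=6.0, vicinity_cutoff=6.0):
    """Split each chain into interacting residues and nearby scaffold residues.

    chain_a and chain_b map a residue identifier to one representative
    (x, y, z) coordinate. Both cutoffs are illustrative assumptions.
    """

    def contacts(own, other):
        # Residues of `own` lying within the contact cutoff of any residue of `other`.
        return {r for r, p in own.items()
                if any(math.dist(p, q) <= contact_cutoff for q in other.values())}

    def vicinity(own, core):
        # Same-chain residues close to an interacting residue but not interacting themselves.
        return {r for r, p in own.items()
                if r not in core
                and any(math.dist(p, own[c]) <= vicinity_cutoff for c in core)}

    core_a = contacts(chain_a, chain_b)
    core_b = contacts(chain_b, chain_a)
    return {
        "a": (core_a, vicinity(chain_a, core_a)),
        "b": (core_b, vicinity(chain_b, core_b)),
    }


# Toy chains with hypothetical coordinates.
chain_a = {1: (0, 0, 0), 2: (4, 0, 0), 3: (20, 0, 0)}
chain_b = {10: (0, 5, 0), 11: (0, 30, 0)}
result = interface_residues(chain_a, chain_b)
```

In this toy case residue 1 of the first chain interacts with residue 10 of the second, and residue 2 is kept as nearby scaffold. Residue 3 is dropped, which shows how the selection can leave non-contiguous fragments and isolated residues, as the abstract notes.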


Subject(s)
Databases, Factual , Models, Molecular , Proteins/chemistry , Algorithms , Protein Folding , Proteins/classification , Proteins/metabolism
19.
Crit Rev Biochem Mol Biol ; 31(2): 127-52, 1996 Apr.
Article in English | MEDLINE | ID: mdl-8740525

ABSTRACT

Protein structures generally consist of favorable folding motifs formed by specific arrangements of secondary structure elements. Similar architectures can be adopted by different amino acid sequences, although the details of the structures vary. It has long been known that despite the sequence variability, there is a striking preferential conservation of the hydrophobic character of the amino acids at the buried positions of these folding motifs. Differences in the sizes of the side-chains are accommodated by movements of the secondary structure elements with respect to each other, leading to compact packing. Scanning protein-protein interfaces reveals that similar architectures are also observed at and around their interacting surfaces, with preservation of the hydrophobic character, although not to the same extent. The general forces that determine the origin of the native structures of proteins have been investigated intensively. The major non-bonded forces operating on a protein chain as it folds into a three-dimensional structure are likely to be packing, the hydrophobic effect, and electrostatic interactions. While the substantial hydrophobic forces lead to a compact conformation, they are also nonspecific and cannot serve as a guide to a conformationally unique structure. For the general folding problem, it thus appears that packing is a prime candidate for determining a particular fold. Specific hydrogen-bonding patterns and salt-bridges have also been proposed to play a role. Inspection of protein-protein interfaces reveals that the hallmarks governing single chain protein structures also determine their interactions, suggesting that similar principles underlie protein folding and protein-protein associations. This review focuses on some aspects of protein-protein interfaces, particularly on the architectures and their interactions. These are compared with those present in protein monomers. 
This task is facilitated by the recently compiled, non-redundant structural dataset of protein-protein interfaces derived from the crystallographic database. In particular, although the current view holds that protein-protein interfaces and interactions are similar to those found in the conformations of single-chain proteins, this review brings forth the differences as well. Not only is it logical that such differences would exist, it is these differences that further illuminate protein folding on the one hand and protein-protein recognition on the other. These are also particularly important in considering inhibitor (ligand) design.


Subject(s)
Protein Binding , Protein Structure, Tertiary , Amino Acids/chemistry , Hydrogen Bonding , Protein Folding , Surface Properties
20.
J Mol Biol ; 256(5): 924-38, 1996 Mar 15.
Article in English | MEDLINE | ID: mdl-8601843

ABSTRACT

Here we study the pattern of amino acid interchanges at spatially, locally conserved regions in globally dissimilar and unrelated proteins. By using a method which completely separates the amino acid sequence from its respective structure, this work addresses the question of which properties of the amino acids are the most crucial for the stability of conserved structural motifs. The proteins are taken from a structurally non-redundant dataset. The spatially conserved substructural motifs are defined as consisting of a "large enough" number of Calpha atoms found to provide a geometric match between two proteins, regardless of the order of the Calpha atoms in the sequence, or of the sequence composition of the substructures. This approach can apply to proteins with little or no sequence similarity but with sufficient structural similarity, and is unique in its ability to handle local, non-topological matches between pairs of dissimilar proteins. The method uses a computer-vision based algorithm, the Geometric Hashing. Since the Geometric Hashing ignores sequence information it lends itself to answer the question posed above. The interchanges at geometrically similar positions that have been obtained with our method demonstrate the expected behaviour. Yet, a closer inspection reveals some distinct characteristics, as compared with interchanges based upon sequence-order based techniques, or from energy-contact-based considerations. First, a pronounced division of the amino acids into two classes is displayed: Lys, Glu, Arg, Gln, Asp, Asn, Pro, Gly, Thr, Ser and His on the one hand, and Ile, Val, Leu, Phe, Met, Tyr, Trp, Cys and Ala on the other. These groups further cluster into subgroups: Lys, Glu, Arg, Gln; Asp, Asn; Pro, Gly; Ile, Val, Leu, Phe. The other amino acids stand alone. Analysis of the conservation among amino acids indicates proline to be consistently, by far, the most conserved. Next are Asp, Glu, Lys and Gly. Cys is also highly conserved. 
Interestingly, oppositely charged amino acids are interchanged roughly as frequently as those of the same charge. These observations can be explained in terms of the three-dimensional structures of the proteins. Most of all, there is a clear distinction between residues which prefer to be on the protein surfaces, compared to those frequently buried in the interiors. Analysis of the interchanges indicates their low information content. This, together with the separation into two groups, suggests that the predictive value of the spatial positions of the Calpha atoms is not much greater than the sequence alone, aside from their hydrophobicity/hydrophilicity classification.


Subject(s)
Proteins/chemistry , Proteins/genetics , Amino Acid Sequence , Animals , Cluster Analysis , Conserved Sequence , Databases, Factual , Humans , Molecular Sequence Data , Molecular Structure , Protein Conformation , Sequence Alignment/methods , Sequence Alignment/statistics & numerical data , Sequence Homology, Amino Acid , Thermodynamics