Search | VHL Regional Portal

Motif discovery with data mining in 3D protein structure databases: discovery, validation and prediction of the U-shape zinc binding ("Huf-Zinc") motif.

Maurer-Stroh, Sebastian; Gao, He; Han, Hao; Baeten, Lies; Schymkowitz, Joost; Rousseau, Frederic; Zhang, Louxin; Eisenhaber, Frank.

J Bioinform Comput Biol ; 11(1): 1340008, 2013 Feb.

Article in English | MEDLINE | ID: mdl-23427990

ABSTRACT

Data mining in protein databases, which are derived from more fundamental protein 3D structure and sequence databases, has considerable untapped potential for discovering relationships between sequence motifs, structural motifs and function, as the finding of the U-shape (Huf-Zinc) motif, originally a small student project, exemplifies. The metal ion zinc is critically involved in universal biological processes, ranging from protein-DNA complexes and transcription regulation to enzymatic catalysis and metabolic pathways. Proteins have evolved a series of motifs to specifically recognize and bind zinc ions. Many of these, the so-called zinc fingers, are structurally independent globular domains with discontinuous binding motifs made up of residues mostly far apart in sequence. Through a systematic approach starting from the BriX structure fragment database, we discovered another predictable subset of zinc-binding motifs that not only have a conserved continuous sequence pattern but also share a characteristic local conformation, despite occurring in totally different overall folds. While this does not allow general prediction of all zinc-binding motifs, an HMM-based web server, Huf-Zinc, is available for predicting these novel, as well as conventional, zinc finger motifs in protein sequences. The Huf-Zinc web server can be freely accessed at http://mendel.bii.a-star.edu.sg/METHODS/hufzinc/.

Subject(s)

Carrier Proteins/chemistry , Carrier Proteins/ultrastructure , Data Mining/methods , Databases, Protein , Models, Chemical , Models, Molecular , Sequence Analysis, Protein/methods , Amino Acid Sequence , Computer Simulation , Molecular Sequence Data , Protein Conformation

BriX: a database of protein building blocks for structural analysis, modeling and design.

Vanhee, Peter; Verschueren, Erik; Baeten, Lies; Stricher, Francois; Serrano, Luis; Rousseau, Frederic; Schymkowitz, Joost.

Nucleic Acids Res ; 39(Database issue): D435-42, 2011 Jan.

Article in English | MEDLINE | ID: mdl-20972210

ABSTRACT

High-resolution structures of proteins remain the most valuable source for understanding their function in the cell and provide leads for drug design. Since the availability of sufficient protein structures to tackle complex problems such as modeling backbone moves or docking remains a problem, alternative approaches using small, recurrent protein fragments have been employed. Here we present two databases that provide a vast resource for implementing such fragment-based strategies. The BriX database contains fragments from over 7000 non-homologous proteins from the Astral collection, segmented in lengths from 4 to 14 residues and clustered according to structural similarity, summing up to a content of 2 million fragments per length. To overcome the lack of loops classified in BriX, we constructed the Loop BriX database of non-regular structure elements, clustered according to end-to-end distance between the regular residues flanking the loop. Both databases are available online (http://brix.crg.es) and can be accessed through a user-friendly web-interface. For high-throughput queries a web-based API is provided, as well as full database downloads. In addition, two exciting applications are provided as online services: (i) user-submitted structures can be covered on the fly with BriX classes, representing putative structural variation throughout the protein and (ii) gaps or low-confidence regions in these structures can be bridged with matching fragments.
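The segmentation step described in this abstract (fragments of 4 to 14 residues) can be illustrated with a minimal sketch. It assumes overlapping sliding windows over a single chain, which the abstract does not state; the function name `segment_fragments` and the example sequence are hypothetical and are not part of the BriX API.

```python
def segment_fragments(residues, min_len=4, max_len=14):
    """Split a chain into every contiguous window of each length.

    Assumption: windows overlap and advance one residue at a time.
    Returns a dict mapping fragment length to a list of fragments.
    """
    fragments = {}
    for length in range(min_len, max_len + 1):
        # A chain shorter than `length` yields an empty list.
        fragments[length] = [
            residues[start:start + length]
            for start in range(len(residues) - length + 1)
        ]
    return fragments


if __name__ == "__main__":
    # Hypothetical 20-residue sequence, used only for illustration.
    frags = segment_fragments("ACDEFGHIKLMNPQRSTVWY")
    for length, items in frags.items():
        print(length, len(items))
```

In a real pipeline each window would carry backbone coordinates rather than one-letter codes, and the windows of each length would then be clustered by structural similarity.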

Subject(s)

Databases, Protein , Protein Conformation , Models, Molecular , Proteins/chemistry , User-Computer Interface

PepX: a structural database of non-redundant protein-peptide complexes.

Vanhee, Peter; Reumers, Joke; Stricher, Francois; Baeten, Lies; Serrano, Luis; Schymkowitz, Joost; Rousseau, Frederic.

Nucleic Acids Res ; 38(Database issue): D545-51, 2010 Jan.

Article in English | MEDLINE | ID: mdl-19880386

ABSTRACT

Although protein-peptide interactions are estimated to constitute up to 40% of all protein interactions, relatively little information is available for the structural details of these interactions. Peptide-mediated interactions are a prime target for drug design because they are predominantly present in signaling and regulatory networks. A reliable data set of nonredundant protein-peptide complexes is indispensable as a basis for modeling and design, but current data sets for protein-peptide interactions are often biased towards specific types of interactions or are limited to interactions with small ligands. In PepX (http://pepx.switchlab.org), we have designed an unbiased and exhaustive data set of all protein-peptide complexes available in the Protein Data Bank with peptide lengths up to 35 residues. In addition, these complexes have been clustered based on their binding interfaces rather than sequence homology, providing a set of structurally diverse protein-peptide interactions. The final data set contains 505 unique protein-peptide interface clusters from 1431 complexes. Thorough annotation of each complex with both biological and structural information facilitates searching for and browsing through individual complexes and clusters. Moreover, we provide an additional source of data for peptide design by annotating peptides with naturally occurring backbone variations using fragment clusters from the BriX database.

Subject(s)

Computational Biology/methods , Databases, Protein , Protein Interaction Mapping/methods , Animals , Computational Biology/trends , Humans , Information Storage and Retrieval/methods , Internet , Ligands , Peptides/chemistry , Protein Structure, Tertiary , Proteins/chemistry , Signal Transduction , Software

Protein-peptide interactions adopt the same structural motifs as monomeric protein folds.

Vanhee, Peter; Stricher, Francois; Baeten, Lies; Verschueren, Erik; Lenaerts, Tom; Serrano, Luis; Rousseau, Frederic; Schymkowitz, Joost.

Structure ; 17(8): 1128-36, 2009 Aug 12.

Article in English | MEDLINE | ID: mdl-19679090

ABSTRACT

We compared the modes of interaction between protein-peptide interfaces and those observed within monomeric proteins and found surprisingly few differences. Over 65% of 731 protein-peptide interfaces could be reconstructed within 1 Å RMSD using solely fragment interactions occurring in monomeric proteins. Interestingly, more than 80% of interacting fragments used in reconstructing a protein-peptide binding site were obtained from monomeric proteins of an entirely different structural classification, with an average sequence identity below 15%. Nevertheless, geometric properties perfectly match the interaction patterns observed within monomeric proteins. We show the usefulness of our approach by redesigning the interaction scaffold of nine protein-peptide complexes, for which five of the peptides can be modeled within 1 Å RMSD of the original peptide position. These data suggest that the wealth of structural data on monomeric proteins could be harvested to model protein-peptide interactions and, more importantly, that sequence homology is no prerequisite.
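The RMSD threshold quoted in this abstract can be made concrete with a small sketch of the standard root-mean-square deviation between two equally sized sets of atom coordinates. It assumes the two sets are already superposed; an optimal superposition step (for example the Kabsch algorithm) is deliberately omitted, and the function name `rmsd` and the coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two lists of (x, y, z) tuples.

    Assumption: both sets are already superposed and atoms are paired
    by index. No alignment is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equal in size")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Hypothetical coordinates in Angstrom: every atom is shifted by 1 along z.
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    model = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(rmsd(reference, model))
```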

Subject(s)

Peptides/chemistry , Protein Folding , Protein Interaction Mapping/methods , Proteins/chemistry , Algorithms , Binding Sites , Databases, Protein , Models, Molecular , Protein Binding , Protein Structure, Secondary , Protein Structure, Tertiary , Structure-Activity Relationship

Reconstruction of protein backbones from the BriX collection of canonical protein fragments.

Baeten, Lies; Reumers, Joke; Tur, Vicente; Stricher, François; Lenaerts, Tom; Serrano, Luis; Rousseau, Frederic; Schymkowitz, Joost.

PLoS Comput Biol ; 4(5): e1000083, 2008 May 23.

Article in English | MEDLINE | ID: mdl-18483555

ABSTRACT

As modeling of changes in backbone conformation still lacks a computationally efficient solution, we developed a discretisation of the conformational states accessible to the protein backbone similar to the successful rotamer approach in side chains. The BriX fragment database, consisting of fragments from 4 to 14 residues long, was realized through identification of recurrent backbone fragments from a non-redundant set of high-resolution protein structures. BriX contains an alphabet of more than 1,000 frequently observed conformations per peptide length for 6 different variation levels. Analysis of the performance of BriX revealed an average structural coverage of protein structures of more than 99% within a root mean square distance (RMSD) of 1 Angstrom. Globally, we are able to reconstruct protein structures with an average accuracy of 0.48 Angstrom RMSD. As expected, regular structures are well covered, but, interestingly, many loop regions that appear irregular at first glance are also found to form a recurrent structural motif, albeit with lower frequency of occurrence than regular secondary structures. Larger loop regions could be completely reconstructed from smaller recurrent elements, between 4 and 8 residues long. Finally, we observed that a significant number of short sequences tend to display strong structural ambiguity between alpha helix and extended conformations. When the sequence length increases, this so-called sequence plasticity is no longer observed, illustrating the context dependency of polypeptide structures.
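The coverage figure reported in this abstract can be read as the fraction of fragments whose best library match falls within an RMSD cutoff. The sketch below assumes that reading, treats the cutoff as inclusive, and takes precomputed best-match RMSD values as input; the paper's exact definition may differ, and the function name `structural_coverage` and the sample values are hypothetical.

```python
def structural_coverage(best_rmsds, threshold=1.0):
    """Fraction of fragments whose best match is within `threshold` Angstrom.

    Assumption: `best_rmsds` holds, for each query fragment, the lowest
    RMSD found against any fragment class in the library.
    """
    if not best_rmsds:
        raise ValueError("need at least one RMSD value")
    covered = sum(1 for value in best_rmsds if value <= threshold)
    return covered / len(best_rmsds)


if __name__ == "__main__":
    # Hypothetical best-match RMSD values, not data from the paper.
    print(structural_coverage([0.2, 0.5, 1.0, 1.4]))
```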

Subject(s)

Databases, Protein , Models, Chemical , Models, Molecular , Peptide Fragments/chemistry , Proteins/chemistry , Proteins/ultrastructure , Sequence Analysis, Protein/methods , Amino Acid Sequence , Computer Simulation , Molecular Sequence Data , Protein Conformation
