Search | VHL Regional Portal

Potentials of mean force for protein structure prediction vindicated, formalized and generalized.

Hamelryck, Thomas; Borg, Mikael; Paluszewski, Martin; Paulsen, Jonas; Frellsen, Jes; Andreetta, Christian; Boomsma, Wouter; Bottaro, Sandro; Ferkinghoff-Borg, Jesper.

PLoS One ; 5(11): e13714, 2010 Nov 10.

Article in English | MEDLINE | ID: mdl-21103041

ABSTRACT

Understanding protein structure is of crucial importance in science, medicine and biotechnology. For about two decades, knowledge-based potentials based on pairwise distances--so-called "potentials of mean force" (PMFs)--have been center stage in the prediction and design of protein structure and the simulation of protein folding. However, the validity, scope and limitations of these potentials are still vigorously debated and disputed, and the optimal choice of the reference state--a necessary component of these potentials--is an unsolved problem. PMFs are loosely justified by analogy to the reversible work theorem in statistical physics, or by a statistical argument based on a likelihood function. Both justifications are insightful but leave many questions unanswered. Here, we show for the first time that PMFs can be seen as approximations to quantities that do have a rigorous probabilistic justification: they naturally arise when probability distributions over different features of proteins need to be combined. We call these quantities "reference ratio distributions" deriving from the application of the "reference ratio method." This new view is not only of theoretical relevance but leads to many insights that are of direct practical use: the reference state is uniquely defined and does not require external physical insights; the approach can be generalized beyond pairwise distances to arbitrary features of protein structure; and it becomes clear for which purposes the use of these quantities is justified. We illustrate these insights with two applications, involving the radius of gyration and hydrogen bonding. In the latter case, we also show how the reference ratio method can be iteratively applied to sculpt an energy funnel. Our results considerably increase the understanding and scope of energy functions derived from known biomolecular structures.
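As a toy illustration of how combining distributions over different features leads to a ratio with a reference term, the minimal Python sketch below assumes a small discrete state space, a hypothetical base distribution `q`, a coarse feature map `feature` (parity here) and a desired distribution `target` over that feature. None of these names or numbers come from the article; the construction p(x) = q(x) * target(f(x)) / q_f(f(x)) is this sketch's reading of the reference ratio idea, in which the reference term q_f is simply the distribution of the feature under q.

```python
from collections import defaultdict


def marginal(dist, feature):
    """Distribution of feature(x) when x is drawn from dist."""
    out = defaultdict(float)
    for x, p in dist.items():
        out[feature(x)] += p
    return dict(out)


def reference_ratio(q, feature, target):
    """Reweight q so that the feature follows `target`.

    p(x) = q(x) * target(f(x)) / q_f(f(x)), where q_f is the
    distribution of the feature under q (the "reference").
    Assumes every feature value has non-zero probability under q.
    """
    q_f = marginal(q, feature)
    return {x: qx * target[feature(x)] / q_f[feature(x)]
            for x, qx in q.items()}


# Hypothetical toy data: six states, feature = parity of the state.
q = dict(zip(range(6), [0.1, 0.2, 0.3, 0.1, 0.2, 0.1]))
parity = lambda x: x % 2
target = {0: 0.5, 1: 0.5}

p = reference_ratio(q, parity, target)
p_f = marginal(p, parity)
```

In this toy case the feature distribution under `p` matches `target`, while the relative weights of states sharing the same feature value are unchanged from `q`.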

Subject(s)

Algorithms , Computational Biology/methods , Protein Conformation , Protein Folding , Hydrogen Bonding , Models, Molecular , Reproducibility of Results , Thermodynamics

Beyond rotamers: a generative, probabilistic model of side chains in proteins.

Harder, Tim; Boomsma, Wouter; Paluszewski, Martin; Frellsen, Jes; Johansson, Kristoffer E; Hamelryck, Thomas.

BMC Bioinformatics ; 11: 306, 2010 Jun 05.

Article in English | MEDLINE | ID: mdl-20525384

ABSTRACT

BACKGROUND: Accurately covering the conformational space of amino acid side chains is essential for important applications such as protein design, docking and high resolution structure prediction. Today, the most common way to capture this conformational space is through rotamer libraries - discrete collections of side chain conformations derived from experimentally determined protein structures. The discretization can be exploited to efficiently search the conformational space. However, discretizing this naturally continuous space comes at the cost of losing detailed information that is crucial for certain applications. For example, rigorously combining rotamers with physical force fields is associated with numerous problems. RESULTS: In this work we present BASILISK: a generative, probabilistic model of the conformational space of side chains that makes it possible to sample in continuous space. In addition, sampling can be conditional upon the protein's detailed backbone conformation, again in continuous space - without involving discretization. CONCLUSIONS: A careful analysis of the model and a comparison with various rotamer libraries indicates that the model forms an excellent, fully continuous model of side chain conformational space. We also illustrate how the model can be used for rigorous, unbiased sampling with a physical force field, and how it improves side chain prediction when used as a pseudo-energy term. In conclusion, BASILISK is an important step forward on the way to a rigorous probabilistic description of protein structure in continuous space and in atomic detail.

Subject(s)

Models, Statistical , Proteins/chemistry , Models, Molecular , Protein Conformation

Mocapy++--a toolkit for inference and learning in dynamic Bayesian networks.

Paluszewski, Martin; Hamelryck, Thomas.

BMC Bioinformatics ; 11: 126, 2010 Mar 12.

Article in English | MEDLINE | ID: mdl-20226024

ABSTRACT

BACKGROUND: Mocapy++ is a toolkit for parameter learning and inference in dynamic Bayesian networks (DBNs). It supports a wide range of DBN architectures and probability distributions, including distributions from directional statistics (the statistics of angles, directions and orientations). RESULTS: The program package is freely available under the GNU General Public Licence (GPL) from SourceForge (http://sourceforge.net/projects/mocapy). The package contains the source for building the Mocapy++ library, several usage examples and the user manual. CONCLUSIONS: Mocapy++ is especially suitable for constructing probabilistic models of biomolecular structure, due to its support for directional statistics. In particular, it supports the Kent distribution on the sphere and the bivariate von Mises distribution on the torus. These distributions have proven useful to formulate probabilistic models of protein and RNA structure in atomic detail.

Subject(s)

Bayes Theorem , Software , Models, Statistical , Protein Conformation , Proteins/chemistry , RNA/chemistry

Applying Undertaker to quality assessment.

Archie, John G; Paluszewski, Martin; Karplus, Kevin.

Proteins ; 77 Suppl 9: 191-5, 2009.

Article in English | MEDLINE | ID: mdl-19639637

ABSTRACT

Our group tested three quality assessment functions in CASP8: a function which used only distance constraints derived from alignments (SAM-T08-MQAO), a function which added other single-model terms to the distance constraints (SAM-T08-MQAU), and a function which used both single-model and consensus terms (SAM-T08-MQAC). We analyzed the functions both for ranking models for a single target and for producing an accurate estimate of GDT_TS. Our functions were optimized for the ranking problem, so are perhaps more appropriate for metaserver applications than for providing trustworthiness estimates for single models. On the CASP8 test, the functions with more terms performed better. The MQAC consensus method was substantially better than either single-model function, and the MQAU function was substantially better than the MQAO function that used only constraints from alignments.

Subject(s)

Computational Biology/methods , Proteins/chemistry , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Models, Molecular , Protein Conformation , Software

Model quality assessment using distance constraints from alignments.

Paluszewski, Martin; Karplus, Kevin.

Proteins ; 75(3): 540-9, 2009 May 15.

Article in English | MEDLINE | ID: mdl-19003987

ABSTRACT

Given a set of alternative models for a specific protein sequence, the model quality assessment (MQA) problem asks for an assignment of scores to each model in the set. A good MQA program assigns these scores such that they correlate well with the real quality of the models, ideally scoring best the model that is closest to the true structure. In this article, we present a new approach for addressing the MQA problem. It is based on distance constraints extracted from alignments to templates of known structure, and is implemented in the Undertaker program for protein structure prediction. One novel feature is that we extract noncontact constraints as well as contact constraints. We describe how the distance constraint extraction is done and we show how the constraints can be used to address the MQA problem. We have evaluated our method on CASP7 targets and the results show that our method is at least comparable with the best MQA methods that were assessed at CASP7. We also propose a new evaluation measure, Kendall's tau, that is more interpretable than conventional measures used for evaluating MQA methods (Pearson's r and Spearman's rho). We show clear examples where Kendall's tau agrees much more with our intuition of a correct MQA, and we therefore propose that Kendall's tau be used for future CASP MQA assessments.
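To make the interpretability claim concrete, here is a minimal Python sketch of Kendall's tau in its simplest form (tau-a, with no correction for ties): the fraction of model pairs that the predicted scores order the same way as the true quality, minus the fraction ordered the opposite way. The function name and the example scores are hypothetical, and the article may use a different variant of the statistic.

```python
from itertools import combinations


def kendall_tau(predicted, actual):
    """Kendall's tau-a between two equally long score lists.

    A pair of models is concordant when both lists rank it in the
    same order, discordant when they disagree. Tied pairs count as
    neither, so they pull the value towards zero.
    """
    if len(predicted) != len(actual):
        raise ValueError("score lists must have the same length")
    n = len(predicted)
    if n < 2:
        raise ValueError("need at least two models")
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        s = (predicted[i] - predicted[j]) * (actual[i] - actual[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical scores for three models: one pair is ranked wrongly.
tau = kendall_tau([1.0, 2.0, 3.0], [1.0, 3.0, 2.0])
```

Here two of the three pairs are ordered correctly and one is not, so `tau` is (2 - 1) / 3.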

Subject(s)

Algorithms , Computational Biology/methods , Proteins/chemistry , Caspase 7/chemistry , Computer Simulation , Humans , Models, Molecular , Protein Conformation , Reproducibility of Results

Reconstructing protein structure from solvent exposure using tabu search.

Paluszewski, Martin; Hamelryck, Thomas; Winter, Pawel.

Algorithms Mol Biol ; 1: 20, 2006 Oct 27.

Article in English | MEDLINE | ID: mdl-17069644

ABSTRACT

BACKGROUND: A new, promising solvent exposure measure, called half-sphere-exposure (HSE), has recently been proposed. Here, we study the reconstruction of a protein's Cα trace solely from structure-derived HSE information. This problem is of relevance for de novo structure prediction using predicted HSE measures. For comparison, we also consider the well-established contact number (CN) measure. We define energy functions based on the HSE- or CN-vectors and minimize them using two conformational search heuristics: Monte Carlo simulation (MCS) and tabu search (TS). While MCS has been the dominant conformational search heuristic in the literature, TS has been applied only a few times. To discretize the conformational space, we use lattice models of varying complexity. RESULTS: The proposed TS heuristic with a novel tabu definition generally performs better than MCS for this problem. Our experiments show that, at least for small proteins (up to 35 amino acids), it is possible to reconstruct the protein backbone solely from the HSE or CN information. In general, the HSE measure leads to better models than the CN measure, as judged by the RMSD and the angle correlation with the native structure. The angle correlation, a measure of structural similarity, evaluates whether equivalent residues in two structures have the same general orientation. Our results indicate that the HSE measure is potentially very useful to represent solvent exposure in protein structure prediction, design and simulation.
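The two exposure measures compared above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the side-chain direction is approximated from the two neighbouring Cα atoms, the default radius of 13 Å is an assumed parameter, and the function names and coordinates are hypothetical.

```python
import math


def contact_number(coords, i, radius=13.0):
    """Number of other C-alpha atoms within `radius` of residue i."""
    return sum(1 for j, c in enumerate(coords)
               if j != i and math.dist(c, coords[i]) <= radius)


def half_sphere_exposure(coords, i, radius=13.0):
    """Return (up, down) counts for residue i.

    Assumption of this sketch: the "up" direction is the sum of the
    vectors pointing from each chain neighbour towards residue i, so
    the chain ends (which lack a neighbour) are not supported.
    """
    if i == 0 or i == len(coords) - 1:
        raise ValueError("direction undefined at the chain ends")
    ci, prev, nxt = coords[i], coords[i - 1], coords[i + 1]
    direction = tuple((a - p) + (a - n) for a, p, n in zip(ci, prev, nxt))
    up = down = 0
    for j, c in enumerate(coords):
        if j == i or math.dist(c, ci) > radius:
            continue
        # Sign of the projection onto the direction picks the half-sphere.
        dot = sum((x - a) * d for x, a, d in zip(c, ci, direction))
        if dot >= 0:
            up += 1
        else:
            down += 1
    return up, down


# Hypothetical coordinates; residue 1 sits at the origin.
coords = [(-1, -1, 0), (0, 0, 0), (1, -1, 0),
          (0, 3, 0), (0, -3, 0), (0, 50, 0)]
up, down = half_sphere_exposure(coords, 1)
cn = contact_number(coords, 1)
```

Because every atom inside the sphere falls into exactly one half-sphere, the two HSE counts always sum to the contact number; HSE splits the same neighbourhood by direction.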
