Results 1 - 20 of 73
1.
Proteomics ; 23(17): e2300219, 2023 09.
Article in English | MEDLINE | ID: mdl-37667816

ABSTRACT

Structural characterization of protein interactions is essential for our ability to understand and modulate physiological processes. Computational approaches to modeling of protein complexes provide structural information that far exceeds capabilities of the existing experimental techniques. Protein structure prediction in general, and prediction of protein interactions in particular, has been revolutionized by the rapid progress in Deep Learning techniques. The work of Schweke et al. (Proteomics 2023, 23, 2200323) presents a community-wide study of an important problem of distinguishing physiological protein-protein complexes/interfaces (experimentally determined or modeled) from non-physiological ones. The authors designed and generated a large benchmark set of physiological and non-physiological homodimeric complexes, and evaluated a large set of scoring functions, as well as AlphaFold predictions, on their ability to discriminate the non-physiological interfaces. The problem of separating physiological interfaces from non-physiological ones is very difficult, largely due to the lack of a clear distinction between the two categories in a crowded environment inside a living cell. Still, the ability to identify key physiologically significant interfaces in the variety of possible configurations of a protein-protein complex is important. The study presents a major data resource and methodological development in this important direction for molecular and cellular biology.


Subject(s)
Benchmarking , Proteomics
2.
Front Mol Biosci ; 9: 1031225, 2022.
Article in English | MEDLINE | ID: mdl-36425657

ABSTRACT

Association of proteins to a significant extent is determined by their geometric complementarity. Large-scale recognition factors, which directly relate to the funnel-like intermolecular energy landscape, provide important insights into the basic rules of protein recognition. Previously, we showed that simple energy functions and coarse-grained models reveal major characteristics of the energy landscape. As new computational approaches increasingly address structural modeling of a whole cell at the molecular level, it becomes important to account for the crowded environment inside the cell. The crowded environment drastically changes protein recognition properties, and thus significantly alters the underlying energy landscape. In this study, we addressed the effect of crowding on the protein binding funnel, focusing on the size of the funnel. As crowders occupy the funnel volume, they make it less accessible to the ligands. Thus, the funnel size, which can be defined by ligand occupancy, is generally reduced with the increase of the crowder concentration. This study quantifies this reduction for different concentrations of crowders and correlates this dependence with the structural details of the interacting proteins. The results provide a better understanding of the rules of protein association in the crowded environment.

3.
Protein Sci ; 31(12): e4481, 2022 12.
Article in English | MEDLINE | ID: mdl-36281025

ABSTRACT

Structural information of protein-protein interactions is essential for characterization of life processes at the molecular level. While a small fraction of known protein interactions has experimentally determined structures, computational modeling of protein complexes (protein docking) has to fill the gap. The Dockground resource (http://dockground.compbio.ku.edu) provides a collection of datasets for the development and testing of protein docking techniques. Currently, Dockground contains datasets for the bound and the unbound (experimentally determined and simulated) protein structures, model-model complexes, docking decoys of experimentally determined and modeled proteins, and templates for comparative docking. The Dockground bound proteins dataset is a core set, from which other Dockground datasets are generated. It is devised as a relational PostgreSQL database containing information on experimentally determined protein-protein complexes. This report on the Dockground resource describes current status of the datasets, new automated update procedures and further development of the core datasets. We also present a new Dockground interactive web interface, which allows search by various parameters, such as release date, multimeric state, complex type, structure resolution, and so on, visualization of the search results with a number of customizable parameters, as well as downloadable datasets with predefined levels of sequence and structure redundancy.


Subject(s)
Proteins , Software , Proteins/chemistry , Computer Simulation , Protein Binding , Molecular Docking Simulation , Protein Conformation , Computational Biology/methods
4.
Proc Natl Acad Sci U S A ; 119(41): e2210249119, 2022 10 11.
Article in English | MEDLINE | ID: mdl-36191203

ABSTRACT

Computational methodologies are increasingly addressing modeling of the whole cell at the molecular level. Proteins and their interactions are the key component of cellular processes. Techniques for modeling protein interactions, thus far, have included protein docking and molecular simulation. The latter approaches account for the dynamics of the interactions but are relatively slow, if carried out at all-atom resolution, or are significantly coarse grained. Protein docking algorithms are far more efficient in sampling spatial coordinates. However, they do not account for the kinetics of the association (i.e., they do not involve the time coordinate). Our proof-of-concept study bridges the two modeling approaches, developing an approach that can reach unprecedented simulation timescales at all-atom resolution. The global intermolecular energy landscape of a large system of proteins was mapped by the pairwise fast Fourier transform docking and sampled in space and time by Monte Carlo simulations. The simulation protocol was parametrized on existing data and validated on a number of observations from experiments and molecular dynamics simulations. The simulation protocol performed consistently across very different systems of proteins at different protein concentrations. It recapitulated data on the previously observed protein diffusion rates and aggregation. The speed of calculation allows reaching second-long trajectories of protein systems that approach the size of the cells, at atomic resolution.


Subject(s)
Molecular Dynamics Simulation , Proteins , Algorithms , Biophysical Phenomena , Kinetics , Monte Carlo Method
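
The abstract above describes sampling a precomputed intermolecular energy landscape in space and time by Monte Carlo simulation. As a rough illustration only, the sketch below runs a Metropolis walk over a one-dimensional ring of precomputed energies. The function name `metropolis_walk`, the toy landscape, and the `kT` value are assumptions made for this example; the published protocol, its move set, and its parametrization are not given in the abstract and are not reproduced here.

```python
import math
import random

def metropolis_walk(energies, start, steps, kT=1.0, seed=0):
    """Sample positions on a 1D ring of precomputed energies with Metropolis moves."""
    rng = random.Random(seed)
    n = len(energies)
    pos = start
    trajectory = [pos]
    for _ in range(steps):
        # Propose a move to a neighbouring grid point (periodic boundaries).
        trial = (pos + rng.choice((-1, 1))) % n
        delta = energies[trial] - energies[pos]
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            pos = trial
        trajectory.append(pos)
    return trajectory

# Hypothetical toy landscape with a single deep minimum at index 5.
toy_energies = [0.0, -0.5, -1.0, -2.0, -4.0, -8.0, -4.0, -2.0, -1.0, -0.5]
traj = metropolis_walk(toy_energies, start=0, steps=2000, kT=1.0)
```

In this toy setting the energies play the role of a landscape mapped in advance, so each Monte Carlo step is only a table lookup, which is the general reason precomputing the landscape makes long trajectories affordable.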
5.
J Mol Biol ; 434(11): 167608, 2022 06 15.
Article in English | MEDLINE | ID: mdl-35662458

ABSTRACT

Rapid progress in structural modeling of proteins and their interactions is powered by advances in knowledge-based methodologies along with better understanding of physical principles of protein structure and function. The pool of structural data for modeling of proteins and protein-protein complexes is constantly increasing due to the rapid growth of protein interaction databases and Protein Data Bank. The GWYRE (Genome Wide PhYRE) project capitalizes on these developments by advancing and applying new powerful modeling methodologies to structural modeling of protein-protein interactions and genetic variation. The methods integrate knowledge-based tertiary structure prediction using Phyre2 and quaternary structure prediction using template-based docking by a full-structure alignment protocol to generate models for binary complexes. The predictions are incorporated in a comprehensive public resource for structural characterization of the human interactome and the location of human genetic variants. The GWYRE resource facilitates better understanding of principles of protein interaction and structure/function relationships. The resource is available at http://www.gwyre.org.


Subject(s)
Protein Interaction Mapping , Proteins , Software , Binding Sites , Computational Biology/methods , Databases, Protein , Humans , Molecular Docking Simulation , Protein Binding , Protein Interaction Mapping/methods , Proteins/chemistry
6.
PLoS One ; 17(5): e0267531, 2022.
Article in English | MEDLINE | ID: mdl-35580077

ABSTRACT

Membrane proteins are significantly underrepresented in Protein Data Bank despite their essential role in cellular mechanisms and the major progress in experimental protein structure determination. Thus, computational approaches are especially valuable in the case of membrane proteins and their assemblies. The main focus in developing structure prediction techniques has been on soluble proteins, in part due to much greater availability of the structural data. Currently, structure prediction of protein complexes (protein docking) is a well-developed field of study. However, the generic protein docking approaches are not optimal for the membrane proteins because of the differences in physicochemical environment and the spatial constraints imposed by the membranes. Thus, docking of the membrane proteins requires specialized computational methods. Development and benchmarking of the membrane protein docking approaches has to be based on high-quality sets of membrane protein complexes. In this study we present a new dataset of 456 non-redundant alpha helical binary interfaces. The set is significantly larger and more representative than the previously developed sets. In the future, it will become the basis for the development of docking and scoring benchmarks, similar to the ones for soluble proteins in the Dockground resource http://dockground.compbio.ku.edu.


Subject(s)
Benchmarking , Membrane Proteins , Computational Biology/methods , Databases, Protein , Molecular Docking Simulation , Protein Binding , Software
7.
Proteins ; 90(6): 1259-1266, 2022 06.
Article in English | MEDLINE | ID: mdl-35072956

ABSTRACT

Protein docking protocols typically involve global docking scan, followed by re-ranking of the scan predictions by more accurate scoring functions that are either computationally too expensive or algorithmically impossible to include in the global scan. Development and validation of scoring methodologies are often performed on scoring benchmark sets (docking decoys) which offer concise and nonredundant representation of the global docking scan output for a large and diverse set of protein-protein complexes. Two such protein-protein scoring benchmarks were built for the Dockground resource, which contains various datasets for the development and testing of protein docking methodologies. One set was generated based on the Dockground unbound docking benchmark 4, and the other based on protein models from the Dockground model-model benchmark 2. The docking decoys were designed to reflect the reality of the real-case docking applications (e.g., correct docking predictions defined as near-native rather than native structures), and to minimize applicability of approaches not directly related to the development of scoring functions (reducing clustering of predictions in the binding funnel and disparity in structural quality of the near-native and nonnative matches). The sets were further characterized by the source organism and the function of the protein-protein complexes. The sets, freely available to the research community on the Dockground webpage, present a unique, user-friendly resource for the developing and testing of protein-protein scoring approaches.


Subject(s)
Benchmarking , Proteins , Molecular Docking Simulation , Protein Binding , Protein Conformation , Proteins/chemistry
8.
Protein Sci ; 30(2): 381-390, 2021 02.
Article in English | MEDLINE | ID: mdl-33166001

ABSTRACT

Structures of proteins and protein-protein complexes are determined by the same physical principles and thus share a number of similarities. At the same time, there could be differences because in order to function, proteins interact with other molecules, undergo conformational changes, and so forth, which might impose different restraints on the tertiary versus quaternary structures. This study focuses on structural properties of protein-protein interfaces in comparison with the protein core, based on the wealth of currently available structural data and new structure-based approaches. The results showed that physicochemical characteristics, such as amino acid composition, residue-residue contact preferences, and hydrophilicity/hydrophobicity distributions, are similar in protein core and protein-protein interfaces. On the other hand, characteristics that reflect the evolutionary pressure, such as structural composition and packing, are largely different. The results provide important insight into fundamental properties of protein structure and function. At the same time, the results contribute to better understanding of the ways to dock proteins. Recent progress in predicting structures of individual proteins follows the advancement of deep learning techniques and new approaches to residue coevolution data. Protein core could potentially provide large amounts of data for application of the deep learning to docking. However, our results showed that the core motifs are significantly different from those at protein-protein interfaces, and thus may not be directly useful for docking. At the same time, such difference may help to overcome a major obstacle in application of the coevolutionary data to docking: discrimination of the intramolecular information not directly relevant to docking.


Subject(s)
Databases, Protein , Protein Interaction Mapping , Proteins/chemistry , Sequence Alignment , Software , Amino Acid Sequence , Proteins/genetics
9.
Bioinformatics ; 37(4): 497-505, 2021 05 01.
Article in English | MEDLINE | ID: mdl-32960948

ABSTRACT

MOTIVATION: Procedures for structural modeling of protein-protein complexes (protein docking) produce a number of models which need to be further analyzed and scored. Scoring can be based on independently determined constraints on the structure of the complex, such as knowledge of amino acids essential for the protein interaction. Previously, we showed that text mining of residues in freely available PubMed abstracts of papers on studies of protein-protein interactions may generate such constraints. However, absence of post-processing of the spotted residues reduced usability of the constraints, as a significant number of the residues were not relevant for the binding of the specific proteins. RESULTS: We explored filtering of the irrelevant residues by two machine learning approaches, Deep Recursive Neural Network (DRNN) and Support Vector Machine (SVM) models with different training/testing schemes. The results showed that the DRNN model is superior to the SVM model when training is performed on the PMC-OA full-text articles and applied to classification (interface or non-interface) of the residues spotted in the PubMed abstracts. When both training and testing are performed on full-text articles or on abstracts, the performance of these models is similar. Thus, in such cases, there is no need to utilize the DRNN approach, which is computationally expensive, especially at the training stage. The reason is that SVM success is often determined by the similarity in data/text patterns in the training and the testing sets, whereas the sentence structures in the abstracts are, in general, different from those in the full-text articles. AVAILABILITY AND IMPLEMENTATION: The code and the datasets generated in this study are available at https://gitlab.ku.edu/vakser-lab-public/text-mining/-/tree/2020-09-04. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Data Mining , Machine Learning , Proteins , PubMed , Support Vector Machine
10.
Curr Opin Struct Biol ; 64: 160-165, 2020 10.
Article in English | MEDLINE | ID: mdl-32836051

ABSTRACT

Current developments in protein docking aim at improvement of applicability, accuracy and utility of modeling macromolecular complexes. The challenges include the need for greater emphasis on protein docking to molecules of different types, proper accounting for conformational flexibility upon binding, new promising methodologies based on residue co-evolution and deep learning, affinity prediction, and further development of fully automated docking servers. Importantly, new developments increasingly focus on realistic modeling of protein interactions in vivo, including crowded environment inside a cell, which involves multiple transient encounters, and propagating the system in time. This opinion paper offers the author's perspective on these challenges in structural modeling of protein interactions and the future of protein docking.


Subject(s)
Proteins , Molecular Docking Simulation , Protein Binding , Proteins/metabolism
11.
Methods Mol Biol ; 2165: 289-300, 2020.
Article in English | MEDLINE | ID: mdl-32621232

ABSTRACT

Databases of protein-protein complexes are essential for the development of protein modeling/docking techniques. Such databases provide a knowledge base for docking algorithms, intermolecular potentials, search procedures, scoring functions, and refinement protocols. Development of docking techniques requires systematic validation of the modeling protocols on carefully curated benchmark sets of complexes. We present a description and a guide to the DOCKGROUND resource ( http://dockground.compbio.ku.edu ) for structural modeling of protein interactions. The resource integrates various datasets of protein complexes and other data for the development and testing of protein docking techniques. The sets include bound complexes, experimentally determined unbound, simulated unbound, model-model complexes, and docking decoys. The datasets are available to the user community through a Web interface.


Subject(s)
Molecular Docking Simulation/methods , Protein Conformation , Software , Benchmarking , Molecular Docking Simulation/standards , Protein Binding
12.
Proteins ; 88(9): 1180-1188, 2020 09.
Article in English | MEDLINE | ID: mdl-32170770

ABSTRACT

Protein docking is essential for structural characterization of protein interactions. Besides providing the structure of protein complexes, modeling of proteins and their complexes is important for understanding the fundamental principles and specific aspects of protein interactions. The accuracy of protein modeling, in general, is still less than that of the experimental approaches. Thus, it is important to investigate the applicability of docking techniques to modeled proteins. We present new comprehensive benchmark sets of protein models for the development and validation of protein docking, as well as a systematic assessment of free and template-based docking techniques on these sets. As opposed to previous studies, the benchmark sets reflect the real case modeling/docking scenario where the accuracy of the models is assessed by the modeling procedure, without reference to the native structure (which would be unknown in practical applications). We also expanded the analysis to include docking of protein pairs where proteins have different structural accuracy. The results show that, in general, the template-based docking is less sensitive to the structural inaccuracies of the models than the free docking. The near-native docking poses generated by the template-based approach, typically, also have higher ranks than those produced by the free docking (although the free docking is indispensable in modeling the multiplicity of protein interactions in a crowded cellular environment). The results show that docking techniques are applicable to protein models in a broad range of modeling accuracy. The study provides clear guidelines for practical applications of docking to protein models.


Subject(s)
Benchmarking/statistics & numerical data , Molecular Docking Simulation , Proteins/chemistry , Software , Amino Acid Sequence , Binding Sites , Databases, Protein , Protein Binding , Protein Structure, Secondary
13.
Proteins ; 88(8): 1070-1081, 2020 08.
Article in English | MEDLINE | ID: mdl-31994759

ABSTRACT

Comparative docking is based on experimentally determined structures of protein-protein complexes (templates), following the paradigm that proteins with similar sequences and/or structures form similar complexes. Modeling utilizing structure similarity of target monomers to template complexes significantly expands structural coverage of the interactome. Template-based docking by structure alignment can be performed for the entire structures or by aligning targets to the bound interfaces of the experimentally determined complexes. Systematic benchmarking of docking protocols based on full and interface structure alignment showed that both protocols perform similarly, with top 1 docking success rate 26%. However, in terms of the models' quality, the interface-based docking performed marginally better. The interface-based docking is preferable when one would suspect a significant conformational change in the full protein structure upon binding, for example, a rearrangement of the domains in multidomain proteins. Importantly, if the same structure is selected as the top template by both full and interface alignment, the docking success rate increases 2-fold for both top 1 and top 10 predictions. Matching structural annotations of the target and template proteins for template detection, as a computationally less expensive alternative to structural alignment, did not improve the docking performance. Sophisticated remote sequence homology detection added templates to the pool of those identified by structure-based alignment, suggesting that for practical docking, the combination of the structure alignment protocols and the remote sequence homology detection may be useful in order to avoid potential flaws in generation of the structural templates library.


Subject(s)
Molecular Docking Simulation , Peptides/chemistry , Proteins/chemistry , Software , Amino Acid Sequence , Animals , Benchmarking , Binding Sites , Dogs , Escherichia coli/chemistry , Humans , Ligands , Peptides/metabolism , Protein Binding , Protein Conformation, alpha-Helical , Protein Conformation, beta-Strand , Protein Interaction Domains and Motifs , Protein Interaction Mapping , Protein Multimerization , Proteins/metabolism , Research Design , Structural Homology, Protein , Thermodynamics
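
The abstract above reports top 1 and top 10 docking success rates. A minimal sketch of how such a rate is commonly computed: a target counts as a success if at least one acceptable model appears among its top n ranked predictions. The function name `top_n_success_rate`, the target labels, and the data are hypothetical, and the acceptance criterion used in the study is not stated in the abstract.

```python
def top_n_success_rate(ranked_hits, n):
    """Fraction of targets with at least one acceptable model among the top n ranks."""
    if not ranked_hits:
        raise ValueError("need at least one target")
    successes = sum(1 for hits in ranked_hits.values() if any(hits[:n]))
    return successes / len(ranked_hits)

# Hypothetical data: for each target, True marks an acceptable model, listed in rank order.
example = {
    "target_A": [True, False, False],
    "target_B": [False, False, True],
    "target_C": [False, False, False],
    "target_D": [False, True, False],
}
top1 = top_n_success_rate(example, 1)  # 1 of 4 targets succeeds
top3 = top_n_success_rate(example, 3)  # 3 of 4 targets succeed
```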
15.
J Mol Biol ; 431(13): 2460-2466, 2019 06 14.
Article in English | MEDLINE | ID: mdl-31075275

ABSTRACT

PhyreRisk is an open-access, publicly accessible web application for interactively bridging genomic, proteomic and structural data facilitating the mapping of human variants onto protein structures. A major advance over other tools for sequence-structure variant mapping is that PhyreRisk provides information on 20,214 human canonical proteins and an additional 22,271 alternative protein sequences (isoforms). Specifically, PhyreRisk provides structural coverage (partial or complete) for 70% (14,035 of 20,214 canonical proteins) of the human proteome, by storing 18,874 experimental structures and 84,818 pre-built models of canonical proteins and their isoforms generated using our in-house Phyre2. PhyreRisk reports 55,732 experimentally, multi-validated protein interactions from IntAct and 24,260 experimental structures of protein complexes. Another major feature of PhyreRisk is that, rather than presenting a limited set of precomputed variant-structure mappings of known genetic variants, it allows the user to explore novel variants using, as input, genomic coordinate formats (Ensembl, VCF, reference SNP ID and HGVS notations) and Human Build GRCh37 and GRCh38. PhyreRisk also supports mapping variants using amino acid coordinates and searching for genes or proteins of interest. PhyreRisk is designed to empower researchers to translate genetic data into protein structural information, thereby providing a more comprehensive appreciation of the functional impact of variants. PhyreRisk is freely available at http://phyrerisk.bc.ic.ac.uk.


Subject(s)
Computational Biology/methods , Genetic Variation , Proteins/chemistry , Genomics , Humans , Protein Conformation , Proteins/genetics , Proteins/metabolism , Proteomics , Software
16.
Curr Opin Struct Biol ; 55: 59-65, 2019 04.
Article in English | MEDLINE | ID: mdl-30999240

ABSTRACT

Structural modeling of a cell is an evolving strategic direction in computational structural biology. It takes advantage of new powerful modeling techniques, deeper understanding of fundamental principles of molecular structure and assembly, and rapid growth of the amount of structural data generated by experimental techniques. Key modeling approaches to principal types of macromolecular assemblies in a cell already exist. The main challenge, along with the further development of these modeling approaches, is putting them together in a consistent, unified whole cell model. This opinion piece addresses the fundamental aspects of modeling macromolecular assemblies in a cell, and the state-of-the-art in modeling of the principal types of such assemblies.


Subject(s)
Computational Biology/methods , Macromolecular Substances/chemistry , Models, Molecular , Molecular Structure
17.
Proteins ; 87(3): 245-253, 2019 03.
Article in English | MEDLINE | ID: mdl-30520123

ABSTRACT

Structural characterization of protein-protein interactions is essential for our ability to study life processes at the molecular level. Computational modeling of protein complexes (protein docking) is important as the source of their structure and as a way to understand the principles of protein interaction. Rapidly evolving comparative docking approaches utilize target/template similarity metrics, which are often based on the protein structure. Although the structural similarity, generally, yields good performance, other characteristics of the interacting proteins (e.g., function, biological process, and localization) may improve the prediction quality, especially in the case of weak target/template structural similarity. For the ranking of a pool of models for each target, we tested scoring functions that quantify similarity of Gene Ontology (GO) terms assigned to target and template proteins in three ontology domains: biological process, molecular function, and cellular component (GO-score). The scoring functions were tested in docking of bound, unbound, and modeled proteins. The results indicate that the combined structural and GO-terms functions improve the scoring, especially in the twilight zone of structural similarity, typical for protein models of limited accuracy.


Subject(s)
Computational Biology , Gene Ontology , Protein Conformation , Proteins/genetics , Binding Sites/genetics , Databases, Protein , Humans , Models, Molecular , Molecular Docking Simulation , Protein Binding/genetics , Protein Interaction Mapping , Protein Interaction Maps/genetics , Proteins/chemistry , Software , Structural Homology, Protein
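
The abstract above combines structural similarity with a similarity of GO terms between target and template. The sketch below shows one simple way such a combination could look, using the Jaccard index of two GO-term sets and a weighted sum. This is an assumption for illustration: the abstract does not define the GO-score, and the names `jaccard`, `combined_score`, the `weight` parameter, and the example term sets are all hypothetical. The GO identifiers are used only as opaque labels.

```python
def jaccard(terms_a, terms_b):
    """Jaccard index of two collections of GO term identifiers."""
    a, b = set(terms_a), set(terms_b)
    if not a and not b:
        return 0.0  # no annotation on either side: treat as no evidence of similarity
    return len(a & b) / len(a | b)

def combined_score(structural_score, target_go, template_go, weight=0.5):
    """Weighted sum of a structural similarity score in [0, 1] and a GO-term similarity."""
    go_score = jaccard(target_go, template_go)
    return (1.0 - weight) * structural_score + weight * go_score

# Hypothetical annotations; the identifiers are arbitrary labels for this example.
target_go = {"GO:0005515", "GO:0005737", "GO:0005634"}
template_go = {"GO:0005515", "GO:0005737", "GO:0016020"}
score = combined_score(0.4, target_go, template_go, weight=0.5)
```

With a low structural score, as in the twilight zone mentioned in the abstract, the GO term contributes a larger share of the ranking signal, which is the intuition behind combining the two.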
18.
J Comput Chem ; 39(24): 2012-2021, 2018 09 15.
Article in English | MEDLINE | ID: mdl-30226647

ABSTRACT

Protein-protein docking procedures typically perform the global scan of the proteins' relative positions, followed by the local refinement of the putative matches. Because of the size of the search space, the global scan is usually implemented as rigid-body search, using computationally inexpensive intermolecular energy approximations. An adequate refinement has to take into account structural flexibility. Since the refinement performs conformational search of the interacting proteins, it is extremely computationally challenging, given the enormous amount of the internal degrees of freedom. Different approaches limit the search space by restricting the search to the side chains, rotameric states, coarse-grained structure representation, principal normal modes, and so on. Still, even with the approximations, the refinement presents an extreme computational challenge due to the very large number of the remaining degrees of freedom. Given the complexity of the search space, the advantage of the exhaustive search is obvious. The obstacle to such search is computational feasibility. However, the growing computational power of modern computers, especially due to the increasing utilization of Graphics Processing Units (GPUs) with large numbers of specialized computing cores, extends the ranges of applicability of the brute-force search methods. This proof-of-concept study demonstrates computational feasibility of an exhaustive search of side-chain conformations in protein docking. The procedure, implemented on the GPU architecture, was used to generate the optimal conformations in a large representative set of protein-protein complexes. © 2018 Wiley Periodicals, Inc.


Subject(s)
Algorithms , Computational Biology , Protein Conformation , Proteins/chemistry , Feasibility Studies , Protein Binding
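
The abstract above concerns exhaustive (brute-force) search over side-chain conformations. As a toy CPU-only illustration of the idea, the sketch below enumerates every combination of rotamers and keeps the lowest-energy one. The rotamer angles, the scoring function `toy_energy`, and the name `exhaustive_search` are invented for this example; the study's GPU implementation and energy function are not described in the abstract.

```python
from itertools import product

def exhaustive_search(rotamers, energy):
    """Return the lowest-energy assignment by enumerating every rotamer combination."""
    best_combo, best_energy = None, float("inf")
    for combo in product(*rotamers):
        e = energy(combo)
        if e < best_energy:
            best_combo, best_energy = combo, e
    return best_combo, best_energy

# Hypothetical toy system: three residues, rotamers given as side-chain angles in degrees.
toy_rotamers = [(-60, 60, 180), (-60, 180), (60, 180)]

def toy_energy(combo):
    # Invented score: penalize neighbouring residues that adopt the same angle,
    # plus a small term that favours angles of small magnitude.
    clashes = sum(1 for a, b in zip(combo, combo[1:]) if a == b)
    return clashes + 0.01 * sum(abs(x) for x in combo)

best_combo, best_energy = exhaustive_search(toy_rotamers, toy_energy)
```

The number of combinations is the product of the rotamer counts per residue (3 x 2 x 2 = 12 here), which grows exponentially with the number of residues. Each combination is scored independently, which is what makes this kind of search a natural fit for massively parallel hardware.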
19.
Biophys J ; 115(5): 809-821, 2018 09 04.
Article in English | MEDLINE | ID: mdl-30122295

ABSTRACT

The energy function is the key component of protein modeling methodology. This work presents a semianalytical approach to the development of contact potentials for protein structure modeling. Residue-residue and atom-atom contact energies were derived by maximizing the probability of observing native sequences in a nonredundant set of protein structures. The optimization task was formulated as an inverse statistical mechanics problem applied to the Potts model. Its solution by pseudolikelihood maximization provides consistent estimates of coupling constants at atomic and residue levels. The best performance was achieved when interacting atoms were grouped according to their physicochemical properties. For individual protein structures, the performance of the contact potentials in distinguishing near-native structures from the decoys is similar to the top-performing scoring functions. The potentials also yielded significant improvement in the protein docking success rates. The potentials recapitulated experimentally determined protein stability changes upon point mutations and protein-protein binding affinities. The approach offers a different perspective on knowledge-based potentials and may serve as the basis for their further development.


Subject(s)
Models, Molecular , Proteins/chemistry , Proteins/metabolism , Likelihood Functions , Point Mutation , Protein Conformation , Protein Stability , Proteins/genetics , Thermodynamics
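
The abstract above formulates the derivation of contact energies as an inverse Potts problem solved by pseudolikelihood maximization. The block below writes out a generic form of that objective as a sketch. The notation is an assumption made for this illustration and is not taken from the paper: $a = (a_1, \dots, a_L)$ is a sequence, $c_{ij} \in \{0, 1\}$ marks whether positions $i$ and $j$ are in contact in the structure, and $\varepsilon(a, b)$ is the contact energy between types $a$ and $b$ in units of $k_B T$.

```latex
% Potts-type probability of a sequence given the contact map of a structure
P(a) = \frac{1}{Z} \exp\Big( -\sum_{i<j} c_{ij}\, \varepsilon(a_i, a_j) \Big)

% Conditional probability of the type at position i given all other positions
P(a_i \mid a_{\setminus i}) =
  \frac{\exp\big( -\sum_{j \neq i} c_{ij}\, \varepsilon(a_i, a_j) \big)}
       {\sum_{b} \exp\big( -\sum_{j \neq i} c_{ij}\, \varepsilon(b, a_j) \big)}

% Log-pseudolikelihood summed over structures s in the training set,
% maximized with respect to the contact energies
\ln \mathrm{PL}(\varepsilon) = \sum_{s} \sum_{i=1}^{L_s} \ln P\big(a^{(s)}_i \mid a^{(s)}_{\setminus i}\big)
```

The conditional form avoids the partition function $Z$, whose sum over all sequences is intractable; each denominator runs only over the possible types $b$ at a single position.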
20.
J Comput Aided Mol Des ; 32(7): 769-779, 2018 07.
Article in English | MEDLINE | ID: mdl-30003468

ABSTRACT

Modulating protein interaction pathways may lead to the cure of many diseases. Known protein-protein inhibitors bind to large pockets on the protein-protein interface. Such large pockets are detected also in the protein-protein complexes without known inhibitors, making such complexes potentially druggable. The inhibitor-binding site is primarily defined by the side chains that form the largest pocket in the protein-bound conformation. Low-resolution ligand docking shows that the success rate for the protein-bound conformation is close to the one for the ligand-bound conformation, and significantly higher than for the apo conformation. The conformational change on the protein interface upon binding to the other protein results in a pocket employed by the ligand when it binds to that interface. This proof-of-concept study suggests that rather than using computational pocket-opening procedures, one can opt for an experimentally determined structure of the target co-crystallized protein-protein complex as a starting point for drug design.


Subject(s)
Molecular Docking Simulation , Proteins/antagonists & inhibitors , Proteins/chemistry , Binding Sites , Crystallization , Databases, Protein , Drug Design , Ligands , Proof of Concept Study , Protein Binding , Protein Conformation