1.
J Chem Inf Model ; 58(8): 1625-1637, 2018 08 27.
Article in English | MEDLINE | ID: mdl-30036062

ABSTRACT

Water molecules are of great importance for the correct representation of ligand binding interactions. Throughout the last years, water molecules and their integration into drug design strategies have received increasing attention. Nowadays a variety of tools are available to place and score water molecules. However, the most frequently applied software solutions require substantial computational resources. In addition, none of the existing methods has been rigorously evaluated on the basis of a large number of diverse protein complexes. Therefore, we present a novel method for placing water molecules, called WarPP, based on interaction geometries previously derived from protein crystal structures. Using a large, previously compiled, high-quality validation set of almost 1500 protein-ligand complexes containing almost 20 000 crystallographically observed water molecules in their active sites, we validated our placement strategy. We correctly placed 80% of the water molecules within 1.0 Å of a crystallographically observed one.
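
The distance criterion quoted in this abstract can be illustrated with a minimal sketch. It is not the WarPP implementation: the function name `recovery_rate`, the toy coordinates, and the reading of the metric (an observed water counts as recovered if any predicted water lies within the cutoff) are assumptions made only for illustration.

```python
import math

def recovery_rate(observed, predicted, cutoff=1.0):
    """Fraction of observed waters with at least one predicted water within `cutoff` (in Å).

    Hypothetical helper: this is one possible reading of the metric,
    not necessarily the one used in the paper.
    """
    if not observed:
        return 0.0
    hits = sum(
        1
        for obs in observed
        if any(math.dist(obs, pred) <= cutoff for pred in predicted)
    )
    return hits / len(observed)

# Toy coordinates in Å (invented for illustration).
observed = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (0.0, 5.0, 0.0)]
predicted = [(0.3, 0.4, 0.0), (5.0, 0.0, 2.0)]
rate = recovery_rate(observed, predicted)
```

Only the first observed water has a predicted position within 1.0 Å here, so one of three waters counts as recovered.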


Subject(s)
Proteins/chemistry , Water/chemistry , Binding Sites , Databases, Protein , Ligands , Models, Molecular , Protein Conformation , Thermodynamics
2.
J Chem Inf Model ; 57(9): 2132-2142, 2017 09 25.
Article in English | MEDLINE | ID: mdl-28891648

ABSTRACT

Noncovalent interactions play an important role in macromolecular complexes. The assessment of molecular interactions is often based on knowledge derived from statistics on structural data. Within the last years, the available data in the Brookhaven Protein Data Bank has increased dramatically, quantitatively as well as qualitatively. This development allows the derivation of enhanced interaction models and motivates new ways of data analysis. Here, we present a method to facilitate the analysis of noncovalent interactions enabling detailed insights into the nature of molecular interactions. The method is integrated into a highly variable framework enabling the adaption to user-specific requirements. NAOMInova, the user interface for our method, allows the generation of specific statistics with respect to the chemical environment of substructures. The substructures as well as the analyzed set of protein structures can be chosen arbitrarily. Although NAOMInova was primarily made for data exploration in protein-ligand crystal structures, it can be used in combination with any structure collection, for example, analysis of a carbonyl in the neighborhood of an aromatic ring on a set of structures resulting from a MD simulation. Additionally, a filter for different atom attributes can be applied including the experimental support by electron density for single atoms. In this publication, we present the underlying algorithmic techniques of our method and show application examples that demonstrate NAOMInova's ability to support individual analysis of noncovalent interactions in protein structures. NAOMInova is available at http://www.zbh.uni-hamburg.de/naominova .


Subject(s)
Computational Biology/methods , Macromolecular Substances/chemistry , Macromolecular Substances/metabolism , User-Computer Interface , Computer Graphics , Models, Molecular , Molecular Conformation
3.
J Biotechnol ; 261: 207-214, 2017 Nov 10.
Article in English | MEDLINE | ID: mdl-28610996

ABSTRACT

Nowadays, computational approaches are an integral part of life science research. Problems related to interpretation of experimental results, data analysis, or visualization tasks highly benefit from the achievements of the digital era. Simulation methods facilitate predictions of physicochemical properties and can assist in understanding macromolecular phenomena. Here, we will give an overview of the methods developed in our group that aim at supporting researchers from all life science areas. Based on state-of-the-art approaches from structural bioinformatics and cheminformatics, we provide software covering a wide range of research questions. Our all-in-one web service platform ProteinsPlus (http://proteins.plus) offers solutions for pocket and druggability prediction, hydrogen placement, structure quality assessment, ensemble generation, protein-protein interaction classification, and 2D-interaction visualization. Additionally, we provide a software package that contains tools targeting cheminformatics problems like file format conversion, molecule data set processing, SMARTS editing, fragment space enumeration, and ligand-based virtual screening. Furthermore, it also includes structural bioinformatics solutions for inverse screening, binding site alignment, and searching interaction patterns across structure libraries. The software package is available at http://software.zbh.uni-hamburg.de.


Subject(s)
Computational Biology , Internet , Software , Databases, Protein
4.
Nucleic Acids Res ; 45(W1): W337-W343, 2017 07 03.
Article in English | MEDLINE | ID: mdl-28472372

ABSTRACT

With currently more than 126 000 publicly available structures and an increasing growth rate, the Protein Data Bank constitutes a rich data source for structure-driven research in fields like drug discovery, crop science and biotechnology in general. Typical workflows in these areas involve manifold computational tools for the analysis and prediction of molecular functions. Here, we present the ProteinsPlus web server that offers a unified easy-to-use interface to a broad range of tools for the early phase of structure-based molecular modeling. This includes solutions for commonly required pre-processing tasks like structure quality assessment (EDIA), hydrogen placement (Protoss) and the search for alternative conformations (SIENA). Beyond that, it also addresses frequent problems such as the generation of 2D-interaction diagrams (PoseView), protein-protein interface classification (HyPPI) as well as automatic pocket detection and druggability assessment (DoGSiteScorer). The unified ProteinsPlus interface covering all featured approaches provides various facilities for intuitive input and result visualization, case-specific parameterization and download options for further processing. Moreover, its generalized workflow allows the user a quick familiarization with the different tools. ProteinsPlus also stores the calculated results temporarily for future request and thus facilitates convenient result communication and re-access. The server is freely available at http://proteins.plus.


Subject(s)
Protein Conformation , Software , Binding Sites , Hydrogen/chemistry , Internet , Ligands , Models, Molecular , Protein Interaction Mapping , Proteins/chemistry
5.
Proteins ; 85(8): 1550-1566, 2017 Aug.
Article in English | MEDLINE | ID: mdl-28486771

ABSTRACT

Reliable computational prediction of protein side chain conformations and the energetic impact of amino acid mutations are the key aspects for the optimization of biotechnologically relevant enzymatic reactions using structure-based design. By improving the protein stability, higher yields can be achieved. In addition, tuning the substrate selectivity of an enzymatic reaction by directed mutagenesis can lead to higher turnover rates. This work presents a novel approach to predict the conformation of a side chain mutation along with the energetic effect on the protein structure. The HYDE scoring concept applied here describes the molecular interactions primarily by evaluating the effect of dehydration and hydrogen bonding on molecular structures in aqueous solution. Here, we evaluate its capability of side-chain conformation prediction in classic remutation experiments. Furthermore, we present a new data set for evaluating "cross-mutations," a new experiment that resembles real-world application scenarios more closely. This data set consists of protein pairs with up to five point mutations. Thus, structural changes are attributed to point mutations only. In the cross-mutation experiment, the original protein structure is mutated with the aim to predict the structure of the side chain as in the paired mutated structure. The comparison of side chain conformation prediction ("remutation") showed that the performance of HYDEprotein is qualitatively comparable to state-of-the-art methods. The ability of HYDEprotein to predict the energetic effect of a mutation is evaluated in the third experiment. Herein, the effect on protein stability is predicted correctly in 70% of the evaluated cases. Proteins 2017; 85:1550-1566. © 2017 Wiley Periodicals, Inc.


Subject(s)
Amino Acids/chemistry , Point Mutation , Water/chemistry , beta-Glucosidase/chemistry , Amino Acid Substitution , Amino Acids/genetics , Desiccation , Humans , Hydrogen Bonding , Protein Conformation, alpha-Helical , Protein Conformation, beta-Strand , Protein Stability , Software , Solutions , Structure-Activity Relationship , Thermodynamics , beta-Glucosidase/genetics
6.
J Med Chem ; 60(10): 4245-4257, 2017 05 25.
Article in English | MEDLINE | ID: mdl-28497966

ABSTRACT

Protein-ligand interactions are the fundamental basis for molecular design in pharmaceutical research, biocatalysis, and agrochemical development. Especially hydrogen bonds are known to have special geometric requirements and therefore deserve a detailed analysis. In modeling approaches a more general description of hydrogen bond geometries, using distance and directionality, is applied. A first study of their geometries was performed based on 15 protein structures in 1982. Currently there are about 95 000 protein-ligand structures available in the PDB, providing a solid foundation for a new large-scale statistical analysis. Here, we report a comprehensive investigation of geometric and functional properties of hydrogen bonds. Out of 22 defined functional groups, eight are fully in accordance with theoretical predictions while 14 show variations from expected values. On the basis of these results, we derived interaction geometries to improve current computational models. It is expected that these observations will be useful in designing new chemical structures for biological applications.
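
A description of hydrogen bonds by distance and directionality can be sketched as follows. This is an illustrative example, not the analysis code of the study: the choice of the hydrogen-acceptor distance and the donor-hydrogen-acceptor angle as the two descriptors, the function name `hbond_geometry`, and the coordinates are assumptions.

```python
import math

def hbond_geometry(donor, hydrogen, acceptor):
    """Return (H...A distance in Å, D-H...A angle in degrees).

    Hypothetical helper using one common convention for hydrogen bond
    geometry; the study may use different descriptors.
    """
    distance = math.dist(hydrogen, acceptor)
    # Vectors pointing from the hydrogen to the donor and to the acceptor.
    to_donor = [d - h for d, h in zip(donor, hydrogen)]
    to_acceptor = [a - h for a, h in zip(acceptor, hydrogen)]
    dot = sum(x * y for x, y in zip(to_donor, to_acceptor))
    norm = math.hypot(*to_donor) * math.hypot(*to_acceptor)
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return distance, math.degrees(math.acos(cos_angle))

# A perfectly linear arrangement along the x axis (invented coordinates).
distance, angle = hbond_geometry((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0))
```

Collecting these two values over many observed donor-acceptor pairs would give the kind of distribution that a statistical analysis of interaction geometries works with.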


Subject(s)
Proteins/metabolism , Animals , Databases, Protein , Drug Discovery , Humans , Hydrogen Bonding , Ligands , Molecular Docking Simulation , Proteins/chemistry
7.
J Chem Inf Model ; 57(2): 148-158, 2017 02 27.
Article in English | MEDLINE | ID: mdl-28128948

ABSTRACT

Comparison of three-dimensional interaction patterns in large collections of protein-ligand interfaces is a key element for understanding protein-ligand interactions and supports various steps in the structure-based drug design process. Different methods exist that provide query systems to search for geometrical patterns in protein-ligand complexes. However, these tools do not meet all of the requirements, which are high query variability, an adjustable search set, and high retrieval speed. Here we present a new tool named PELIKAN that is able to search for a variety of geometrical queries in large protein structure collections in a reasonably short time. The data are stored in an SQLite database that can easily be constructed from any set of protein-ligand complexes. We present different test queries demonstrating the performance of the PELIKAN approach. Furthermore, two application scenarios show the usefulness of PELIKAN in structure-based design endeavors.
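
To make the idea of geometric queries over an SQLite database concrete, here is a small sketch using Python's built-in `sqlite3` module. The table layout (`atom` with `complex_id`, `name`, `part`, `element`, `x`, `y`, `z`), the function `find_pairs`, and all data are hypothetical; the abstract does not describe the actual PELIKAN schema or query language.

```python
import sqlite3

# Hypothetical schema: one row per atom, tagged as ligand or protein.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE atom (complex_id TEXT, name TEXT, part TEXT, "
    "element TEXT, x REAL, y REAL, z REAL)"
)
rows = [
    ("cplxA", "O1", "ligand", "O", 0.0, 0.0, 0.0),
    ("cplxA", "N7", "protein", "N", 2.9, 0.0, 0.0),
    ("cplxA", "N9", "protein", "N", 6.0, 0.0, 0.0),
    ("cplxB", "O2", "ligand", "O", 0.0, 0.0, 0.0),
    ("cplxB", "N3", "protein", "N", 0.0, 4.5, 0.0),
]
conn.executemany("INSERT INTO atom VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

def find_pairs(conn, ligand_element, protein_element, d_min, d_max):
    """Find ligand/protein atom pairs whose distance lies in [d_min, d_max] Å."""
    sql = """
        SELECT l.complex_id, l.name, p.name
        FROM atom AS l
        JOIN atom AS p ON l.complex_id = p.complex_id
        WHERE l.part = 'ligand' AND p.part = 'protein'
          AND l.element = ? AND p.element = ?
          AND (l.x - p.x) * (l.x - p.x)
            + (l.y - p.y) * (l.y - p.y)
            + (l.z - p.z) * (l.z - p.z) BETWEEN ? AND ?
        ORDER BY l.complex_id, l.name, p.name
    """
    # Compare squared distances so the query needs no square-root function.
    params = (ligand_element, protein_element, d_min ** 2, d_max ** 2)
    return conn.execute(sql, params).fetchall()

hits = find_pairs(conn, "O", "N", 2.5, 3.5)
```

Only the O1/N7 pair in the toy complex `cplxA` falls inside the 2.5 to 3.5 Å window. A production system would additionally need spatial indexing to reach high retrieval speed on large collections.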


Subject(s)
Data Mining/methods , Proteins/metabolism , Algorithms , Ligands , Models, Molecular , Protein Binding , Protein Conformation , Proteins/chemistry , Time Factors
9.
Mol Inform ; 35(11-12): 593-598, 2016 12.
Article in English | MEDLINE | ID: mdl-27870245

ABSTRACT

Structure-based drug design starts with the collection, preparation, and initial analysis of protein structures. With more than 115,000 structures publicly available in the Protein Data Bank (PDB), fully automated processes reliably performing these important preprocessing steps are needed. Several tools are available for these tasks; however, most of them do not address the special needs of scientists interested in protein-ligand interactions. In this paper, we summarize our research activities towards an automated processing pipeline from raw PDB data towards ready-to-use protein binding site ensembles. Starting from a single protein structure, the pipeline covers the following phases: Extracting structurally related binding sites from the PDB, aligning disconnected binding site sequences, resolving tautomeric forms and protonation, orienting hydrogens and flippable side-chains, structurally aligning the multitude of binding sites, and performing a reasonable reduction of ensemble structures. The pipeline, named SIENA, creates protein-structural ensembles for the analysis of protein flexibility, molecular design efforts like docking or de novo design within seconds. For the first time, we are able to process the whole PDB in order to create a large collection of protein binding site ensembles. SIENA is available as part of the ZBH ProteinsPlus webserver under http://proteinsplus.zbh.uni-hamburg.de.


Subject(s)
Binding Sites/physiology , Protein Binding/physiology , Proteins/chemistry , Proteins/metabolism , Databases, Protein , Drug Design , Ligands , Software
10.
J Comput Aided Mol Des ; 30(8): 583-94, 2016 08.
Article in English | MEDLINE | ID: mdl-27565795

ABSTRACT

Ligand-based virtual screening is a well-established method to find new lead molecules in today's drug discovery process. In order to be applicable in day-to-day practice, such methods have to face multiple challenges. The most important part is the reliability of the results, which can be shown and compared in retrospective studies. Furthermore, in the case of 3D methods, they need to provide biologically relevant molecular alignments of the ligands that can be further investigated by a medicinal chemist. Last but not least, they have to be able to screen large databases in reasonable time. Many algorithms for ligand-based virtual screening have been proposed in the past, most of them based on pairwise comparisons. Here, a new method is introduced called mRAISE. Based on structural alignments, it uses a descriptor-based bitmap search engine (RAISE) to achieve efficiency. Alignments created on the fly by the search engine get evaluated with an independent shape-based scoring function also used for ranking of compounds. The correct ranking as well as the alignment quality of the method are evaluated and compared to other state-of-the-art methods. On the commonly used Directory of Useful Decoys dataset, mRAISE achieves an average area under the ROC curve of 0.76, an average enrichment factor at 1% of 20.2 and an average hit rate at 1% of 55.5. With these results, mRAISE is always among the top performing methods with available data for comparison. To assess the quality of the alignments calculated by ligand-based virtual screening methods, we introduce a new dataset containing 180 prealigned ligands for 11 diverse targets. Within the top ten ranked conformations, the alignment closest to X-ray structure calculated with mRAISE has a root-mean-square deviation of less than 2.0 Å for 80.8% of alignment pairs and achieves a median of less than 2.0 Å for eight of the 11 cases.
The dataset used to rate the quality of the calculated alignments is freely available at http://www.zbh.uni-hamburg.de/mraise-dataset.html . The table of all PDB codes contained in the ensembles can be found in the supplementary material. The software tool mRAISE is freely available for evaluation purposes and academic use (see http://www.zbh.uni-hamburg.de/raise ).
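
The two ranking metrics named in this abstract, the area under the ROC curve and the enrichment factor, can be computed as in the sketch below. It is a generic illustration rather than the evaluation code of the study; the function names, the convention that higher scores rank first, the rounding of the top fraction, and the toy data are assumptions.

```python
def roc_auc(scores, labels):
    """AUC as the probability that an active outscores a decoy (ties count 0.5)."""
    actives = [s for s, is_active in zip(scores, labels) if is_active]
    decoys = [s for s, is_active in zip(scores, labels) if not is_active]
    wins = sum((a > d) + 0.5 * (a == d) for a in actives for d in decoys)
    return wins / (len(actives) * len(decoys))

def enrichment_factor(scores, labels, fraction):
    """Active rate in the top `fraction` of the ranking divided by the overall active rate."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(len(ranked) * fraction))  # assumption: round to nearest, at least 1
    actives_in_top = sum(is_active for _, is_active in ranked[:n_top])
    return (actives_in_top / n_top) / (sum(labels) / len(labels))

# Toy screen: 2 actives (label 1) among 10 compounds; higher score means better rank.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
auc = roc_auc(scores, labels)
ef_10 = enrichment_factor(scores, labels, 0.10)
```

With these toy values the top 10% contains one compound, which is active, so the enrichment over the 20% base rate is fivefold. The pairwise AUC shown here is quadratic in the number of compounds, which is fine for small examples but not for large screening libraries.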


Subject(s)
Drug Discovery , Software , Algorithms , Binding Sites , Databases, Chemical , Databases, Protein , Drug Discovery/methods , Humans , Ligands , Models, Molecular , Proteins/metabolism
11.
J Chem Inf Model ; 56(6): 1105-11, 2016 06 27.
Article in English | MEDLINE | ID: mdl-27227368

ABSTRACT

The accurate handling of different chemical file formats and the consistent conversion between them play important roles for calculations in complex cheminformatics workflows. Working with different cheminformatic tools often makes the conversion between file formats a mandatory step. Such a conversion might become a difficult task in cases where the information content substantially differs. This paper describes UNICON, an easy-to-use software tool for this task. The functionality of UNICON ranges from file conversion between standard formats SDF, MOL2, SMILES, PDB, and PDBx/mmCIF via the generation of 2D structure coordinates and 3D structures to the enumeration of tautomeric forms, protonation states, and conformer ensembles. For this purpose, UNICON bundles the key elements of the previously described NAOMI library in a single, easy-to-use command line tool.


Subject(s)
Informatics/methods , Small Molecule Libraries/chemistry , Software , Isomerism , Models, Molecular , Molecular Conformation , Protons
12.
J Chem Inf Model ; 56(1): 248-59, 2016 Jan 25.
Article in English | MEDLINE | ID: mdl-26759067

ABSTRACT

Structural flexibility of proteins has an important influence on molecular recognition and enzymatic function. In modeling, structure ensembles are therefore often applied as a valuable source of alternative protein conformations. However, their usage is often complicated by structural artifacts and inconsistent data annotation. Here, we present SIENA, a new computational approach for the automated assembly and preprocessing of protein binding site ensembles. Starting with an arbitrarily defined binding site in a single protein structure, SIENA searches for alternative conformations of the same or sequentially closely related binding sites. The method is based on an indexed database for identifying perfect k-mer matches and a recently published algorithm for the alignment of protein binding site conformations. Furthermore, SIENA provides a new algorithm for the interaction-based selection of binding site conformations which aims at covering all known ligand-binding geometries. Various experiments highlight that SIENA is able to generate comprehensive and well selected binding site ensembles improving the compatibility to both known and unconsidered ligand molecules. Starting with the whole PDB as data source, the computation time of the whole ensemble generation takes only a few seconds. SIENA is available via a Web service at www.zbh.uni-hamburg.de/siena .
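
The notion of an index for perfect k-mer matches can be illustrated with a small in-memory sketch. It does not reproduce the SIENA database: the function names, the dictionary-based index, and the example sequences are assumptions chosen for brevity.

```python
from collections import defaultdict

def build_kmer_index(sequences, k):
    """Map every k-mer to the set of (sequence id, start position) where it occurs."""
    index = defaultdict(set)
    for seq_id, seq in sequences.items():
        for start in range(len(seq) - k + 1):
            index[seq[start:start + k]].add((seq_id, start))
    return index

def find_matches(index, query, k):
    """Return, per sequence id, sorted (query position, sequence position) pairs of exact k-mer hits."""
    hits = defaultdict(list)
    for q_pos in range(len(query) - k + 1):
        for seq_id, s_pos in index.get(query[q_pos:q_pos + k], ()):
            hits[seq_id].append((q_pos, s_pos))
    return {seq_id: sorted(pairs) for seq_id, pairs in hits.items()}

# Invented sequences: the query fragment occurs in both, at the same offset.
sequences = {"A": "MKTAYIAK", "B": "GGTAYIGG"}
index = build_kmer_index(sequences, k=3)
matches = find_matches(index, "TAYI", k=3)
```

Because the index is built once, each query only costs a lookup per k-mer, which is the property that makes searching a large structure collection feasible in seconds.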


Subject(s)
Computational Biology/methods , Proteins/chemistry , Proteins/metabolism , Algorithms , Binding Sites , Data Mining , Databases, Protein , Models, Molecular , Protein Binding , Protein Conformation , Software , Substrate Specificity
14.
J Chem Inf Model ; 55(8): 1535-46, 2015 Aug 24.
Article in English | MEDLINE | ID: mdl-26268674

ABSTRACT

The classification of molecules with respect to their inhibiting, activating, or toxicological potential constitutes a central aspect in the field of cheminformatics. Often, a discriminative feature is needed to distinguish two different molecule sets. Besides physicochemical properties, substructures and chemical patterns belong to the descriptors most frequently applied for this purpose. As a commonly used example of this descriptor class, SMARTS strings represent a powerful concept for the representation and processing of abstract chemical patterns. While their usage facilitates a convenient way to apply previously derived classification rules on new molecule sets, the manual generation of useful SMARTS patterns remains a complex and time-consuming process. Here, we introduce SMARTSminer, a new algorithm for the automatic derivation of discriminative SMARTS patterns from preclassified molecule sets. Based on a specially adapted subgraph mining algorithm, SMARTSminer identifies structural features that are frequent in only one of the given molecule classes. In comparison to elemental substructures, it also supports the consideration of general and specific SMARTS features. Furthermore, SMARTSminer is integrated into an interactive pattern editor named SMARTSeditor. This allows for an intuitive visualization on the basis of the SMARTSviewer concept as well as interactive adaption and further improvement of the generated patterns. Additionally, a new molecular matching feature provides an immediate feedback on a pattern's matching behavior across the molecule sets. We demonstrate the utility of the SMARTSminer functionality and its integration into the SMARTSeditor software in several different classification scenarios.


Subject(s)
Algorithms , Drug Design , Pattern Recognition, Automated/methods , Software , Cyclooxygenase 1/metabolism , Cyclooxygenase 2/metabolism , Humans , Ligands , Models, Molecular , Molecular Structure , Phosphotransferases/metabolism , Protein Binding
15.
J Chem Inf Model ; 55(8): 1747-56, 2015 Aug 24.
Article in English | MEDLINE | ID: mdl-26098831

ABSTRACT

The usage of conformational ensembles constitutes a widespread technique for the consideration of protein flexibility in computational biology. When experimental structures are applied for this purpose, alignment techniques are usually required in dealing with structural deviations and annotation inconsistencies. Moreover, many application scenarios focus on protein ligand binding sites. Here, we introduce our new alignment algorithm ASCONA that has been specially geared to the problem of aligning multiple conformations of sequentially similar binding sites. Intense efforts have been directed to an accurate detection of highly flexible backbone deviations, multiple binding site matches within a single structure, and a reliable, but at the same time highly efficient, search algorithm. In contrast, most available alignment methods rather target other issues, e.g., the global alignment of distantly related proteins that share structurally conserved regions. For conformational ensembles, this might not only result in an overhead of computation time but could also affect the achieved accuracy, especially for more complicated cases such as highly flexible proteins. ASCONA was evaluated on a test set containing 1107 structures of 65 diverse proteins. In all cases, ASCONA was able to correctly align the binding site at an average alignment computation time of 4 ms per target. Furthermore, no false positive matches were observed when searching the same query sites in the structures of other proteins. ASCONA proved to cope with highly deviating backbone structures and to tolerate structural gaps and moderate mutation rates. ASCONA is available free of charge for academic use at http://www.zbh.uni-hamburg.de/ascona .


Subject(s)
Algorithms , Proteins/chemistry , Amino Acid Sequence , Animals , Binding Sites , Databases, Protein , Humans , Ligands , Molecular Docking Simulation , Protein Binding , Protein Conformation , Proteins/metabolism
16.
J Chem Inf Model ; 54(6): 1676-86, 2014 Jun 23.
Article in English | MEDLINE | ID: mdl-24851945

ABSTRACT

Computational target prediction for bioactive compounds is a promising field in assessing off-target effects. Structure-based methods not only predict off-targets, but, simultaneously, binding modes, which are essential for understanding the mode of action and rationally designing selective compounds. Here, we highlight the current open challenges of computational target prediction methods based on protein structures and show why inverse screening rather than sequential pairwise protein-ligand docking methods are needed. A new inverse screening method based on triangle descriptors is introduced: iRAISE (inverse Rapid Index-based Screening Engine). A Scoring Cascade considering the reference ligand as well as the ligand and active site coverage is applied to overcome interprotein scoring noise of common protein-ligand scoring functions. Furthermore, a statistical evaluation of a score cutoff for each individual protein pocket is used. The ranking and binding mode prediction capabilities are evaluated on different datasets and compared to inverse docking and pharmacophore-based methods. On the Astex Diverse Set, iRAISE ranks more than 35% of the targets to the first position and predicts more than 80% of the binding modes with a root-mean-square deviation (RMSD) accuracy of <2.0 Å. With a median computing time of 5 s per protein, large amounts of protein structures can be screened rapidly. On a test set with 7915 protein structures and 117 query ligands, iRAISE predicts the first true positive in a ranked list among the top eight ranks (median), i.e., among 0.28% of the targets.
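
The RMSD criterion used to judge binding mode predictions can be sketched as below. This is a simplified illustration, not the iRAISE evaluation code: it assumes both poses list the same atoms in the same order and already share a coordinate frame, so no superposition or symmetry handling is performed. The function name and coordinates are invented.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation in Å between two equally ordered coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

# Toy pose shifted by 1 Å along z relative to the reference (invented coordinates).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pose = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
deviation = rmsd(reference, pose)
is_correct = deviation < 2.0  # threshold quoted in the abstract
```

A pose counted as correct under the 2.0 Å threshold may still differ in individual interactions, which is why RMSD is usually reported together with ranking results.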


Subject(s)
Drug Design , Proteins/chemistry , Proteins/metabolism , Algorithms , Binding Sites , Databases, Protein , Ligands , Molecular Docking Simulation , Protein Binding , Protein Conformation , Software
17.
J Cheminform ; 6: 12, 2014.
Article in English | MEDLINE | ID: mdl-24694216

ABSTRACT

The calculation of hydrogen positions is a common preprocessing step when working with crystal structures of protein-ligand complexes. An explicit description of hydrogen atoms is generally needed in order to analyze the binding mode of particular ligands or to calculate the associated binding energies. Due to the large number of degrees of freedom resulting from different chemical moieties and the high degree of mutual dependence this problem is anything but trivial. In addition to an efficient algorithm to take care of the complexity resulting from complicated hydrogen bonding networks, a robust chemical model is needed to describe effects such as tautomerism and ionization consistently. We present a novel method for the placement of hydrogen coordinates in protein-ligand complexes which takes tautomers and protonation states of both protein and ligand into account. Our method generates the most probable hydrogen positions on the basis of an optimal hydrogen bonding network using an empirical scoring function. The high quality of our results could be verified by comparison to the manually adjusted Astex diverse set and a remarkably low rate of undesirable hydrogen contacts compared to other tools.
