Results 1 - 10 of 10
1.
J Chem Inf Model ; 64(7): 2612-2623, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-38157481

ABSTRACT

Structure-based drug discovery is a process for both hit finding and optimization that relies on a validated three-dimensional model of a target biomolecule, used to rationalize the structure-function relationship for this particular target. An ultralarge virtual screening approach has emerged recently for rapid discovery of high-affinity hit compounds, but it requires substantial computational resources. This study shows that active learning with simple linear regression models can accelerate virtual screening, retrieving up to 90% of the top 1% of the docking hit list after docking just 10% of the ligands. The results demonstrate that it is unnecessary to use complex models, such as deep learning approaches, to predict the imprecise results of ligand docking with a low sampling depth. Furthermore, we explore active learning meta-parameters and find that constant batch size models with a simple ensembling method provide the best ligand retrieval rate. Finally, our approach is validated on an ultralarge virtual screening data set, retrieving 70% of the top 0.05% of ligands after screening only 2% of the library. Altogether, this work provides a computationally accessible approach for accelerated virtual screening that can serve as a blueprint for the future design of low-compute agents for exploration of the chemical space via large-scale accelerated docking. With recent breakthroughs in protein structure prediction, this method can significantly increase accessibility for the academic community and aid in the rapid discovery of high-affinity hit compounds for various targets.


Subject(s)
Drug Discovery , Protein Binding , Molecular Docking Simulation , Ligands
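
The active learning loop described in the abstract of record 1 can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names `active_learning_screen` and `top_recall`, the synthetic random descriptors, the greedy selection rule, and the use of a single least-squares model (the abstract reports that a simple ensembling method worked best, which this sketch does not implement). Lower scores are assumed to mean better docking results.

```python
import numpy as np

def active_learning_screen(features, dock, budget_frac=0.10, n_rounds=5, seed=0):
    """Greedy active learning with a constant batch size (illustrative only).

    features: (n_ligands, n_features) descriptor matrix (hypothetical input).
    dock: callable mapping an index array to docking scores (lower = better).
    Returns the indices of all docked ligands and their scores.
    """
    rng = np.random.default_rng(seed)
    n = features.shape[0]
    batch = max(1, int(round(n * budget_frac / n_rounds)))
    X = np.hstack([features, np.ones((n, 1))])  # append an intercept column
    docked = rng.choice(n, size=batch, replace=False)  # random first batch
    scores = dock(docked)
    for _ in range(n_rounds - 1):
        # Fit a plain linear regression on everything docked so far.
        w, *_ = np.linalg.lstsq(X[docked], scores, rcond=None)
        pred = X @ w
        pred[docked] = np.inf  # never select an already docked ligand
        picked = np.argsort(pred)[:batch]  # best predicted scores first
        docked = np.concatenate([docked, picked])
        scores = np.concatenate([scores, dock(picked)])
    return docked, scores

def top_recall(docked, true_scores, top_frac=0.01):
    """Fraction of the true top `top_frac` ligands that were docked."""
    k = max(1, int(round(len(true_scores) * top_frac)))
    top = set(np.argsort(true_scores)[:k].tolist())
    return len(top.intersection(docked.tolist())) / k

# Synthetic stand-in for a docking campaign (assumed data, not real scores).
rng = np.random.default_rng(1)
n_ligands, n_features = 5000, 16
features = rng.normal(size=(n_ligands, n_features))
true_scores = features @ rng.normal(size=n_features) + 0.1 * rng.normal(size=n_ligands)

docked, _ = active_learning_screen(features, lambda idx: true_scores[idx])
recall = top_recall(docked, true_scores)
print(f"docked {len(docked)} of {n_ligands} ligands, top-1% recall = {recall:.2f}")
```

On this toy data the scores are linear in the descriptors by construction, so the recall says nothing about real docking libraries; the sketch only shows the shape of the dock, fit, select cycle.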
2.
Bioinformatics ; 37(16): 2332-2339, 2021 Aug 25.
Article in English | MEDLINE | ID: mdl-33620450

ABSTRACT

MOTIVATION: Effective use of evolutionary information has recently led to tremendous progress in computational prediction of three-dimensional (3D) structures of proteins and their complexes. Despite the progress, the accuracy of predicted structures tends to vary considerably from case to case. Since the utility of computational models depends on their accuracy, reliable estimates of deviation between predicted and native structures are of utmost importance. RESULTS: For the first time, we present a deep convolutional neural network (CNN) constructed on a Voronoi tessellation of 3D molecular structures. Despite the irregular data domain, our data representation allows us to efficiently introduce both convolution and pooling operations and train the network in an end-to-end fashion without precomputed descriptors. The resultant model, VoroCNN, predicts local qualities of 3D protein folds. The prediction results are competitive with the state of the art and superior to the previous 3D CNN architectures built for the same task. We also discuss practical applications of VoroCNN, for example, in recognition of protein binding interfaces. AVAILABILITY AND IMPLEMENTATION: The model, data and evaluation tests are available at https://team.inria.fr/nano-d/software/vorocnn/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

3.
Bioinformatics ; 37(7): 943-950, 2021 May 17.
Article in English | MEDLINE | ID: mdl-32840574

ABSTRACT

MOTIVATION: Despite the progress made in studying protein-ligand interactions and the widespread application of docking and affinity prediction tools, improving their precision and efficiency remains a challenge. Computational approaches based on the scoring of docking conformations with statistical potentials constitute a popular alternative to more accurate but costly physics-based thermodynamic sampling methods. In this context, a minimalist and fast sidechain-free knowledge-based potential with high docking and screening power can be very useful when screening a large number of putative docking conformations. RESULTS: Here, we present a novel coarse-grained potential defined by a 3D joint probability distribution function that depends only on the pairwise orientation and position between protein backbone and ligand atoms. Despite its extreme simplicity, our approach yields results that are very competitive with state-of-the-art scoring functions, especially in docking and screening tasks. For example, we observed a twofold improvement in the median 5% enrichment factor on the DUD-E benchmark compared to AutoDock Vina results. Moreover, our results prove that a coarse sidechain-free potential is sufficient for very successful docking pose prediction. AVAILABILITY AND IMPLEMENTATION: The standalone version of KORP-PL with the corresponding tests and benchmarks is available at https://team.inria.fr/nano-d/korp-pl/ and https://chaconlab.org/modeling/korp-pl. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Proteins , Software , Knowledge Bases , Ligands , Molecular Docking Simulation , Protein Binding , Proteins/metabolism
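
The 5% enrichment factor cited in the abstract of record 3 is commonly defined as the hit rate among the top-ranked 5% of a screened library divided by the hit rate of the whole library. The sketch below assumes that common definition and assumes that lower scores rank better; the function name `enrichment_factor` and the toy data are hypothetical and do not come from KORP-PL or the DUD-E benchmark.

```python
import numpy as np

def enrichment_factor(scores, is_active, fraction=0.05):
    """Hit rate in the top `fraction` of the ranking over the overall hit rate.

    Assumes lower scores are better and that at least one ligand is active.
    """
    scores = np.asarray(scores, dtype=float)
    is_active = np.asarray(is_active, dtype=bool)
    n_top = max(1, int(round(fraction * len(scores))))
    order = np.argsort(scores)  # best-scored ligands first
    hit_rate_top = is_active[order[:n_top]].mean()
    hit_rate_all = is_active.mean()
    return hit_rate_top / hit_rate_all

# Toy library: 100 ligands, 10 actives, 4 of them ranked in the top 5.
toy_scores = np.arange(100, dtype=float)
toy_active = np.zeros(100, dtype=bool)
toy_active[[0, 1, 2, 3, 50, 51, 52, 53, 54, 55]] = True

ef5 = enrichment_factor(toy_scores, toy_active, fraction=0.05)
print(f"EF5% = {ef5:.1f}")  # (4/5) / (10/100) = 8.0
```

A benchmark-level figure such as a median enrichment factor would then be obtained by computing this value per target and taking `np.median` over the targets.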
4.
Int J Mol Sci ; 21(20), 2020 Oct 16.
Article in English | MEDLINE | ID: mdl-33081390

ABSTRACT

The spread of multidrug-resistant (MDR) strains of Mycobacterium tuberculosis (Mtb), one of the most harmful pathogens, creates a need for new effective drugs. SQ109 showed activity against resistant Mtb and has already advanced to Phase II/III clinical trials. Fast SQ109 degradation is attributed to the human liver Cytochrome P450s (CYPs). However, no information is available about interactions of the drug with Mtb CYPs. Here, we show that Mtb CYP124, previously assigned as a methyl-branched lipid monooxygenase, binds and hydroxylates SQ109 in vitro. A 1.25 Å-resolution crystal structure of the CYP124-SQ109 complex unambiguously shows two conformations of the drug, both positioned for hydroxylation of the ω-methyl group in the trans position. The hydroxylated SQ109 presumably forms stabilizing H-bonds with its target, Mycobacterial membrane protein Large 3 (MmpL3). We anticipate that Mtb CYPs could function as analogs of drug-metabolizing human CYPs affecting pharmacokinetics and pharmacodynamics of antitubercular (anti-TB) drugs.


Subject(s)
Adamantane/analogs & derivatives , Antitubercular Agents/chemistry , Cytochrome P-450 Enzyme System/chemistry , Ethylenediamines/chemistry , Molecular Docking Simulation , Mycobacterium tuberculosis/enzymology , Adamantane/chemistry , Adamantane/pharmacology , Antitubercular Agents/pharmacology , Binding Sites , Cytochrome P-450 Enzyme System/metabolism , Ethylenediamines/pharmacology , Hydroxylation , Protein Binding
5.
J Comput Aided Mol Des ; 34(2): 191-200, 2020 Feb.
Article in English | MEDLINE | ID: mdl-31784861

ABSTRACT

The D3R Grand Challenge 4 provided a brilliant opportunity to test macrocyclic docking protocols on diverse, high-quality experimental data. We participated in both pose and affinity prediction exercises. Overall, we aimed to use an automated structure-based docking pipeline built around a set of tools developed in our team. This exercise again demonstrated the crucial importance of correct local ligand geometry for the overall success of docking. Starting from the second part of the pose prediction stage, we developed a stable pipeline for sampling macrocycle conformers. This resulted in subangstrom average precision of our pose predictions. In the affinity prediction exercise we obtained average results. However, we could improve these when using docking poses submitted by the best predictors. Our docking tools, including the Convex-PL scoring function, are available at https://team.inria.fr/nano-d/software/.


Subject(s)
Drug Design , Macrocyclic Compounds/pharmacology , Molecular Docking Simulation , Proteins/metabolism , Binding Sites , Databases, Protein , Humans , Ligands , Macrocyclic Compounds/chemistry , Protein Binding , Protein Conformation , Proteins/chemistry , Software
6.
Proteins ; 87(12): 1200-1221, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31612567

ABSTRACT

We present the results for CAPRI Round 46, the third joint CASP-CAPRI protein assembly prediction challenge. The Round comprised a total of 20 targets including 14 homo-oligomers and 6 heterocomplexes. Eight of the homo-oligomer targets and one heterodimer comprised proteins that could be readily modeled using templates from the Protein Data Bank, often available for the full assembly. The remaining 11 targets comprised 5 homodimers, 3 heterodimers, and two higher-order assemblies. These were more difficult to model, as their prediction mainly involved "ab-initio" docking of subunit models derived from distantly related templates. A total of ~30 CAPRI groups, including 9 automatic servers, submitted on average ~2000 models per target. About 17 groups participated in the CAPRI scoring rounds, offered for most targets, submitting ~170 models per target. The prediction performance, measured by the fraction of models of acceptable quality or higher submitted across all predictor groups, was very good to excellent for the nine easy targets. Poorer performance was achieved by predictors for the 11 difficult targets, with medium and high quality models submitted for only 3 of these targets. A similar performance "gap" was displayed by scorer groups, highlighting yet again the unmet challenge of modeling the conformational changes of the protein components that occur upon binding or that must be accounted for in template-based modeling. Our analysis also indicates that residues in binding interfaces were less well predicted in this set of targets than in previous Rounds, providing useful insights for directions of future improvements.


Subject(s)
Computational Biology , Protein Conformation , Proteins/ultrastructure , Software , Algorithms , Binding Sites/genetics , Databases, Protein , Models, Molecular , Protein Binding/genetics , Protein Interaction Mapping , Proteins/chemistry , Proteins/genetics , Structural Homology, Protein
7.
J Comput Aided Mol Des ; 32(1): 151-162, 2018 Jan.
Article in English | MEDLINE | ID: mdl-28913782

ABSTRACT

The 2016 D3R Grand Challenge 2 provided an opportunity to test multiple protein-ligand docking protocols on a set of ligands bound to farnesoid X receptor, which has many available experimental structures. We participated in Stage 1 of the Challenge, devoted to docking pose prediction, and our submitted poses had a mean RMSD of 2.9 Å. Here we present a thorough analysis of our docking predictions made with AutoDock Vina and the Convex-PL rescoring potential by reproducing our submission protocol and running a series of additional molecular docking experiments. We conclude that a correct receptor structure, or more precisely, the structure of the binding pocket, plays the crucial role in the success of our docking studies. We have also noticed the important role of local ligand geometry, which seems not to be well discussed in the literature. We succeeded in improving our results to a mean RMSD of 2.15-2.33 Å, depending on the ligand models, when docking the ligands to all available homologous receptors. Overall, for docking ligands of diverse chemical series, we suggest docking each ligand to a set of multiple receptors that are homologous to the target.


Subject(s)
Drug Design , Drug Discovery , Molecular Docking Simulation , Receptors, Cytoplasmic and Nuclear/metabolism , Small Molecule Libraries/chemistry , Small Molecule Libraries/pharmacology , Binding Sites , Computer-Aided Design , Crystallography, X-Ray , Databases, Protein , Humans , Ligands , Protein Binding , Protein Conformation , Receptors, Cytoplasmic and Nuclear/chemistry , Software
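
The mean RMSD values quoted in the abstract of record 7 compare predicted ligand poses with experimentally determined ones. A minimal sketch of such a calculation follows, under stated assumptions: coordinates are already in the same receptor frame (no superposition is performed), both poses list the same atoms in the same order, and ligand symmetry is ignored. The function name `pose_rmsd` and the coordinates are hypothetical; the challenge's actual evaluation procedure may differ.

```python
import numpy as np

def pose_rmsd(predicted, reference):
    """Root-mean-square deviation between two poses, in the input units.

    Both arguments are (n_atoms, 3) coordinate arrays with matching atom order.
    """
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    if predicted.shape != reference.shape:
        raise ValueError("poses must have the same shape")
    squared_dist = np.sum((predicted - reference) ** 2, axis=1)  # per atom
    return float(np.sqrt(squared_dist.mean()))

# Toy example: a three-atom pose shifted by 1 Å along x.
reference_pose = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.5, 0.0]])
predicted_pose = reference_pose + np.array([1.0, 0.0, 0.0])

rmsd = pose_rmsd(predicted_pose, reference_pose)
print(f"RMSD = {rmsd:.2f} Å")  # every atom moved by exactly 1 Å
```

A mean RMSD over a submission would then be the average of this value across all ligands in the set.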
8.
J Comput Aided Mol Des ; 31(10): 943-958, 2017 Oct.
Article in English | MEDLINE | ID: mdl-28921375

ABSTRACT

We present a novel optimization approach to train a free-shape distance-dependent protein-ligand scoring function called Convex-PL. We do not impose any functional form of the scoring function. Instead, we decompose it into a polynomial basis and deduce the expansion coefficients from the structural knowledge base using a convex formulation of the optimization problem. Also, for the training set we do not generate false poses with molecular docking packages, but use constant RMSD rigid-body deformations of the ligands inside the binding pockets. This allows the obtained scoring function to be generally applicable to scoring of structural ensembles generated with different docking methods. We assess the Convex-PL scoring function using data from D3R Grand Challenge 2 submissions and the docking test of the CASF 2013 study. We demonstrate that our results outperform the other 20 methods previously assessed in CASF 2013. The method is available at http://team.inria.fr/nano-d/software/Convex-PL/ .


Subject(s)
Knowledge Bases , Machine Learning , Molecular Docking Simulation , Proteins/chemistry , Algorithms , Binding Sites , Databases, Protein , Drug Design , Humans , Ligands , Molecular Structure , Protein Binding , Protein Conformation , Structure-Activity Relationship
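
The idea in the abstract of record 8 of decomposing a distance-dependent scoring function into a polynomial basis can be sketched as follows. This is not the Convex-PL implementation: the choice of Legendre polynomials, the 10 Å cutoff `R_MAX`, the single atom-type pair, and the names `basis` and `score_pose` are all assumptions made for illustration. The point of the sketch is that the score is linear in the expansion coefficients, which is what makes a convex formulation of the training problem possible.

```python
import numpy as np
from numpy.polynomial import legendre

R_MAX = 10.0  # hypothetical distance cutoff in Å

def basis(distances, degree):
    """Evaluate Legendre polynomials P_0..P_degree at rescaled distances."""
    d = np.asarray(distances, dtype=float)
    x = 2.0 * np.clip(d, 0.0, R_MAX) / R_MAX - 1.0  # map [0, R_MAX] to [-1, 1]
    return legendre.legvander(x, degree)  # shape (n_distances, degree + 1)

def score_pose(distances, coeffs):
    """Sum over atom pairs of sum_k coeffs[k] * P_k(distance).

    distances: protein-ligand atom-pair distances for one atom-type pair.
    coeffs: expansion coefficients (in practice learned from structural data).
    """
    d = np.asarray(distances, dtype=float)
    d = d[d < R_MAX]  # pairs beyond the cutoff do not contribute
    coeffs = np.asarray(coeffs, dtype=float)
    # Summing the basis over pairs first shows the linearity in coeffs.
    return float(basis(d, len(coeffs) - 1).sum(axis=0) @ coeffs)

toy_distances = [2.0, 5.0, 12.0]  # the last pair lies beyond the cutoff
count_score = score_pose(toy_distances, [1.0, 0.0, 0.0])   # P_0 counts pairs
linear_score = score_pose(toy_distances, [0.0, 1.0, 0.0])  # P_1(x) = x
print(count_score, linear_score)
```

Because `score_pose` is a dot product between a pose-dependent feature vector and `coeffs`, requirements of the form "the native pose should score better than a deformed pose" become linear constraints on `coeffs`.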
9.
J Comput Aided Mol Des ; 30(9): 791-804, 2016 Sep.
Article in English | MEDLINE | ID: mdl-27718029

ABSTRACT

The 2015 D3R Grand Challenge provided an opportunity to test our new model for the binding free energy of small molecules, as well as to assess our protocol to predict binding poses for protein-ligand complexes. Our pose predictions were ranked 3-9 for the HSP90 dataset, depending on the assessment metric. For the MAP4K dataset the ranks are widely dispersed, ranging from 2 to 35 depending on the assessment metric, which does not provide any insight into the accuracy of the method. The main success of our pose prediction protocol was the re-scoring stage using the recently developed Convex-PL potential. We thoroughly analyze our docking predictions made with AutoDock Vina and discuss the effect of the choice of rigid receptor templates, the number of flexible residues in the binding pocket, the binding pocket size, and the benefits of re-scoring. However, the main challenge was to predict experimentally determined binding affinities for two blind test sets. Our affinity prediction model consisted of two terms, a pairwise-additive enthalpy and a non-pairwise-additive entropy. We trained the free parameters of the model with a regularized regression using affinity and structural data from the PDBBind database. Our model performed very well on the training set but failed on the two test sets. We explain the drawbacks and pitfalls of our model, in particular in terms of relative coverage of the test set by the training set and missed dynamical properties from crystal structures, and discuss different routes to improve it.


Subject(s)
HSP90 Heat-Shock Proteins/chemistry , Molecular Docking Simulation/methods , Binding Sites , Databases, Protein , Drug Design , Entropy , Humans , Ligands , Prospective Studies , Protein Binding , Protein Conformation , Regression Analysis , Structure-Activity Relationship , Thermodynamics
10.
J Chem Inf Model ; 56(8): 1410-1419, 2016 Aug 22.
Article in English | MEDLINE | ID: mdl-27405533

ABSTRACT

Here we address the problem of the assignment of atom types and bond orders in low molecular weight compounds. For this purpose, we have developed a prediction model based on nonlinear Support Vector Machines (SVM), implemented in a KNOwledge-Driven Ligand Extractor called Knodle, a software library for the recognition of atomic types, hybridization states, and bond orders in the structures of small molecules. We trained the model using an extensive amount of structural data collected from the PDBbindCN database. The accuracy of the results and the running time of our method are comparable with those of other popular methods, such as NAOMI, fconv, and I-interpret. On the popular Labute's benchmark set consisting of 179 protein-ligand complexes, Knodle makes five to six perception errors, NAOMI makes seven errors, I-interpret makes nine errors, and fconv makes 13 errors. On a larger set of 3,000 protein-ligand structures collected from the PDBBindCN general data set (v2014), Knodle and NAOMI have comparable accuracy, with error rates of approximately 3.9% and 4.7%, respectively; I-interpret made 6.0% errors, while fconv produced approximately 12.8% errors. On a more general set of 332,974 entries collected from the Ligand Expo database, Knodle made 4.5% errors. Overall, our study demonstrates the efficiency and robustness of nonlinear SVM in structure perception tasks. Knodle is available at https://team.inria.fr/nano-d/software/Knodle .


Subject(s)
Informatics/methods , Organic Chemicals/chemistry , Software , Support Vector Machine , Algorithms , Automation , Ligands , Molecular Weight