Results 1 - 14 of 14
1.
Article in English | MEDLINE | ID: mdl-38739515

ABSTRACT

Inductive bias in machine learning (ML) is the set of assumptions describing how a model makes predictions. Different ML-based methods for protein-ligand binding affinity (PLA) prediction have different inductive biases, leading to different levels of generalization capability and interpretability. Intuitively, the inductive bias of an ML-based model for PLA prediction should fit in with biological mechanisms relevant for binding to achieve good predictions with meaningful reasons. To this end, we propose an interaction-based inductive bias to restrict neural networks to functions relevant for binding with two assumptions: (1) A protein-ligand complex can be naturally expressed as a heterogeneous graph with covalent and non-covalent interactions; (2) The predicted PLA is the sum of pairwise atom-atom affinities determined by non-covalent interactions. The interaction-based inductive bias is embodied by an explainable heterogeneous interaction graph neural network (EHIGN) for explicitly modeling pairwise atom-atom interactions to predict PLA from 3D structures. Extensive experiments demonstrate that EHIGN achieves better generalization capability than other state-of-the-art ML-based baselines in PLA prediction and structure-based virtual screening. More importantly, comprehensive analyses of distance-affinity, pose-affinity, and substructure-affinity relations suggest that the interaction-based inductive bias can guide the model to learn atomic interactions that are consistent with physical reality. As a case study to demonstrate practical usefulness, our method is tested for predicting the efficacy of Nirmatrelvir against SARS-CoV-2 variants. EHIGN successfully recognizes the changes in the efficacy of Nirmatrelvir for different SARS-CoV-2 variants with meaningful reasons.
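
Assumption (2) above, that the predicted affinity is a sum of pairwise atom-atom terms, can be illustrated with a minimal sketch. The linear distance term, the 5.0 angstrom cutoff, and the function names are hypothetical stand-ins chosen for illustration; in EHIGN the pairwise terms are learned by the network, and this is not the authors' implementation.

```python
import math

def pair_affinity(distance, cutoff=5.0):
    """Toy pairwise term (an assumption): closer contacts contribute more."""
    if distance >= cutoff:
        return 0.0
    return 1.0 - distance / cutoff

def predicted_affinity(ligand_atoms, protein_atoms, cutoff=5.0):
    """Sum pairwise ligand-protein terms; also return each pair's contribution."""
    contributions = {}
    for i, (_, lig_xyz) in enumerate(ligand_atoms):
        for j, (_, prot_xyz) in enumerate(protein_atoms):
            value = pair_affinity(math.dist(lig_xyz, prot_xyz), cutoff)
            if value > 0.0:
                contributions[(i, j)] = value
    return sum(contributions.values()), contributions

# Hypothetical atoms as (element, (x, y, z)) with coordinates in angstroms.
ligand = [("C", (0.0, 0.0, 0.0))]
protein = [("O", (0.0, 0.0, 2.5)), ("N", (0.0, 0.0, 10.0))]
total, parts = predicted_affinity(ligand, protein)
```

Because the total is a plain sum, each entry of `parts` can be read as the share of the prediction attributed to one atom pair, which is the kind of decomposition that makes such a model inspectable.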

2.
Nat Commun ; 14(1): 3009, 2023 May 25.
Article in English | MEDLINE | ID: mdl-37230985

ABSTRACT

Retrosynthesis planning, the process of identifying a set of available reactions to synthesize the target molecules, remains a major challenge in organic synthesis. Recently, computer-aided synthesis planning has gained renewed interest, and various retrosynthesis prediction algorithms based on deep learning have been proposed. However, most existing methods are limited in the applicability and interpretability of their predictions, and predictive accuracy still needs to improve to reach a more practical level. In this work, inspired by the arrow-pushing formalism in chemical reaction mechanisms, we present an end-to-end architecture for retrosynthesis prediction called Graph2Edits. Specifically, Graph2Edits uses a graph neural network to predict the edits of the product graph in an auto-regressive manner, and sequentially generates transformation intermediates and final reactants according to the predicted edit sequence. This strategy combines the two-stage processes of semi-template-based methods into one-pot learning, improving applicability to some complicated reactions and making the predictions more interpretable. Evaluated on the standard benchmark dataset USPTO-50k, our model achieves state-of-the-art performance for semi-template-based retrosynthesis with a promising 55.1% top-1 accuracy.
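
The idea of generating intermediates by applying a sequence of graph edits can be sketched as follows. This is only an illustration under stated assumptions: the edit vocabulary (`delete_bond`, `change_bond`, `terminate`) and the bond-dictionary representation are hypothetical, and the edit list is supplied by hand, whereas in Graph2Edits each edit is predicted by a graph neural network conditioned on the current graph.

```python
def apply_edits(bonds, edits):
    """Apply edits in order and return the intermediate graphs (last one is final).

    bonds maps frozenset({atom_a, atom_b}) to a bond order.
    """
    current = dict(bonds)
    intermediates = []
    for edit in edits:
        kind = edit[0]
        if kind == "terminate":
            break  # stop generating, as an auto-regressive decoder would
        if kind == "delete_bond":
            _, a, b = edit
            del current[frozenset((a, b))]
        elif kind == "change_bond":
            _, a, b, order = edit
            current[frozenset((a, b))] = order
        else:
            raise ValueError(f"unknown edit type: {kind}")
        intermediates.append(dict(current))
    return intermediates

# Hypothetical three-atom product graph: 0-1 single bond, 1=2 double bond.
product = {frozenset((0, 1)): 1, frozenset((1, 2)): 2}
steps = apply_edits(
    product,
    [("delete_bond", 0, 1), ("change_bond", 1, 2, 1), ("terminate",)],
)
```

Each element of `steps` is one transformation intermediate, so the full edit trajectory can be inspected rather than only the final reactant graph.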

3.
Article in English | MEDLINE | ID: mdl-37028032

ABSTRACT

Finding candidate molecules with favorable pharmacological activity, low toxicity, and proper pharmacokinetic properties is an important task in drug discovery. Deep neural networks have made impressive progress in accelerating and improving drug discovery. However, these techniques rely on a large amount of labeled data to form accurate predictions of molecular properties. At each stage of the drug discovery pipeline, usually only a small amount of biological data on candidate molecules and derivatives is available, so applying deep neural networks to low-data drug discovery remains a formidable challenge. Here, we propose a meta learning architecture with a graph attention network, Meta-GAT, to predict molecular properties in low-data drug discovery. The GAT captures the local effects of atomic groups at the atom level through the triple attentional mechanism and implicitly captures the interactions between different atomic groups at the molecular level. GAT is used to perceive the molecular chemical environment and connectivity, thereby effectively reducing sample complexity. Meta-GAT further develops a meta learning strategy based on bilevel optimization, which transfers meta knowledge from other attribute prediction tasks to low-data target tasks. In summary, our work demonstrates how meta learning can reduce the amount of data required to make meaningful predictions of molecules in low-data scenarios. Meta learning is likely to become the new learning paradigm in low-data drug discovery. The source code is publicly available at: https://github.com/lol88/Meta-GAT.
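
The bilevel structure of meta learning (an inner loop that adapts to one task and an outer loop that updates the shared initialization) can be sketched on a toy problem. The example below swaps in a first-order, MAML-style update on scalar linear regression; the learning rates, task data, and function names are assumptions for illustration, and it does not reproduce Meta-GAT or its graph attention network.

```python
def mse_grad(w, data):
    """Gradient of the mean squared error of the model y = w * x with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in data) / len(data)

def adapt(w, support, inner_lr=0.1, steps=1):
    """Inner loop: a few gradient steps on one task's small support set."""
    for _ in range(steps):
        w = w - inner_lr * mse_grad(w, support)
    return w

def meta_train(tasks, w=0.0, inner_lr=0.1, outer_lr=0.05, epochs=200):
    """Outer loop: update the shared initialization using query-set gradients."""
    for _ in range(epochs):
        for support, query in tasks:
            w_task = adapt(w, support, inner_lr)
            # First-order approximation: query gradient taken at the adapted weight.
            w = w - outer_lr * mse_grad(w_task, query)
    return w

# Two hypothetical low-data tasks, each given as (support set, query set).
tasks = [
    ([(1.0, 2.0)], [(2.0, 4.0)]),  # underlying slope 2
    ([(1.0, 4.0)], [(2.0, 8.0)]),  # underlying slope 4
]
w_meta = meta_train(tasks)
w_adapted = adapt(w_meta, tasks[0][0])
```

The learned initialization `w_meta` sits between the two task optima, and a single inner step on one support example moves it toward that task's slope, which is the sense in which knowledge from related tasks reduces the data needed for a new one.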

4.
Article in English | MEDLINE | ID: mdl-37022856

ABSTRACT

Drug-drug interactions (DDIs) trigger unexpected pharmacological effects in vivo, often with unknown causal mechanisms. Deep learning methods have been developed to better understand DDIs. However, learning domain-invariant representations for DDIs remains a challenge. Generalizable DDI predictions are closer to reality than source domain predictions. For existing methods, it is difficult to achieve out-of-distribution (OOD) predictions. In this article, focusing on substructure interaction, we propose DSIL-DDI, a pluggable substructure interaction module that can learn domain-invariant representations of DDIs from the source domain. We evaluate DSIL-DDI in three scenarios: the transductive setting (all drugs in the test set appear in the training set), the inductive setting (the test set contains new drugs that were not present in the training set), and the OOD generalization setting (the training set and test set belong to two different datasets). The results demonstrate that DSIL-DDI improves the generalization and interpretability of DDI prediction modeling and provides valuable insights for OOD DDI predictions. DSIL-DDI can help doctors ensure the safety of drug administration and reduce the harm caused by drug abuse.
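
The difference between the transductive and inductive settings comes down to how drug pairs are split. A minimal sketch, assuming pairs are tuples of drug identifiers and using hypothetical helper names (this is not the authors' splitting code):

```python
def inductive_split(pairs, new_drugs):
    """Hold out every pair that involves at least one 'new' drug."""
    new_drugs = set(new_drugs)
    train = [p for p in pairs if not (set(p) & new_drugs)]
    test = [p for p in pairs if set(p) & new_drugs]
    return train, test

def is_transductive(train, test):
    """True when every drug in the test pairs also appears in the training pairs."""
    seen = {drug for pair in train for drug in pair}
    return all(drug in seen for pair in test for drug in pair)

# Hypothetical drug identifiers.
pairs = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "D")]
train, test = inductive_split(pairs, new_drugs={"D"})
```

Here drug `D` never appears in `train`, so a model evaluated on `test` must generalize to an unseen drug. The OOD setting goes one step further by drawing `train` and `test` from different datasets altogether.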

5.
J Phys Chem Lett ; 14(8): 2020-2033, 2023 Mar 02.
Article in English | MEDLINE | ID: mdl-36794930

ABSTRACT

Predicting protein-ligand binding affinities (PLAs) is a core problem in drug discovery. Recent advances have shown great potential in applying machine learning (ML) for PLA prediction. However, most of them omit the 3D structures of complexes and physical interactions between proteins and ligands, which are considered essential to understanding the binding mechanism. This paper proposes a geometric interaction graph neural network (GIGN) that incorporates 3D structures and physical interactions for predicting protein-ligand binding affinities. Specifically, we design a heterogeneous interaction layer that unifies covalent and noncovalent interactions into the message passing phase to learn node representations more effectively. The heterogeneous interaction layer also follows fundamental biological laws, including invariance to translations and rotations of the complexes, thus avoiding expensive data augmentation strategies. GIGN achieves state-of-the-art performance on three external test sets. Moreover, by visualizing learned representations of protein-ligand complexes, we show that the predictions of GIGN are biologically meaningful.
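
Invariance to translations and rotations means that moving the whole complex rigidly in space must not change the prediction. One common way to obtain this, shown here purely as an illustration and not as a description of GIGN's layer, is to build features from interatomic distances, which are unchanged by rigid motion. The coordinates and function names below are hypothetical.

```python
import math

def pairwise_distances(coords):
    """All interatomic distances, in a fixed pair order."""
    return [math.dist(a, b) for i, a in enumerate(coords) for b in coords[i + 1:]]

def rotate_z_and_translate(coords, angle, shift):
    """Rigid motion: rotate about the z axis by angle (radians), then translate."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
        for x, y, z in coords
    ]

# Hypothetical atom coordinates in angstroms.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (1.0, 1.0, 3.0)]
moved = rotate_z_and_translate(coords, angle=0.7, shift=(5.0, -3.0, 2.0))
before = pairwise_distances(coords)
after = pairwise_distances(moved)
```

The two distance lists agree up to floating-point rounding, so any model that depends only on such quantities needs no augmentation with rotated or shifted copies of the training complexes.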


Subject(s)
Neural Networks, Computer , Proteins , Ligands , Protein Binding , Proteins/chemistry , Machine Learning
6.
Chem Sci ; 13(29): 8693-8703, 2022 Jul 29.
Article in English | MEDLINE | ID: mdl-35974769

ABSTRACT

Drug-drug interactions (DDIs) can trigger unexpected pharmacological effects on the body, and the causal mechanisms are often unknown. Graph neural networks (GNNs) have been developed to better understand DDIs. However, identifying key substructures that contribute most to the DDI prediction is a challenge for GNNs. In this study, we presented a substructure-aware graph neural network, a message passing neural network equipped with a novel substructure attention mechanism and a substructure-substructure interaction module (SSIM) for DDI prediction (SA-DDI). Specifically, the substructure attention was designed to capture size- and shape-adaptive substructures based on the chemical intuition that the sizes and shapes are often irregular for functional groups in molecules. DDIs are fundamentally caused by chemical substructure interactions. Thus, the SSIM was used to model the substructure-substructure interactions by highlighting important substructures while de-emphasizing the minor ones for DDI prediction. We evaluated our approach in two real-world datasets and compared the proposed method with the state-of-the-art DDI prediction models. The SA-DDI surpassed other approaches on the two datasets. Moreover, the visual interpretation results showed that the SA-DDI was sensitive to the structure information of drugs and was able to detect the key substructures for DDIs. These advantages demonstrated that the proposed method improved the generalization and interpretation capability of DDI prediction modeling.

7.
Chem Sci ; 13(3): 816-833, 2022 Jan 19.
Article in English | MEDLINE | ID: mdl-35173947

ABSTRACT

Predicting drug-target affinity (DTA) is beneficial for accelerating drug discovery. Graph neural networks (GNNs) have been widely used in DTA prediction. However, existing shallow GNNs are insufficient to capture the global structure of compounds. Besides, the interpretability of graph-based DTA models relies heavily on the graph attention mechanism, which cannot reveal the global relationship between each atom of a molecule. In this study, we proposed a deep multiscale graph neural network based on chemical intuition for DTA prediction (MGraphDTA). We introduced a dense connection into the GNN and built a super-deep GNN with 27 graph convolutional layers to capture the local and global structure of the compound simultaneously. We also developed a novel visual explanation method, gradient-weighted affinity activation mapping (Grad-AAM), to analyze a deep learning model from the chemical perspective. We evaluated our approach using seven benchmark datasets and compared the proposed method to the state-of-the-art deep learning (DL) models. MGraphDTA significantly outperforms other DL-based approaches on various datasets. Moreover, we show that Grad-AAM creates explanations that are consistent with the judgment of pharmacologists, which may help us gain chemical insights directly from data beyond human perception. These advantages demonstrate that the proposed method improves the generalization and interpretation capability of DTA prediction modeling.
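
A dense connection means that each layer receives the concatenation of the input and all earlier layers' outputs, so early (local) features remain directly available to deep layers. The sketch below shows only this connectivity pattern on a flat feature vector; the `toy_layer` function and the feature sizes are assumptions, and no graph convolution or message passing is implemented.

```python
def dense_block(x, layers):
    """Feed each layer the concatenation of the input and all previous outputs."""
    features = [list(x)]
    for layer in layers:
        concatenated = [v for f in features for v in f]
        features.append(layer(concatenated))
    return [v for f in features for v in f]

def toy_layer(inputs):
    """Stand-in for a learned layer: emits two summary features (growth rate 2)."""
    return [sum(inputs) / len(inputs), max(inputs)]

# 27 layers, matching the depth quoted in the abstract; the input has 4 features.
x = [0.1, 0.2, 0.3, 0.4]
out = dense_block(x, [toy_layer] * 27)
```

With 4 input features and 2 new features per layer, the final representation has 4 + 27 * 2 = 58 entries, and the original input is still present unchanged at the front of `out`.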

8.
Phys Chem Chem Phys ; 24(9): 5383-5393, 2022 Mar 02.
Article in English | MEDLINE | ID: mdl-35169821

ABSTRACT

Predicting quantum mechanical properties (QMPs) is important for innovation in materials science and chemistry. Multitask deep learning models have been widely used in QMP prediction. However, existing multitask learning models often train multiple QMP prediction tasks simultaneously without considering the internal relationships and differences between tasks, which may cause the model to overfit easy tasks. In this study, we first proposed a multiscale dynamic attention graph neural network (MDGNN) for molecular representation learning. The MDGNN was designed in a multitask learning fashion that can solve multiple learning tasks at the same time. We then introduced a dynamic task balancing (DTB) strategy combining task differences and difficulties to reduce overfitting across multiple tasks. Finally, we adopted gradient-weighted class activation mapping (Grad-CAM) to analyze a deep learning model for frontier molecular orbital, highest occupied molecular orbital (HOMO) and lowest unoccupied molecular orbital (LUMO) energy level predictions. We evaluated our approach using two large QMP datasets and compared the proposed method to the state-of-the-art multitask learning models. The MDGNN outperforms other multitask learning approaches on two datasets. The DTB strategy can further improve the performance of MDGNN significantly. Moreover, we show that Grad-CAM creates explanations that are consistent with molecular orbital theory. These advantages demonstrate that the proposed method improves the generalization and interpretation capability of QMP prediction modeling.
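
The abstract does not give the DTB formula, so the following is only a generic illustration of loss-based task reweighting, with every name and the softmax form being assumptions: tasks whose loss has dropped the least relative to their starting loss (the harder ones) receive larger weights, which counteracts overfitting to easy tasks.

```python
import math

def task_weights(current_losses, initial_losses, temperature=1.0):
    """Weight tasks by a softmax over relative remaining loss; weights sum to n."""
    ratios = [c / i for c, i in zip(current_losses, initial_losses)]
    exps = [math.exp(r / temperature) for r in ratios]
    total = sum(exps)
    return [len(ratios) * e / total for e in exps]

def weighted_loss(losses, weights):
    """Combined multitask objective for one training step."""
    return sum(w * l for w, l in zip(weights, losses))

# Hypothetical losses: task 0 is nearly solved, task 1 is still hard.
weights = task_weights(current_losses=[0.1, 0.8], initial_losses=[1.0, 1.0])
combined = weighted_loss([0.1, 0.8], weights)
```

Recomputing `weights` every epoch makes the balance dynamic: as the hard task catches up, its ratio falls and the weights drift back toward uniform.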


Subject(s)
Deep Learning , Machine Learning , Neural Networks, Computer
9.
Brief Bioinform ; 22(6), 2021 Nov 05.
Article in English | MEDLINE | ID: mdl-34428290

ABSTRACT

With the rapid development of proteomics and the rapid increase of target molecules for drug action, computer-aided drug design (CADD) has become a basic task in drug discovery. One of the key challenges in CADD is molecular representation. High-quality molecular representation with chemical intuition helps to advance many frontier problems of drug discovery. At present, molecular representation still faces several urgent problems, such as the polysemy of substructures and unsmooth information flow between atomic groups. In this research, we propose a deep contextualized Bi-LSTM architecture, Mol2Context-vec, which can integrate different levels of internal states to produce dynamic representations of molecular substructures. The resulting molecular context representation can capture the interactions between any atomic groups, especially a pair of atomic groups that are topologically distant. Experiments show that Mol2Context-vec achieves state-of-the-art performance on multiple benchmark datasets. In addition, the visual interpretation of Mol2Context-vec is very close to the structural properties of chemical molecules as understood by humans. These advantages indicate that Mol2Context-vec can be used as a reliable and effective tool for molecular representation. Availability: The source code is available for download at https://github.com/lol88/Mol2Context-vec.


Subject(s)
Cheminformatics/methods , Deep Learning , Drug Design/methods , Drug Discovery/methods , Algorithms , Humans , Models, Molecular , Quantum Theory , Structure-Activity Relationship
10.
J Mol Graph Model ; 107: 107965, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34167067

ABSTRACT

Because Limk1 is a promising drug target and few inhibitors with good Limk1/ROCK2 selectivity have been reported, potential and selective Limk1 inhibitors with novel scaffolds are urgently needed to develop new treatments for the related diseases. Here, we utilized molecular docking to screen potential Limk1 compounds from a Traditional Chinese Medicine (TCM) database. Meanwhile, we applied a three-dimensional graph convolutional network (3DGCN), based on the 3D molecular graph, to predict inhibitory activity against Limk1 and ROCK2. Compared with the baseline models (RF, GCN and Weave), the 3DGCN achieved higher accuracy, and the averaged RMSE values on the test sets for Limk1 and ROCK2 were 0.721 and 0.852, respectively. With 3DGCN, more than 80% of the test-set molecules from both datasets were predicted within an absolute error of 1.0, and the feature visualization suggested that it could automatically learn relevant structural features, including 3D molecular information, from a specific task for prediction. Furthermore, molecular dynamics (MD) simulations within 100 ns were employed to verify the stability of ligand-protein complexes and reveal the binding modes of the potential selective lead compounds of Limk1. Finally, integrating the docking results, the values predicted by the 3DGCN and the MD analysis, we found that 7549 and 2007_15649 might be potential and selective inhibitors of the Limk1 receptor.
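
The two test-set figures quoted above (RMSE and the share of molecules predicted within an absolute error of 1.0) are straightforward to compute. A minimal sketch with made-up activity values follows; the function names and numbers are illustrative and are not data from the study.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between measured and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def fraction_within(y_true, y_pred, tolerance=1.0):
    """Share of predictions whose absolute error is at most the tolerance."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= tolerance)
    return hits / len(y_true)

# Hypothetical measured and predicted activities.
measured = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 6.0, 9.0, 7.5]
error = rmse(measured, predicted)
share = fraction_within(measured, predicted)
```

The two metrics are complementary: RMSE is dominated by the single large miss (7.0 versus 9.0), while the within-tolerance share reports that three of the four predictions are still close.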


Subject(s)
Molecular Dynamics Simulation , Ligands , Molecular Docking Simulation
11.
J Phys Chem Lett ; 12(17): 4247-4261, 2021 May 06.
Article in English | MEDLINE | ID: mdl-33904745

ABSTRACT

Deep learning (DL) provides opportunities for the identification of drug-target interactions (DTIs). The challenges of applying DL lie primarily with the lack of interpretability. Also, most of the existing DL-based methods formulate the drug and target encoder as two independent modules without considering the relationship between them. In this study, we propose a mutual learning mechanism to bridge the gap between the two encoders. We formulated the DTI problem from a global perspective by inserting mutual learning layers between the two encoders. The mutual learning layer was achieved by multihead attention and position-aware attention. The neural attention mechanism also provides effective visualization, which makes it easier to analyze a model. We evaluated our approach using three benchmark kinase data sets under different experimental settings and compared the proposed method to three baseline models. We found that the four methods yielded similar results in the random split setting (training and test sets share common drugs and targets), while the proposed method increases the predictive performance significantly in the orphan-target and orphan-drug split setting (training and test sets share only targets or drugs). The experimental results demonstrated that the proposed method improved the generalization and interpretation capability of DTI modeling.


Subject(s)
Deep Learning , Organic Chemicals/metabolism , Pharmaceutical Preparations/metabolism , Proteins/metabolism , Organic Chemicals/chemistry , Pharmaceutical Preparations/chemistry , Protein Binding , Proteins/chemistry
12.
Biomed Pharmacother ; 129: 110360, 2020 Sep.
Article in English | MEDLINE | ID: mdl-32559623

ABSTRACT

Several proteins, including S-nitrosoglutathione reductase (GSNOR), complement Factor D, complement 3b (C3b) and Protein Kinase R-like Endoplasmic Reticulum Kinase (PERK), have been demonstrated to be involved in pathogenesis pathways of Alzheimer's disease (AD) and are considered potential treatment targets for AD. Based on the concept of multitargets, a network pharmacology-based approach was employed to investigate potential Traditional Chinese Medicine (TCM) candidates that can dock well with the GSNOR, C3b, Factor D and PERK proteins. To predict the bioactivities of candidates, Artificial Intelligence (AI) algorithms comprising seven machine learning algorithms and a deep learning model were applied to validate the docking results. Furthermore, in this study, we propose a novel combined method for efficiently exploring the predicted results of AI algorithms. In addition, comparative molecular field analysis (CoMFA) and comparative similarity indices analysis (CoMSIA) were performed to construct predictive models. The results show that the squared correlation coefficients (R2) of the models are almost all higher than 0.75, and the models also perform well on the test set. Moreover, the binding stability of the potential inhibitors was evaluated using 100 ns of MD simulation. Collectively, this study suggests that the herbs Ardisia japonica, Ligusticum chuanxiong, Lippia nodiflora and Mirabilis jalapa, containing 2,2'-[benzene-1,4-diylbis(methanediyloxybenzene-4,1-diyl)]bis(oxoacetic acid), Glyasperin B, Nodifloridin A, Miraxanthin III and l-Valine-l-valine anhydride, might form a potential medicine formula for AD.


Subject(s)
Alzheimer Disease/drug therapy , Artificial Intelligence , Brain/drug effects , Computer-Aided Design , Drug Design , Drug Discovery , Nootropic Agents/pharmacology , Plant Extracts/pharmacology , Alzheimer Disease/enzymology , Alzheimer Disease/physiopathology , Alzheimer Disease/psychology , Animals , Brain/enzymology , Brain/physiopathology , Cognition/drug effects , Humans , Molecular Docking Simulation , Molecular Dynamics Simulation , Molecular Structure , Molecular Targeted Therapy , Nootropic Agents/chemistry , Plant Extracts/chemistry , Quantitative Structure-Activity Relationship , Signal Transduction
13.
Front Neurorobot ; 14: 617327, 2020.
Article in English | MEDLINE | ID: mdl-33414713

ABSTRACT

Neuroinflammation is a common factor in neurodegenerative diseases, and it has been demonstrated that galectin-3 activates microglia and astrocytes, leading to inflammation. This means that inhibition of galectin-3 may become a new strategy for the treatment of neurodegenerative diseases. With this motivation, the objective of this study is to explore a new integrated approach for finding lead compounds that inhibit galectin-3 by combining universal artificial intelligence algorithms with traditional drug screening methods. Using molecular docking, potential compounds with high binding affinity were screened from a Chinese medicine database. Multiple artificial intelligence algorithms were applied to validate the docking results and further screen compounds. Among all the predictive methods involved, the deep learning-based algorithm made 500 modeling attempts, and the squared correlation coefficient of the best trained model on the test sets was 0.9. The XGBoost model reached a squared correlation coefficient of 0.97 and a mean squared error of only 0.01. We then switched to the ZINC database and performed the same experiment; the results showed that the compounds in the former database had stronger affinity. Finally, we further verified through molecular dynamics simulation that the complex composed of the candidate ligand and the target protein showed stable binding within 100 ns of simulation time. In summary, with the help of artificial intelligence algorithms, we found that the active ingredients 1,2-Dimethylbenzene and Typhic acid, contained in Crataegus pinnatifida and Typha angustata, might be effective inhibitors for neurodegenerative diseases. The high prediction accuracy of the models shows that the approach has practical value on small-sample data sets such as those in drug screening.
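
The correlation and error figures quoted above can be computed as in the sketch below. It assumes the coefficient is the squared Pearson correlation between measured and predicted values, which is one common reading of the term; the abstract does not state the exact definition, and the numbers used here are made up.

```python
def squared_correlation(y_true, y_pred):
    """Squared Pearson correlation between measured and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov * cov / (var_t * var_p)

def mean_squared_error(y_true, y_pred):
    """Average squared difference between measured and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical values: predictions are a perfect linear function of the truth.
r2 = squared_correlation([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
mse = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
```

The first example shows why both metrics are usually reported together: predictions that are perfectly correlated with the measurements can still be far off in absolute terms, which only the error metric reveals.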

14.
Angew Chem Int Ed Engl ; 58(22): 7390-7394, 2019 May 27.
Article in English | MEDLINE | ID: mdl-30958916

ABSTRACT

The Daphniphyllum alkaloids are a structurally fascinating and remarkably diverse family of natural products. General strategies for the chemical synthesis of their challenging architectures are highly desirable for efficiently accessing these intriguing alkaloids and addressing their pharmaceutical potential. Herein, a concise strategy designed to provide general and diversifiable access to various Daphniphyllum alkaloids is described and utilized in the asymmetric synthesis of (-)-himalensine A, which was accomplished in 14 steps. Key features of this strategy include a Cu-catalyzed nitrile hydration, a Heck reaction to construct the challenging 2-azabicyclo[3.3.1]nonane motif, a Meinwald rearrangement reaction, six pot-economic reactions, and the minimal use of protecting groups, which significantly improved the overall synthetic efficiency.


Subject(s)
Alkaloids/chemical synthesis , Biological Products/chemical synthesis , Catalysis , Molecular Structure , Stereoisomerism