Results 1 - 17 of 17
1.
Sci Rep ; 13(1): 19561, 2023 Nov 10.
Article in English | MEDLINE | ID: mdl-37949930

ABSTRACT

Machine learning (ML) algorithms are extensively used in pharmaceutical research. Most ML models have black-box character, thus preventing the interpretation of predictions. However, rationalizing model decisions is of critical importance if predictions are to aid in experimental design. Accordingly, in interdisciplinary research, there is growing interest in explaining ML models. Methods devised for this purpose are part of the explainable artificial intelligence (XAI) spectrum of approaches. In XAI, the Shapley value concept originating from cooperative game theory has become popular for identifying features determining predictions. The Shapley value concept has been adapted as a model-agnostic approach for explaining predictions. Since the computational time required for Shapley value calculations scales exponentially with the number of features used, local approximations such as Shapley additive explanations (SHAP) are usually required in ML. The support vector machine (SVM) algorithm is one of the most popular ML methods in pharmaceutical research and beyond. SVM models are often explained using SHAP. However, there is only limited correlation between SHAP and exact Shapley values, as previously demonstrated for SVM calculations using the Tanimoto kernel, which limits SVM model explanation. Since the Tanimoto kernel is a special kernel function mostly applied for assessing chemical similarity, we have developed the Shapley value-expressed radial basis function (SVERAD), a computationally efficient approach for the calculation of exact Shapley values for SVM models based upon radial basis function kernels that are widely applied in different areas. SVERAD is shown to produce meaningful explanations of SVM predictions.
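
Note: the exponential cost of exact Shapley values mentioned in this abstract can be illustrated with a brute-force calculation. The sketch below is not SVERAD; it enumerates every coalition of features for a small, hypothetical value function (`toy_value` and its weights are assumptions made for illustration only), which requires on the order of 2**n evaluations for n features.

```python
from itertools import combinations
from math import factorial


def exact_shapley(value_fn, n_features):
    """Exact Shapley values by enumerating all coalitions of the other features."""
    players = range(n_features)
    phi = [0.0] * n_features
    for i in players:
        others = [p for p in players if p != i]
        for size in range(len(others) + 1):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                # Marginal contribution of feature i when added to coalition s
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


def toy_value(coalition):
    """Hypothetical value function: additive terms plus one pairwise interaction."""
    weights = {0: 1.0, 1: 2.0, 2: 0.5}
    total = sum(weights[f] for f in coalition)
    if 0 in coalition and 1 in coalition:
        total += 1.0  # interaction between features 0 and 1
    return total


shapley_values = exact_shapley(toy_value, 3)
print(shapley_values)
```

In this toy case the interaction term is shared equally between features 0 and 1, and the values sum to the value of the full feature set, as the Shapley efficiency property requires.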

2.
iScience ; 25(9): 105023, 2022 Sep 16.
Article in English | MEDLINE | ID: mdl-36105596

ABSTRACT

The support vector machine (SVM) algorithm is popular in chemistry and drug discovery. SVM models have black box character. Their predictions can be interpreted through feature weighting or the model-agnostic Shapley additive explanations (SHAP) formalism that locally approximates Shapley values (SVs) originating from game theory. We introduce an algorithm termed SV-expressed Tanimoto similarity (SVETA) for the exact calculation of SVs to explain SVM models employing the Tanimoto kernel, the gold standard for the assessment of molecular similarity. For a model system, the exact calculation of SVs is demonstrated. In an SVM-based compound classification task from drug discovery, only a limited correlation between exact SV and SHAP values is observed, prohibiting the use of approximate values for rationalizing predictions. For exemplary test compounds, atom-based mapping of prioritized features delineates coherent substructures that closely resemble those obtained by analyzing independently derived random forest models, thus providing consistent explanations.
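
Note: for readers unfamiliar with the Tanimoto kernel named in this abstract, the following is a minimal sketch of the Tanimoto coefficient for binary fingerprints, represented here as sets of "on" feature indices. The fingerprints are made up for illustration, and returning 0.0 for two empty fingerprints is an assumed convention, not something stated in the abstract.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared features divided by features present in either."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # assumed convention for two empty fingerprints
    return len(a & b) / union


# Hypothetical fingerprints given as sets of feature indices
compound_a = {1, 2, 3, 5}
compound_b = {2, 3, 4}
similarity = tanimoto(compound_a, compound_b)
print(similarity)  # 2 shared features out of 5 distinct features
```

In a kernel-based SVM, a function of this form takes the place of the inner product between two compounds.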

3.
iScience ; 25(10): 105043, 2022 Oct 21.
Article in English | MEDLINE | ID: mdl-36134335

ABSTRACT

Graph neural networks (GNNs) recursively propagate signals along the edges of an input graph, integrate node feature information with graph structure, and learn object representations. Like other deep neural network models, GNNs have a notorious black box character. For GNNs, only a few approaches are available to rationalize model decisions. We introduce EdgeSHAPer, a generally applicable method for explaining GNN-based models. The approach is devised to assess edge importance for predictions. To this end, EdgeSHAPer makes use of the Shapley value concept from game theory. For proof-of-concept, EdgeSHAPer is applied to compound activity prediction, a central task in drug discovery. EdgeSHAPer's edge centricity is relevant for molecular graphs where edges represent chemical bonds. Combined with feature mapping, EdgeSHAPer produces intuitive explanations for compound activity predictions. Compared to a popular node-centric and another edge-centric GNN explanation method, EdgeSHAPer reveals higher resolution in differentiating features determining predictions and identifies minimal pertinent positive feature sets.
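
Note: the idea of assigning Shapley values to graph edges can be sketched with a generic permutation-sampling estimator. This is an illustration of the general concept only and is not claimed to be the EdgeSHAPer procedure; the edge list, the `toy_model` scoring function, and the parameter names are all hypothetical stand-ins for a trained GNN and a molecular graph.

```python
import random


def sampled_edge_shapley(edges, value_fn, n_permutations=200, seed=0):
    """Estimate per-edge Shapley values by averaging marginal contributions
    over random orderings in which edges are added to an empty graph."""
    rng = random.Random(seed)
    totals = {edge: 0.0 for edge in edges}
    for _ in range(n_permutations):
        order = list(edges)
        rng.shuffle(order)
        kept = set()
        previous = value_fn(frozenset(kept))
        for edge in order:
            kept.add(edge)
            current = value_fn(frozenset(kept))
            totals[edge] += current - previous  # marginal contribution of this edge
            previous = current
    return {edge: total / n_permutations for edge, total in totals.items()}


# Hypothetical molecular graph: edges stand for bonds between labeled atoms
edges = [("C1", "C2"), ("C2", "N3"), ("N3", "O4")]


def toy_model(kept_edges):
    """Stand-in for a model output: 1.0 only if both bonds of a motif are present."""
    motif = {("C1", "C2"), ("C2", "N3")}
    return 1.0 if motif <= kept_edges else 0.0


edge_importance = sampled_edge_shapley(edges, toy_model)
print(edge_importance)
```

Because each sampled ordering adds all edges once, the estimates always sum to the difference between the output for the full graph and the output for the empty graph.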

4.
Mol Inform ; 41(12): e2200190, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36002382

ABSTRACT

In drug discovery, polypharmacology encompasses the use of small molecules with defined multi-target activity and in vivo effects resulting from multi-target engagement. Multi-target compounds are often efficacious in the treatment of complex diseases involving target and pathway networks, but might also elicit unwanted side effects. Computational approaches such as target prediction or multi-target ligand design have been used to support polypharmacological drug discovery. In addition to efforts directed at the identification or design of new multi-target compounds, other computational investigations have aimed to differentiate such compounds from potential false-positives or explore the molecular basis of multi-target activities. Herein, a concise overview of the field is provided and recent advances in computational polypharmacology through machine learning are discussed.

5.
Biomolecules ; 12(4), 2022 Apr 08.
Article in English | MEDLINE | ID: mdl-35454147

ABSTRACT

Protein kinases are major drug targets. Most kinase inhibitors are directed against the adenosine triphosphate (ATP) cofactor binding site, which is largely conserved across the human kinome. Hence, such kinase inhibitors are often thought to be promiscuous. However, experimental evidence and activity data for publicly available kinase inhibitors indicate that this is not generally the case. We have investigated whether inhibitors of closely related human kinases with single- or multi-kinase activity can be differentiated on the basis of chemical structure. Therefore, a test system consisting of two distinct kinase triplets has been devised for which inhibitors with reported triple-kinase activities and corresponding single-kinase activities were assembled. Machine learning models derived on the basis of chemical structure distinguished between these multi- and single-kinase inhibitors with high accuracy. A model-independent explanatory approach was applied to identify structural features determining accurate predictions. For both kinase triplets, the analysis revealed decisive features contained in multi-kinase inhibitors. These features were found to be absent in corresponding single-kinase inhibitors, thus providing a rationale for successful machine learning. Mapping of features determining accurate predictions revealed that they formed coherent and chemically meaningful substructures that were characteristic of multi-kinase inhibitors compared with single-kinase inhibitors.


Subject(s)
Protein Kinase Inhibitors , Protein Kinases , Binding Sites , Humans , Machine Learning , Protein Kinase Inhibitors/chemistry , Protein Kinase Inhibitors/pharmacology , Protein Kinases/metabolism
6.
Sci Rep ; 11(1): 21594, 2021 Nov 03.
Article in English | MEDLINE | ID: mdl-34732806

ABSTRACT

Compounds with defined multi-target activity play an increasingly important role in drug discovery. Structural features that might be signatures of such compounds have mostly remained elusive thus far. We have explored the potential of explainable machine learning to uncover structural motifs that are characteristic of dual-target compounds. For a pharmacologically relevant target pair-based test system designed for our study, accurate prediction models were derived and the influence of molecular representation features of test compounds was quantified to explain the predictions. The analysis revealed small numbers of specific features whose presence in dual-target and absence in single-target compounds determined accurate predictions. These features formed coherent substructures in dual-target compounds. From computational analysis of specific feature contributions, structural motifs emerged that were confirmed to be signatures of different dual-target activities. Our findings demonstrate the ability of explainable machine learning to bridge between predictions and intuitive chemical analysis and reveal characteristic substructures of dual-target compounds.

7.
Future Sci OA ; 7(5): FSO685, 2021 Mar 11.
Article in English | MEDLINE | ID: mdl-34046190

ABSTRACT

AIM: Providing compound data sets for promiscuity analysis with single-target (ST) and multi-target (MT) activity, taking confirmed inactivity against targets into account. METHODOLOGY: Compounds and target annotations are extracted from screening assays. For a given combination of targets, MT and ST compounds are identified, ensuring test data completeness. EXEMPLARY RESULTS & DATA: A total of 1242 MT compounds active against five or more targets and 6629 corresponding ST compounds are characterized, organized and made freely available. LIMITATIONS & NEXT STEPS: Screening campaigns typically cover a smaller target space than compounds from the medicinal chemistry literature and their activity annotations might be of lesser quality. Reported compound groups will be subjected to target set-based promiscuity analysis and predictions.

8.
Sci Rep ; 11(1): 7863, 2021 Apr 12.
Article in English | MEDLINE | ID: mdl-33846469

ABSTRACT

Compounds with defined multi-target activity (promiscuity) play an increasingly important role in drug discovery. However, the molecular basis of multi-target activity is currently poorly understood. In particular, it remains unclear whether structural features exist that generally characterize promiscuous compounds and set them apart from compounds with single-target activity. We have devised a test system using machine learning to systematically examine structural features that might characterize compounds with multi-target activity. Using this system, more than 860,000 diagnostic predictions were carried out. The analysis provided compelling evidence for the presence of structural characteristics of promiscuous compounds that were dependent on given target combinations, but not generalizable. Feature weighting and mapping identified characteristic substructures in test compounds. Taken together, these findings are relevant for the design of compounds with desired multi-target activity.

9.
Mol Inform ; 40(1): e2000196, 2021 Jan.
Article in English | MEDLINE | ID: mdl-32881355

ABSTRACT

Compounds with the ability to interact with multiple targets, also called promiscuous compounds, provide the basis for polypharmacological drug discovery. In recent years, a plethora of structural analogs with different promiscuity has been identified. Nevertheless, the molecular origins of promiscuity remain to be elucidated. In this study, we systematically extracted different structural analogs with varying promiscuity using the matched molecular pair (MMP) formalism from public biological screening and medicinal chemistry data. Care was taken to eliminate all compounds with potential false-positive activity annotations from the analysis. Promiscuity predictions were then attempted at the level of compound pairs representing promiscuity cliffs (PCs; formed by analogs with large promiscuity differences) and corresponding non-PC MMPs (analog pairs without significant promiscuity differences). To address this prediction task, different machine learning models were generated and the results were compared with single compound predictions. PCs encoding promiscuity differences were found to contain more structure-promiscuity relationship information than sets of individual promiscuous compounds. In addition, feature analysis was carried out revealing key contributions to the correct prediction of PCs and non-PC MMPs via machine learning.


Subject(s)
Machine Learning , Polypharmacology , Deep Learning , Humans , Structure-Activity Relationship
10.
Biomolecules ; 10(12), 2020 Nov 27.
Article in English | MEDLINE | ID: mdl-33260876

ABSTRACT

Predicting compounds with single- and multi-target activity and exploring origins of compound specificity and promiscuity is of high interest for chemical biology and drug discovery. We present a large-scale analysis of compound promiscuity including two major components. First, high-confidence datasets of compounds with multi- and corresponding single-target activity were extracted from biological screening data. Positive and negative assay results were taken into account and data completeness was ensured. Second, these datasets were investigated using diagnostic machine learning to systematically distinguish between compounds with multi- and single-target activity. Models built on the basis of chemical structure consistently produced meaningful predictions. These findings provided evidence for the presence of structural features differentiating promiscuous and non-promiscuous compounds. Machine learning under varying conditions using modified datasets revealed a strong influence of nearest neighbor relationship on the predictions. Many multi-target compounds were found to be more similar to other multi-target compounds than single-target compounds and vice versa, which resulted in consistently accurate predictions. The results of our study confirm the presence of structural relationships that differentiate promiscuous and non-promiscuous compounds.
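
Note: the nearest-neighbor effect described in this abstract can be illustrated with a one-nearest-neighbor classifier based on Tanimoto similarity. The sketch below is a hedged illustration, not the models used in the study; the fingerprints, labels, and function names are invented for the example.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of feature indices."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0  # 0.0 for empty sets is an assumed convention


def nearest_neighbor_label(query, training):
    """Return the class label of the training compound most similar to the query."""
    _, best_label = max(training, key=lambda item: tanimoto(query, item[0]))
    return best_label


# Hypothetical training set: (fingerprint, class label)
training = [
    (frozenset({1, 2, 3, 4}), "multi-target"),
    (frozenset({1, 2, 3, 9}), "multi-target"),
    (frozenset({5, 6, 7}), "single-target"),
    (frozenset({5, 6, 8}), "single-target"),
]

query = frozenset({1, 2, 4, 9})
predicted = nearest_neighbor_label(query, training)
print(predicted)
```

When compounds of one class are consistently most similar to other members of the same class, as reported here, even this simple rule yields accurate predictions.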


Subject(s)
Machine Learning , Pharmaceutical Preparations/chemistry , Drug Discovery , Drug Evaluation, Preclinical
11.
Mol Pharm ; 17(12): 4652-4666, 2020 Dec 07.
Article in English | MEDLINE | ID: mdl-33151084

ABSTRACT

Small molecules with multitarget activity are capable of triggering polypharmacological effects and are of high interest in drug discovery. Compared to single-target compounds, promiscuity also affects drug distribution and pharmacodynamics and alters ADMET characteristics. Features distinguishing between compounds with single- and multitarget activity are currently poorly understood. On the basis of systematic data analysis, we have assembled large sets of promiscuous compounds with activity against related or functionally distinct targets and the corresponding compounds with single-target activity. Machine learning predicted promiscuous compounds with surprisingly high accuracy. Molecular similarity analysis combined with control calculations under varying conditions revealed that accurate predictions were largely determined by structural nearest-neighbor relationships between compounds from different classes. We also found that large proportions of promiscuous compounds with activity against related or unrelated targets and corresponding single-target compounds formed analog series with distinct chemical space coverage, which further rationalized the predictions. Moreover, compounds with activity against proteins from functionally distinct classes were often active against unique targets that were not covered by other promiscuous compounds. The results of our analysis revealed that nearest-neighbor effects determined the prediction of promiscuous compounds and that preferential partitioning of compounds with single- and multitarget activity into structurally distinct analog series was responsible for such effects, hence providing a rationale for the presence of different structure-promiscuity relationships.


Subject(s)
Drug Discovery/methods , Machine Learning , Polypharmacology , Data Analysis , Molecular Structure , Structure-Activity Relationship
12.
Bioorg Med Chem Lett ; 30(18): 127420, 2020 Sep 15.
Article in English | MEDLINE | ID: mdl-32763808

ABSTRACT

A library of cathepsin S inhibitors of the dipeptide nitrile chemotype, bearing a bioisosteric sulfonamide moiety, was synthesized. Kinetic investigations were performed on four human cysteine proteases, i.e., cathepsins S, B, K and L. Compound 12, with a terminal 3-biphenyl sulfonamide substituent, was the most potent (Ki = 4.02 nM; selectivity ratio cathepsin S/K = 5.8; S/L = 67) and compound 24, with a 4'-fluoro-4-biphenyl sulfonamide substituent, was the most selective cathepsin S inhibitor (Ki = 35.5 nM; selectivity ratio cathepsin S/K = 57; S/L = 31). In silico design and biochemical evaluation emphasized the impact of the sulfonamide linkage on selectivity and a possible switch of P2 and P3 substituents with respect to the occupation of the corresponding binding sites of cathepsin S.
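
Note: the abstract does not define how its selectivity ratios are calculated. A common convention, assumed here and not confirmed by the source, is the Ki for the off-target enzyme divided by the Ki for cathepsin S. Under that assumption, the off-target Ki implied by a reported ratio can be back-calculated as sketched below; the resulting number is a derived illustration, not a value reported in the abstract.

```python
def selectivity_ratio(ki_off_target_nm, ki_target_nm):
    """Selectivity ratio under the assumed convention Ki(off-target) / Ki(target)."""
    return ki_off_target_nm / ki_target_nm


def implied_off_target_ki(ki_target_nm, ratio):
    """Off-target Ki implied by a selectivity ratio under the same assumed convention."""
    return ki_target_nm * ratio


# Values for compound 12 taken from the abstract: Ki = 4.02 nM, S/K ratio = 5.8
implied_ki_cathepsin_k = implied_off_target_ki(4.02, 5.8)
print(round(implied_ki_cathepsin_k, 1))  # implied, not reported
```

If the authors used a different definition of the ratio, the implied value would differ accordingly.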


Subject(s)
Cathepsins/antagonists & inhibitors , Dipeptides/chemical synthesis , Enzyme Inhibitors/chemical synthesis , Nitriles/chemical synthesis , Sulfonamides/chemistry , Amino Acid Sequence , Binding Sites , Cathepsin K/metabolism , Cathepsin L/metabolism , Computer Simulation , Cysteine Proteases/metabolism , Humans , Kinetics , Protein Binding , Structure-Activity Relationship
13.
Int J Mol Sci ; 21(11), 2020 May 27.
Article in English | MEDLINE | ID: mdl-32471121

ABSTRACT

(1) Background: Compounds with multitarget activity are of interest in basic research to explore molecular foundations of promiscuous binding and in drug discovery as agents eliciting polypharmacological effects. Our study has aimed to systematically identify compounds that form complexes with proteins from distinct classes and compare their bioactive conformations and molecular properties. (2) Methods: A large-scale computational investigation was carried out that combined the analysis of complex X-ray structures, ligand binding modes, compound activity data, and various molecular properties. (3) Results: A total of 515 ligands with multitarget activity were identified that included 70 organic compounds binding to proteins from different classes. These multiclass ligands (MCLs) were often flexible and surprisingly hydrophilic. Moreover, they displayed a wide spectrum of binding modes. In different target structure environments, binding shapes of MCLs were often similar, but also distinct. (4) Conclusions: Combined structural and activity data analysis identified compounds with activity against proteins with distinct structures and functions. MCLs were found to have greatly varying shape similarity when binding to different protein classes. Hence, there were no apparent canonical binding shapes indicating multitarget activity. Rather, conformational versatility characterized MCL binding.


Subject(s)
Cheminformatics , Proteins/chemistry , Proteins/metabolism , Crystallography, X-Ray , Hydrogen Bonding , Indomethacin/chemistry , Indomethacin/metabolism , Kanamycin/chemistry , Kanamycin/metabolism , Ligands , Lipids/chemistry , Protein Binding
14.
PLoS Negl Trop Dis ; 14(3): e0007755, 2020 Mar.
Article in English | MEDLINE | ID: mdl-32163418

ABSTRACT

The cysteine protease cruzipain is considered to be a validated target for therapeutic intervention in the treatment of Chagas disease. A series of 26 new compounds were designed, synthesized, and tested against the recombinant cruzain (Cz) to map its S1/S1′ subsites. The same series was evaluated on a panel of four human cysteine proteases (CatB, CatK, CatL, CatS) and Leishmania mexicana CPB, which is a potential target for the treatment of cutaneous leishmaniasis. The synthesized compounds are dipeptidyl nitriles designed based on the most promising combinations of different moieties in P1 (ten), P2 (six), and P3 (four different building blocks). Eight compounds exhibited a Ki smaller than 20.0 nM for Cz, whereas three compounds met these criteria for LmCPB. Three inhibitors had an EC50 value of ca. 4.0 µM, thus being equipotent to benznidazole in terms of antitrypanosomal effects. Our mapping approach and the respective structure-activity relationships provide insights into the specific ligand-target interactions for therapeutically relevant cysteine proteases.


Subject(s)
Cysteine Proteinase Inhibitors/pharmacology , Dipeptides/pharmacology , Leishmania mexicana/enzymology , Nitriles/pharmacology , Protozoan Proteins/antagonists & inhibitors , Trypanocidal Agents/pharmacology , Trypanosoma cruzi/enzymology , Cysteine Endopeptidases , Cysteine Proteases/metabolism , Humans
15.
Molecules ; 25(4), 2020 Feb 12.
Article in English | MEDLINE | ID: mdl-32059498

ABSTRACT

In pharmaceutical research, compounds with multitarget activity receive increasing attention. Such promiscuous chemical entities are prime candidates for polypharmacology, but also prone to causing undesired side effects. In addition, understanding the molecular basis and magnitude of multitarget activity is a stimulating topic for exploratory research. Computationally, compound promiscuity can be estimated through large-scale analysis of activity data. To these ends, it is critically important to take data confidence criteria and data consistency across different sources into consideration. The consistency aspect in particular has thus far received little investigation. Therefore, we have systematically determined activity annotations and profiles of known multitarget ligands (MTLs) on the basis of activity data from different sources. All MTLs used were confirmed by X-ray crystallography of complexes with multiple targets. One of the key questions underlying our analysis has been how MTLs act in biological screens. The results of our analysis revealed significant variations of MTL activity profiles originating from different data sources. Such variations must be carefully considered in promiscuity analysis. Our study raises awareness of these issues and provides guidance for large-scale activity data analysis.


Subject(s)
Ligands , Molecular Structure , Polypharmacology , Crystallography, X-Ray , Drug Design , Humans , Structure-Activity Relationship
16.
Molecules ; 24(22), 2019 Nov 18.
Article in English | MEDLINE | ID: mdl-31752252

ABSTRACT

Compounds with multitarget activity are of high interest for polypharmacological drug discovery. Such promiscuous compounds might be active against closely related target proteins from the same family or against distantly related or unrelated targets. Compounds with activity against distinct targets are not only of interest for polypharmacology but also to better understand how small molecules might form specific interactions in different binding site environments. We have aimed to identify compounds with activity against drug targets from different classes. To these ends, a systematic analysis of public biological screening data was carried out. Care was taken to exclude compounds from further consideration that were prone to experimental artifacts and false positive activity readouts. Extensively assayed compounds were identified and found to contain molecules that were consistently inactive in all assays, active against a single target, or promiscuous. The latter included more than 1000 compounds that were active against 10 or more targets from different classes. These multiclass ligands were further analyzed and exemplary compounds were found in X-ray structures of complexes with distinct targets. Our collection of multiclass ligands should be of interest for pharmaceutical applications and further exploration of binding characteristics at the molecular level. Therefore, these highly promiscuous compounds are made publicly available.


Subject(s)
Drug Delivery Systems , Drug Discovery , Polypharmacology , Humans , Ligands , Proteins/drug effects , Structure-Activity Relationship
17.
J Med Chem ; 62(23): 10497-10525, 2019 Dec 12.
Article in English | MEDLINE | ID: mdl-31361135

ABSTRACT

Cysteine proteases are important targets for the discovery of novel therapeutics for many human diseases. From parasitic diseases to cancer, cysteine proteases follow a common mechanism, the formation of an encounter complex with subsequent nucleophilic reactivity of the catalytic cysteine thiol group toward the carbonyl carbon of a peptide bond or an electrophilic group of an inhibitor. Modulation of target enzymes occurs preferably by covalent modification, which imposes challenges in balancing cross-reactivity and selectivity. Given the resurgence of irreversible covalent inhibitors, can they impair off-target effects or are reversible covalent inhibitors a better route to selectivity? This Perspective addresses how small molecule inhibitors may achieve selectivity for different cathepsins, cruzain, rhodesain, and falcipain-2. We discuss target- and ligand-based designs emphasizing repurposing inhibitors from one cysteine protease to others.


Subject(s)
Cathepsins/metabolism , Cysteine Proteases/metabolism , Cysteine Proteinase Inhibitors/pharmacology , Animals , Protein Binding