1.
Int J Surg ; 2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38833337

ABSTRACT

BACKGROUND: Warfarin is a common oral anticoagulant, and its effects vary widely among individuals. Numerous dose-prediction algorithms have been reported based on cross-sectional data generated via multiple linear regression or machine learning. This study aimed to construct an information fusion perturbation theory and machine learning prediction model of warfarin blood levels based on clinical longitudinal data from cardiac surgery patients. METHODS AND MATERIAL: The data of 246 patients were obtained from electronic medical records. Continuous variables were processed by calculating the distance of the raw data with the moving average (MA ∆vki(sj)), and categorical variables in different attribute groups were processed using Euclidean distance (ED ǁ∆vk(sj)ǁ). Regression and classification analyses were performed on the raw data, MA ∆vki(sj), and ED ǁ∆vk(sj)ǁ. Different machine-learning algorithms were chosen for the STATISTICA and WEKA software. RESULTS: The random forest (RF) algorithm was the best for predicting continuous outputs using the raw data. The correlation coefficients of the RF algorithm were 0.978 and 0.595 for the training and validation sets, respectively, and the mean absolute errors were 0.135 and 0.362 for the training and validation sets, respectively. The proportion of ideal predictions of the RF algorithm was 59.0%. General discriminant analysis (GDA) was the best algorithm for predicting the categorical outputs using the MA ∆vki(sj) data. The GDA algorithm's total true positive rate (TPR) was 95.4% and 95.6% for the training and validation sets, respectively, with MA ∆vki(sj) data. CONCLUSIONS: An information fusion perturbation theory and machine learning model for predicting warfarin blood levels was established. 
A model based on the RF algorithm could be used to predict the target international normalized ratio (INR), and a model based on the GDA algorithm could be used to predict the probability of being within the target INR range under different clinical scenarios.
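The two preprocessing steps named in the methods (deviation from a moving average for continuous variables, Euclidean distance for grouped categorical variables) can be illustrated with a minimal sketch. The abstract does not give the window length, the encoding of categorical variables, or the reference vector, so `window`, `reference`, and both function names are assumptions made for illustration only, not the authors' implementation.

```python
import math

def moving_average_deviation(values, window=3):
    """Deviation of each observation from the trailing moving average.

    `window` is a hypothetical choice; the abstract does not state it.
    The average at position i covers up to `window` values ending at i.
    """
    deviations = []
    for i, v in enumerate(values):
        start = max(0, i - window + 1)
        past = values[start:i + 1]
        deviations.append(v - sum(past) / len(past))
    return deviations

def euclidean_deviation(values, reference):
    """Euclidean norm of the difference between a numerically encoded
    group of variables and a reference vector (assumed encoding)."""
    return math.sqrt(sum((v - r) ** 2 for v, r in zip(values, reference)))

# Example: a short series of INR-like measurements for one patient.
inr_series = [1.0, 2.0, 3.0]
print(moving_average_deviation(inr_series, window=2))
print(euclidean_deviation([3, 4], [0, 0]))
```

Each deviation replaces the raw value as a model input, so the model sees how far a measurement departs from the patient's recent trend rather than its absolute level.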

2.
Beilstein J Nanotechnol ; 15: 535-555, 2024.
Article in English | MEDLINE | ID: mdl-38774585

ABSTRACT

Neurodegenerative diseases are characterized by slowly progressing neuronal cell death. Conventional drug treatment strategies often fail because of poor solubility, low bioavailability, and the inability of the drugs to effectively cross the blood-brain barrier. Therefore, the development of new neurodegenerative disease drugs (NDDs) requires immediate attention. Nanoparticle (NP) systems are of increasing interest for transporting NDDs to the central nervous system. However, discovering effective nanoparticle neuronal disease drug delivery systems (N2D3Ss) is challenging because of the vast number of combinations of NP and NDD compounds, as well as the various assays involved. Artificial intelligence/machine learning (AI/ML) algorithms have the potential to accelerate this process by predicting the most promising NDD and NP candidates for assaying. Nevertheless, the relatively limited amount of reported data on N2D3S activity compared to assayed NDDs makes AI/ML analysis challenging. In this work, the IFPTML technique, which combines information fusion (IF), perturbation theory (PT), and machine learning (ML), was employed to address this challenge. Initially, we conducted the fusion into a unified dataset comprising 4403 NDD assays from ChEMBL and 260 NP cytotoxicity assays from journal articles. Through a resampling process, three new working datasets were generated, each containing 500,000 cases. We utilized linear discriminant analysis (LDA) along with artificial neural network (ANN) algorithms, such as multilayer perceptron (MLP) and deep learning networks (DLN), to construct linear and non-linear IFPTML models. The IFPTML-LDA models exhibited sensitivity (Sn) and specificity (Sp) values in the range of 70% to 73% (>375,000 training cases) and 70% to 80% (>125,000 validation cases), respectively. In contrast, the IFPTML-MLP and IFPTML-DLN achieved Sn and Sp values in the range of 85% to 86% for both training and validation series. 
Additionally, IFPTML-ANN models showed an area under the receiver operating curve (AUROC) of approximately 0.93 to 0.95. These results indicate that the IFPTML models could serve as valuable tools in the design of drug delivery systems for neurosciences.
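For readers unfamiliar with the reported metrics, sensitivity (Sn) and specificity (Sp) follow directly from the confusion-matrix counts of a binary classifier. The sketch below uses the standard definitions; the function name and the toy labels are illustrative assumptions and are unrelated to the datasets described in the abstract.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (Sn, Sp) for binary labels, where 1 = active and 0 = inactive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sn = tp / (tp + fn)  # fraction of true actives recovered
    sp = tn / (tn + fp)  # fraction of true inactives recovered
    return sn, sp

# Toy example: one of two actives is missed, both inactives are correct.
print(sensitivity_specificity([1, 1, 0, 0], [1, 0, 0, 0]))
```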

3.
Biomed Pharmacother ; 174: 116602, 2024 May.
Article in English | MEDLINE | ID: mdl-38636396

ABSTRACT

The development of new molecules for the treatment of calmodulin-related cardiovascular or neurodegenerative diseases is an interesting goal. In this work, we introduce a novel strategy with four main steps: (1) chemical synthesis of target molecules, (2) Förster Resonance Energy Transfer (FRET) biosensor development and in vitro biological assay of new derivatives, (3) cheminformatics model development and in vivo activity prediction, and (4) docking studies. This strategy is illustrated with a case study. First, a series of 4-substituted Riluzole derivatives 1-3 were synthesized through a strategy that involves the construction of the 4-bromoriluzole framework and its further functionalization via palladium catalysis or organolithium chemistry. Next, a FRET biosensor for monitoring Ca2+-dependent CaM-ligand interactions was developed and used for the in vitro assay of Riluzole derivatives. In particular, the best inhibition (80%) was observed for 4-methoxyphenylriluzole 2b. In addition, we trained and validated a new Networks Invariant, Information Fusion, Perturbation Theory, and Machine Learning (NIFPTML) model for predicting probability profiles of in vivo biological activity parameters in different regions of the brain. Next, we used this model to predict the in vivo activity of the compounds experimentally studied in vitro. Last, a docking study conducted on Riluzole and its derivatives provided valuable insights into their binding conformations with the target protein, involving calmodulin and the SK4 channel. This new combined strategy may be useful to reduce assay costs (animals, materials, time, and human resources) in the drug discovery process of calmodulin inhibitors.


Subject(s)
Biosensing Techniques , Calmodulin , Molecular Docking Simulation , Neuroprotective Agents , Riluzole , Calmodulin/antagonists & inhibitors , Calmodulin/metabolism , Biosensing Techniques/methods , Neuroprotective Agents/pharmacology , Neuroprotective Agents/chemical synthesis , Neuroprotective Agents/chemistry , Riluzole/pharmacology , Riluzole/chemical synthesis , Riluzole/chemistry , Fluorescence Resonance Energy Transfer , Animals , Humans , Machine Learning
4.
J Chem Inf Model ; 64(6): 1841-1852, 2024 Mar 25.
Article in English | MEDLINE | ID: mdl-38466369

ABSTRACT

The Flaviviridae family consists of single-stranded positive-sense RNA viruses and contains the genera Flavivirus, Hepacivirus, Pegivirus, and Pestivirus. Currently, there is an outbreak of viral diseases caused by this family affecting millions of people worldwide, leading to significant morbidity and mortality rates. Advances in computational chemistry have greatly facilitated the discovery of novel drugs and treatments for diseases associated with this family. Chemoinformatic techniques, such as the perturbation theory machine learning method, have played a crucial role in developing new approaches based on ML models that can effectively aid drug discovery. The IFPTML models have shown their capability to handle, classify, and process large data sets with high specificity. The results obtained from different models indicate that this methodology is proficient in processing the data, resulting in a reduction of the false positive rate by 4.25%, along with an accuracy of 83% and reliability of 92%. These values suggest that the model can serve as a computational tool in assisting drug discovery efforts and the development of new treatments against Flaviviridae family diseases.


Subject(s)
Flaviviridae Infections , Flaviviridae , Humans , Flaviviridae/genetics , Reproducibility of Results , Drug Discovery , Computer Simulation
5.
Phytomedicine ; 128: 155479, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38493714

ABSTRACT

BACKGROUND: Warfarin is a widely prescribed anticoagulant in the clinic. Its effect varies considerably between individuals, and many factors contribute to this variability. Mathematical models can quantify the impact of these factors on individual variability. PURPOSE: The aim is to comprehensively analyze advanced warfarin dosing algorithms based on pharmacometrics and machine learning models of personalized warfarin dosage. METHODS: A bibliometric analysis of the literature retrieved from PubMed and Scopus was performed using VOSviewer. The relevant literature that reported the calculation of a precise warfarin dosage was retrieved from the database. The multiple linear regression (MLR) algorithm was excluded because a recent systematic review that mainly reviewed this algorithm has been reported. The terms quantitative systems pharmacology, mechanistic model, physiologically based pharmacokinetic model, artificial intelligence, machine learning, pharmacokinetic, pharmacodynamic, pharmacokinetics, pharmacodynamics, and warfarin were entered into the PubMed query box as MeSH Terms or as terms appearing in Title/Abstract; the filters "humans" and "English" were then applied to retrieve the literature. RESULTS: Bibliometric analysis revealed important co-occurring MeSH and index keywords. Further, the United States, China, and the United Kingdom were among the top countries contributing in this domain. Some studies have established personalized warfarin dosage models using pharmacometrics and machine learning-based algorithms. There were 54 related studies, including 14 pharmacometric models, 31 artificial intelligence models, and 9 model evaluations. Each model has its advantages and disadvantages. The pharmacometric model contains biological or pharmacological mechanisms in structure. The process of pharmacometric model development is very time- and labor-intensive. 
Machine learning is a purely data-driven approach; its parameters are more mathematical and have less biological interpretation. However, it is faster, more efficient, and less time-consuming. Most published models of machine learning algorithms were established based on cross-sectional data sourced from the database. CONCLUSION: Future research on personalized warfarin medication should focus on combining the advantages of machine learning and pharmacometrics algorithms to establish a more robust warfarin dosage algorithm. Randomized controlled trials should be performed to evaluate the established algorithm of warfarin dosage. Moreover, a more user-friendly and accessible warfarin precision medicine platform should be developed.


Subject(s)
Anticoagulants , Machine Learning , Precision Medicine , Warfarin , Warfarin/pharmacokinetics , Warfarin/pharmacology , Anticoagulants/pharmacokinetics , Anticoagulants/pharmacology , Anticoagulants/administration & dosage , Humans , Precision Medicine/methods , Bibliometrics , Algorithms
6.
Adv Sci (Weinh) ; 11(13): e2305177, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38258479

ABSTRACT

Familial hypercholesterolemia (FH) is an inherited metabolic disease affecting cholesterol metabolism, with 90% of cases caused by mutations in the LDL receptor gene (LDLR), primarily missense mutations. This study aims to integrate six commonly used predictive software to create a new model for predicting LDLR mutation pathogenicity and mapping hot spot residues. Six predictive-software are selected: Polyphen-2, SIFT, MutationTaster, REVEL, VARITY, and MLb-LDLr. Software accuracy is tested with the characterized variants annotated in ClinVar and, by bioinformatic and machine learning techniques all models are integrated into a more accurate one. The resulting optimized model presents a specificity of 96.71% and a sensitivity of 98.36%. Hot spot residues with high potential of pathogenicity appear across all domains except for the signal peptide and the O-linked domain. In addition, translating this information into 3D structure of the LDLr highlights potentially pathogenic clusters within the different domains, which may be related to specific biological function. The results of this work provide a powerful tool to classify LDLR pathogenic variants. Moreover, an open-access guide user interface (OptiMo-LDLr) is provided to the scientific community. This study shows that combination of several predictive software results in a more accurate prediction to help clinicians in FH diagnosis.


Subject(s)
Hyperlipoproteinemia Type II , Humans , Phenotype , Mutation , Hyperlipoproteinemia Type II/diagnosis , Hyperlipoproteinemia Type II/genetics , Receptors, LDL/genetics , Receptors, LDL/metabolism , Computer Simulation
7.
J Cheminform ; 16(1): 9, 2024 Jan 23.
Article in English | MEDLINE | ID: mdl-38254200

ABSTRACT

The enantioselective Brønsted acid-catalyzed α-amidoalkylation reaction is a useful procedure for the production of new drugs and natural products. In this context, Chiral Phosphoric Acid (CPA) catalysts are versatile catalysts for this type of reaction. The selection and design of new CPA catalysts for different enantioselective reactions has a dual interest because new CPA catalysts (tools) and chiral drugs or materials (products) can be obtained. However, this process is difficult and time-consuming if approached from an experimental trial-and-error perspective. In this work, a Heuristic Perturbation-Theory and Machine Learning (HPTML) algorithm was used to seek a predictive model for CPA catalyst performance in terms of enantioselectivity in α-amidoalkylation reactions with R2 = 0.96 overall for training and validation series. It involved a Monte Carlo sampling of > 100,000 pairs of query and reference reactions. In addition, the computational and experimental investigation of a new set of intermolecular α-amidoalkylation reactions using BINOL-derived N-triflylphosphoramides as CPA catalysts is reported as a case study. The model was implemented in a web server called MATEO: InterMolecular Amidoalkylation Theoretical Enantioselectivity Optimization, available online at: https://cptmltool.rnasa-imedir.com/CPTMLTools-Web/mateo . This new user-friendly online computational tool would enable sustainable optimization of reaction conditions that could lead to the design of new CPA catalysts along with new organic synthesis products.

8.
J Chem Inf Model ; 62(16): 3928-3940, 2022 08 22.
Article in English | MEDLINE | ID: mdl-35946598

ABSTRACT

In this work, the SOFT.PTML tool has been used to pre-process a ChEMBL dataset of pre-clinical assays of antileishmanial compound candidates. A comparative study of different ML algorithms, such as logistic regression (LOGR), support vector machine (SVM), and random forests (RF), has shown that the IFPTML-LOGR model presents excellent values of specificity and sensitivity (81-98%) in training and validation series. The use of this software has been illustrated with a practical case study focused on a series of 28 derivatives of 2-acylpyrroles 5a,b, obtained through a Pd(II)-catalyzed C-H radical acylation of pyrroles. Their in vitro leishmanicidal activity against visceral (L. donovani) and cutaneous (L. amazonensis) leishmaniasis was evaluated, finding that compounds 5bc (IC50 = 30.87 µM, SI > 10.17) and 5bd (IC50 = 16.87 µM, SI > 10.67) were approximately 6-fold more selective than the drug of reference (miltefosine) in in vitro assays against L. amazonensis promastigotes. In addition, most of the compounds showed low cytotoxicity, CC50 > 100 µg/mL in J774 cells. Interestingly, the IFPTML-LOGR model correctly predicts the relative biological activity of this series of acylpyrroles. A computational high-throughput screening (cHTS) study of 2-acylpyrroles 5a,b has been performed calculating >20,700 activity scores vs a large space of 647 assays involving multiple Leishmania species, cell lines, and potential target proteins. Overall, the study demonstrates that the SOFT.PTML all-in-one strategy is useful to obtain IFPTML models in a friendly interface, making the work easier and faster than before. The present work also points to 2-acylpyrroles as new lead compounds worthy of further optimization as antileishmanial hits.


Subject(s)
Antiprotozoal Agents , Leishmania , Antiprotozoal Agents/pharmacology , Cell Line
9.
JACC Basic Transl Sci ; 6(11): 815-827, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34869944

ABSTRACT

Untreated familial hypercholesterolemia (FH) leads to atherosclerosis and early cardiovascular disease. Mutations in the low-density lipoprotein receptor (LDLr) gene constitute the major cause of FH, and the high number of mutations already described in the LDLr makes necessary cascade screening or in vitro functional characterization to provide a definitive diagnosis. Implementation of high-predicting capacity software constitutes a valuable approach for assessing pathogenicity of LDLr variants to help in the early diagnosis and management of FH disease. This work provides a reliable machine learning model to accurately predict the pathogenicity of LDLr missense variants with specificity of 92.5% and sensitivity of 91.6%.

10.
Int J Mol Sci ; 22(21)2021 Oct 26.
Article in English | MEDLINE | ID: mdl-34768951

ABSTRACT

The theoretical prediction of drug-decorated nanoparticles (DDNPs) has become a very important task in medical applications. For the current paper, Perturbation Theory Machine Learning (PTML) models were built to predict the probability of different pairs of drugs and nanoparticles creating DDNP complexes with anti-glioblastoma activity. PTML models use the perturbations of molecular descriptors of drugs and nanoparticles as inputs in experimental conditions. The raw dataset was obtained by mixing the nanoparticle experimental data with drug assays from the ChEMBL database. Ten types of machine learning methods have been tested. Only 41 features have been selected for 855,129 drug-nanoparticle complexes. The best model was obtained with the Bagging classifier, an ensemble meta-estimator based on 20 decision trees, with an area under the receiver operating characteristic curve (AUROC) of 0.96, and an accuracy of 87% (test subset). This model could be useful for the virtual screening of nanoparticle-drug complexes in glioblastoma. All the calculations can be reproduced with the datasets and Python scripts, which are freely available as a GitHub repository from the authors.
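The AUROC quoted above can be read as the probability that a randomly chosen active complex receives a higher predicted score than a randomly chosen inactive one. The following is a minimal pairwise sketch of that definition, with ties counted as one half. The function name and the toy scores are assumptions for illustration; the pairwise loop is quadratic, so it is only practical for small examples and is not the code from the authors' repository.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positive and
    negative scores (equivalent to the Mann-Whitney U statistic, normalized)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(positives) * len(negatives))

# Toy example: three of the four positive/negative pairs are ordered correctly.
print(auroc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```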


Subject(s)
Antineoplastic Agents/administration & dosage , Brain Neoplasms/drug therapy , Drug Delivery Systems , Glioblastoma/drug therapy , Machine Learning , Nanoparticles , Databases, Chemical , Databases, Pharmaceutical , Drug Carriers/administration & dosage , Drug Design , Drug Screening Assays, Antitumor , Humans , Nanoparticles/administration & dosage , User-Computer Interface
11.
Eur J Med Chem ; 220: 113458, 2021 Aug 05.
Article in English | MEDLINE | ID: mdl-33901901

ABSTRACT

The development of new molecules for the treatment of leishmaniasis, a neglected parasitic disease, is urgent as current anti-leishmanial therapeutics are hampered by drug toxicity and resistance. The pyrrolo[1,2-b]isoquinoline core was selected as starting point, and palladium-catalyzed Heck-initiated cascade reactions were developed for the synthesis of a series of C-10 substituted derivatives. Their in vitro leishmanicidal activity against visceral (L. donovani) and cutaneous (L. amazonensis) leishmaniasis was evaluated. The best activity was found, in general, for the 10-arylmethyl substituted pyrroloisoquinolines. In particular, 2ad (IC50 = 3.30 µM, SI > 77.01) and 2bb (IC50 = 3.93 µM, SI > 58.77) were approximately 10-fold more potent and selective than the drug of reference (miltefosine) against L. amazonensis in in vitro promastigote assays, while 2ae was the most active compound in the in vitro amastigote assays (IC50 = 33.59 µM, SI > 8.93). Notably, almost all compounds showed low cytotoxicity, CC50 > 100 µg/mL in J774 cells, the highest tested dose. In addition, we have developed the first Perturbation Theory Machine Learning (PTML) algorithm able to predict simultaneously multiple biological activity parameters (IC50, Ki, etc.) vs. any Leishmania species and target protein, with high values of specificity (>98%) and sensitivity (>90%) in both training and validation series. Therefore, this model may be useful to reduce time and assay costs (material and human resources) in the drug discovery process.


Subject(s)
Antiprotozoal Agents/pharmacology , Isoquinolines/pharmacology , Leishmania/drug effects , Leishmaniasis/drug therapy , Palladium/chemistry , Algorithms , Antiprotozoal Agents/chemical synthesis , Antiprotozoal Agents/chemistry , Dose-Response Relationship, Drug , Isoquinolines/chemical synthesis , Isoquinolines/chemistry , Leishmaniasis/parasitology , Molecular Structure , Parasitic Sensitivity Tests , Structure-Activity Relationship
12.
Bioorg Chem ; 109: 104745, 2021 04.
Article in English | MEDLINE | ID: mdl-33640629

ABSTRACT

The development of antibacterial resistance is becoming a crisis. In this sense, natural products play a fundamental role in the discovery of antibacterial agents with diverse mechanisms of action. Phytochemical investigation of Cissus incisa leaves led to isolation and characterization of the ceramides mixture (1): (8E)-2-(tritriacont-9-enoyl amino)-1,3,4-octadecanetriol-8-ene (1-I); (8E)-2-(2',3'-dihydroxyoctacosanoyl amino)-1,3,4-octadecanetriol-8-ene (1-II); (8E)-2-(2'-hydroxyheptacosanoyl amino)-1,3,4-octadecanetriol-8-ene (1-III); and (8E)-2-(-2'-hydroxynonacosanoyl amino)-1,3,4-octadecanetriol-8-ene (1-IV). This is the first report of the ceramides (1-I), (1-II), and (1-IV). The structures were elucidated using NMR and mass spectrometry analyses. Antibacterial activity of ceramides (1) and acetylated derivatives (2) was evaluated against nine multidrug-resistant bacteria by the microdilution method. (1) showed the best results against Gram-negative bacteria, mainly against carbapenem-resistant Acinetobacter baumannii with MIC = 50 µg/mL. Structure-activity analysis and molecular docking revealed interactions of plant ceramides with membrane proteins and with enzymes associated with biological membranes of Gram-negative bacteria, through hydrogen bonding of functional groups. A vesicular contents release assay showed the capacity of (1) to disturb membrane permeability, detected as an increase in probe fluorescence over time. The membrane disruption is not caused by a lytic action of the ceramides on cell membranes, according to in vitro hemolytic activity results. Combining SAR analysis, bioinformatics and biophysical techniques, and experimental tests, it was possible to explain the antibacterial action of these natural ceramides.


Subject(s)
Acinetobacter baumannii/drug effects , Anti-Bacterial Agents/pharmacology , Ceramides/pharmacology , Cissus/chemistry , Molecular Docking Simulation , Anti-Bacterial Agents/chemistry , Anti-Bacterial Agents/isolation & purification , Ceramides/chemistry , Ceramides/isolation & purification , Dose-Response Relationship, Drug , Drug Resistance, Bacterial/drug effects , Microbial Sensitivity Tests , Molecular Structure , Structure-Activity Relationship
13.
ACS Chem Neurosci ; 12(1): 203-215, 2021 01 06.
Article in English | MEDLINE | ID: mdl-33347281

ABSTRACT

This work describes the synthesis and pharmacological evaluation of 2-furoyl-based Melanostatin (MIF-1) peptidomimetics as dopamine D2 modulating agents. Eight novel peptidomimetics were tested for their ability to enhance the maximal effect of tritiated N-propylapomorphine ([3H]-NPA) at D2 receptors (D2R). In this series, 2-furoyl-l-leucylglycinamide (6a) produced a statistically significant increase in the maximal [3H]-NPA response at 10 pM (11 ± 1%), comparable to the effect of MIF-1 (18 ± 9%) at the same concentration. This result supports previous evidence that the replacement of the proline residue by heteroaromatic scaffolds is tolerated at the allosteric binding site of MIF-1. Biological assays performed for peptidomimetic 6a using cortex neurons from 19-day-old Wistar-Kyoto rat embryos suggest that 6a displays no neurotoxicity up to 100 µM. Overall, the pharmacological and toxicological profile and the structural simplicity of 6a make this peptidomimetic a potential lead compound for further development and optimization, paving the way for the development of novel modulating agents of D2R suitable for the treatment of CNS-related diseases. Additionally, the pharmacological and biological data reported herein, along with >20 000 outcomes of preclinical assays, were used to seek a general model to predict the allosteric modulatory potential of molecular candidates for a myriad of target receptors, organisms, cell lines, and biological activity parameters based on perturbation theory (PT) ideas and machine learning (ML) techniques, abbreviated as ALLOPTML. ALLOPTML shows high specificity Sp = 89.2/89.4%, sensitivity Sn = 71.3/72.2%, and accuracy Ac = 86.1%/86.4% in training/validation series, respectively. 
To the best of our knowledge, ALLOPTML is the first general-purpose chemoinformatic tool using a PTML-based model for the multioutput and multicondition prediction of allosteric compounds, which is expected to save both time and resources during the early drug discovery of allosteric modulators.


Subject(s)
MSH Release-Inhibiting Hormone , Macrophage Migration-Inhibitory Factors , Peptidomimetics , Allosteric Regulation , Animals , Dopamine , Intramolecular Oxidoreductases , MSH Release-Inhibiting Hormone/pharmacology , Machine Learning , Peptidomimetics/pharmacology , Rats , Rats, Inbred WKY
14.
ACS Omega ; 5(42): 27211-27220, 2020 Oct 27.
Article in English | MEDLINE | ID: mdl-33134682

ABSTRACT

Sarcomas are a group of malignant neoplasms of connective tissue with a different etiology than carcinomas. The efforts to discover new drugs with antisarcoma activity have generated large datasets of multiple preclinical assays with different experimental conditions. For instance, the ChEMBL database contains outcomes of 37,919 different antisarcoma assays with 34,955 different chemical compounds. Furthermore, the experimental conditions reported in this dataset include 157 types of biological activity parameters, 36 drug targets, 43 cell lines, and 17 assay organisms. Considering this information, we propose combining perturbation theory (PT) principles with machine learning (ML) to develop a PTML model to predict antisarcoma compounds. PTML models use one function of reference that measures the probability of a drug being active under certain conditions (protein, cell line, organism, etc.). In this paper, we used a linear discriminant analysis and neural network to train and compare PT and non-PT models. All the explored models have an accuracy of 89.19-95.25% for training and 89.22-95.46% in validation sets. PTML-based strategies have similar accuracy but generate simpler models. Therefore, they may become a versatile tool for predicting antisarcoma compounds.
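The two ingredients mentioned in this abstract, a function of reference and perturbation terms, can be sketched as follows. This is an illustrative assumption about one common way such inputs are built: the reference value is taken as the observed fraction of active cases under a given condition, and the perturbation term as the deviation of a molecular descriptor from its mean under that condition. The record keys (`descriptor`, `condition`, `active`) and the function name are hypothetical; the abstract does not specify the exact operators used.

```python
from collections import defaultdict

def reference_and_perturbation(records):
    """For each record, compute:
    - f_ref: fraction of active cases sharing the record's assay condition
    - delta: descriptor value minus the mean descriptor under that condition
    `records` is a list of dicts with hypothetical keys
    'descriptor' (float), 'condition' (str), and 'active' (0 or 1).
    """
    by_condition = defaultdict(list)
    for r in records:
        by_condition[r["condition"]].append(r)

    features = []
    for r in records:
        group = by_condition[r["condition"]]
        mean_descriptor = sum(g["descriptor"] for g in group) / len(group)
        f_ref = sum(g["active"] for g in group) / len(group)
        features.append({"f_ref": f_ref, "delta": r["descriptor"] - mean_descriptor})
    return features

# Toy data: two assays under condition "A", one under condition "B".
toy = [
    {"descriptor": 2.0, "condition": "A", "active": 1},
    {"descriptor": 4.0, "condition": "A", "active": 0},
    {"descriptor": 5.0, "condition": "B", "active": 1},
]
for row in reference_and_perturbation(toy):
    print(row)
```

The resulting `f_ref` and `delta` columns would then be passed to a classifier such as linear discriminant analysis or a neural network, which is the stage the abstract compares.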

15.
Curr Top Med Chem ; 20(25): 2326-2337, 2020.
Article in English | MEDLINE | ID: mdl-32938352

ABSTRACT

By combining Machine Learning (ML) methods with Perturbation Theory (PT), it is possible to develop predictive models for a variety of response targets. Such a combination, often known as Perturbation Theory Machine Learning (PTML) modeling, comprises a set of techniques that can handle various physical and chemical properties of different organisms and of complex biological or material systems under multiple input conditions. In so doing, these techniques effectively integrate a manifold of diverse chemical and biological data into a single computational framework that can then be applied for screening lead chemicals as well as to find clues for improving the targeted response(s). PTML models have thus been extremely helpful in drug or material design efforts and found to be predictive and applicable across a broad space of systems. After a brief outline of the applied methodology, this work reviews the different uses of PTML in Medicinal Chemistry, as well as in other applications. Finally, we cover the development of software available nowadays for setting up PTML models from large datasets.


Subject(s)
Databases, Chemical , Machine Learning , Software , Chemistry, Pharmaceutical , Models, Molecular
16.
Biology (Basel) ; 9(8)2020 Jul 30.
Article in English | MEDLINE | ID: mdl-32751710

ABSTRACT

Drug-decorated nanoparticles (DDNPs) have important medical applications. The current work combined Perturbation Theory with Machine Learning and Information Fusion (PTMLIF). Thus, PTMLIF models were proposed to predict the probability of nanoparticle-compound/drug complexes having antimalarial activity (against Plasmodium). The aim is to save experimental resources and time by using a virtual screening for DDNPs. The raw data was obtained by the fusion of experimental data for nanoparticles with compound chemical assays from the ChEMBL database. The inputs for the eight Machine Learning classifiers were transformed features of drugs/compounds and nanoparticles as perturbations of molecular descriptors in specific experimental conditions (experiment-centered features). The resulting dataset contains 107 input features and 249,992 examples. The best classification model was provided by Random Forest, with 27 selected features of drugs/compounds and nanoparticles in all experimental conditions considered. The high performance of the model was demonstrated by the mean Area Under the Receiver Operating Characteristics (AUC) in a test subset with a value of 0.9921 ± 0.000244 (10-fold cross-validation). The results demonstrated the power of information fusion of the experimental-centered features of drugs/compounds and nanoparticles for the prediction of nanoparticle-compound antimalarial activity. The scripts and dataset for this project are available in the open GitHub repository.
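The "mean ± spread over 10-fold cross-validation" figure reported above comes from a procedure that can be sketched generically. The version below splits sample indices into k folds, scores a model on each held-out fold, and summarizes the fold scores with a mean and sample standard deviation. The helper names and the `train_and_score` callback are hypothetical, and the round-robin fold assignment is a simplification; the authors' own scripts are in their GitHub repository and may differ, for example by shuffling or stratifying.

```python
import statistics

def k_fold_indices(n_samples, k):
    """Assign sample indices to k folds in round-robin order (no shuffling)."""
    folds = [[] for _ in range(k)]
    for i in range(n_samples):
        folds[i % k].append(i)
    return folds

def cross_validate(n_samples, k, train_and_score):
    """Call train_and_score(train_idx, test_idx) once per fold and return
    (mean, sample standard deviation) of the k fold scores."""
    folds = k_fold_indices(n_samples, k)
    scores = []
    for held_out in folds:
        train = [i for fold in folds if fold is not held_out for i in fold]
        scores.append(train_and_score(train, held_out))
    return statistics.mean(scores), statistics.stdev(scores)

# Toy callback: the "score" is just the held-out fraction of the data.
mean_score, sd_score = cross_validate(
    10, 5, lambda train, test: len(test) / (len(train) + len(test))
)
print(mean_score, sd_score)
```

In practice the callback would fit a classifier on the training indices and return a metric such as AUC on the held-out indices.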

17.
Nanoscale ; 12(25): 13471-13483, 2020 Jul 02.
Article in English | MEDLINE | ID: mdl-32613998

ABSTRACT

Nanoparticles (NPs) decorated with coating agents (polymers, gels, proteins, etc.) form Nanoparticle Drug Delivery Systems (DDNS), which are of high interest in nanotechnology and biomaterials science. There have been increasing reports of experimental data sets of biological activity, toxicity, and delivery properties of DDNS. However, these data sets are still dispersed and not as large as the datasets of DDNS components (NP and drugs). This has prompted researchers to train Machine Learning (ML) algorithms that are able to design new DDNS based on the properties of their components. However, most ML models reported to date predict the specific activities of NP or drugs against a given target or cell line. In this paper, we combine Perturbation Theory and Machine Learning (PTML algorithm) to train a model that is able to predict the best components (NP, coating agent, and drug) for DDNS design. To do so, we downloaded a dataset of >30 000 preclinical assays of drugs from ChEMBL. We also downloaded an NP data set formed by preclinical assays of coated Metal Oxide Nanoparticles (MONPs) from public sources. Both the drug and NP datasets of preclinical assays cover multiple assay conditions that can be listed as two arrays, namely, cjdrug and cjNP. The cjdrug array includes >504 biological activity parameters (c0drug), >340 target proteins (c1drug), >650 types of cells (c2drug), >120 assay organisms (c3drug), and >60 assay strains (c4drug). On the other hand, the cjNP array includes 3 biological activity parameters (c0NP), 40 types of proteins (c1NP), 10 shapes of nanoparticles (c2NP), 6 assay media (c3NP), and 12 coating agents (c4NP). After downloading, we pre-processed both data sets by separately calculating PT operators that are able to account for changes (perturbations) in the drug, coating agent, and NP chemical structure and/or physicochemical properties as well as in the assay conditions. 
Next, we carried out an information fusion process to form a final data set of more than 500 000 DDNS (drug + MONP pairs). We also trained other linear and non-linear PTML models using RStudio scripts for comparative purposes. To the best of our knowledge, this is the first multi-label PTML model that is useful for the selection of drugs, coating agents, and metal or metal-oxide nanoparticles to be assembled in order to design new DDNS with optimal activity/toxicity profiles.
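
The abstract does not give the form of its PT operators. One form often used for this purpose is the deviation of a descriptor from its mean over all assays that share the same condition. The sketch below illustrates only that assumed form; the function names, the `logp` descriptor, and the `cell_line` condition are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def condition_means(records, descriptor, condition):
    """Mean of `descriptor` over all records sharing each value of `condition`."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        key = rec[condition]
        sums[key] += rec[descriptor]
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def pt_operator(records, descriptor, condition):
    """Assumed PT operator: descriptor value minus its condition-wise mean."""
    means = condition_means(records, descriptor, condition)
    return [rec[descriptor] - means[rec[condition]] for rec in records]


# Hypothetical toy assays: two under condition "A", one under condition "B".
assays = [
    {"logp": 1.0, "cell_line": "A"},
    {"logp": 3.0, "cell_line": "A"},
    {"logp": 2.0, "cell_line": "B"},
]
deviations = pt_operator(assays, "logp", "cell_line")
```

Each deviation measures how far one case sits from the typical case under the same assay condition, which is one way a single model could compare assays run under different conditions.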


Subject(s)
Nanoparticles , Pharmaceutical Preparations , Algorithms , Drug Liberation , Machine Learning
18.
Mol Pharm ; 17(7): 2612-2627, 2020 07 06.
Article in English | MEDLINE | ID: mdl-32459098

ABSTRACT

Nanosystems are gaining momentum in pharmaceutical sciences because of the wide variety of possibilities for designing these systems to have specific functions. Specifically, studies of new cancer cotherapy drug-vitamin release nanosystems (DVRNs) including anticancer compounds and vitamins or vitamin derivatives have revealed encouraging results. However, the number of possible combinations of design and synthesis conditions is remarkably high. In addition, a large number of anticancer compounds and vitamin derivatives have already been assayed, but notably fewer DVRNs have been assayed as a whole (with the anticancer compound and the vitamin linked to them). Our approach combines perturbation theory and machine learning (PTML) to predict the probability of obtaining an interesting DVRN by changing the anticancer compound and/or the vitamin present in a DVRN that is already tested for other anticancer compounds or vitamins that have not been tested yet as part of a DVRN. In a previous work, we built a linear PTML model useful for the design of these nanosystems. In doing so, we used information fusion (IF) techniques to carry out data enrichment of DVRN data compiled from the literature with the data for preclinical assays of vitamins from the ChEMBL database. The design features of DVRNs and the assay conditions of nanoparticles (NPs) and vitamins were included as multiplicative PT operators (PTOs) to the system, which indicates the importance of these variables. However, the previous work omitted experiments with nonlinear ML techniques and different types of PTOs such as metric-based PTOs. More importantly, the previous work did not consider the structure of the anticancer drug to be included in the new DVRNs. In this work, we address three main objectives (tasks). In the first task, we developed a new model, alternative to the one published before, for the rational design of DVRNs using metric-based PTOs. 
The most accurate PTML model was the artificial neural network model, which showed values of specificity, sensitivity, and accuracy in the range of 90-95% in training and external validation series for more than 130,000 cases (DVRNs vs ChEMBL assays). Furthermore, in the second task, we used IF techniques to carry out data enrichment of our previous data set. In doing so, we constructed a new working data set of >970,000 cases with the data of preclinical assays of DVRNs, vitamins, and anticancer compounds from the ChEMBL database. All these assays have multiple continuous variables or descriptors dk and categorical variables cj (conditions of the assays) for drugs (dack, cacj), vitamins (dvk, cvj), and NPs (dnk, cnj). These data include >20,000 potential anticancer compounds with >270 protein targets (cac1), >580 assay cell organisms (cac2), and so forth. Furthermore, we include >36,000 assayed vitamin derivatives in >6200 types of cells (c2vit), >120 assay organisms (c3vit), >60 assay strains (c4vit), and so forth. The enriched data set also contains >20 types of DVRNs (c5n) with 9 NP core materials (c4n), 8 synthesis methods (c7n), and so forth. We expressed all this information with PTOs and developed a qualitatively new PTML model that incorporates information on the anticancer drugs. This new model shows 96-97% accuracy for training and external validation subsets. In the last task, we carried out a comparative study of published ML and/or PTML models and described how the models presented here fill the knowledge gap in terms of drug delivery. In conclusion, we present here for the first time a multipurpose PTML model that is able to select NPs, anticancer compounds, and vitamins and their conditions of assay for DVRN design.


Subject(s)
Antineoplastic Agents/administration & dosage , Antineoplastic Combined Chemotherapy Protocols/administration & dosage , Drug Delivery Systems/methods , Nanoparticles/chemistry , Neoplasms/drug therapy , Vitamins/administration & dosage , Big Data , Computer Simulation , Databases, Factual , Drug Liberation , Linear Models , Machine Learning
19.
Curr Top Med Chem ; 20(9): 720-730, 2020.
Article in English | MEDLINE | ID: mdl-32066360

ABSTRACT

AIMS: Computational modelling may help us to detect the more important factors governing this process in order to optimize it. BACKGROUND: The generation of hazardous organic waste in teaching and research laboratories poses a big problem that universities have to manage. METHODS: In this work, we report on the experimental measurement of waste generation in the chemical education laboratories within our department. We measured the waste generated in the teaching laboratories of the Organic Chemistry Department II (UPV/EHU) in the second semester of the 2017/2018 academic year. Likewise, to identify the anthropogenic and social factors related to the generation of waste, a questionnaire was used. We focused on all students of the Experimentation in Organic Chemistry (EOC) and Organic Chemistry II (OC2) subjects. The questionnaire helped us to assess their prior knowledge about waste, their awareness of the problem of separating organic waste, and their correct use of the containers. These results, together with the volumetric data, were analyzed with statistical analysis software. We obtained two Perturbation-Theory Machine Learning (PTML) models including chemical, operational, and academic factors. The dataset analyzed included 6050 cases of laboratory practices vs. practices of reference. RESULTS: These models predict the values of acetone waste with R² = 0.88 and non-halogenated waste with R² = 0.91. CONCLUSION: This work opens the door to the implementation of more sustainable techniques and a circular economy with the aim of improving the quality of university education processes.
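
The fit quality reported above is a coefficient of determination. As a minimal sketch of how that statistic is computed from observed and predicted values, the snippet below uses hypothetical numbers, not data from the study, and the function name `r_squared` is an illustrative choice.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: squared errors of the predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: squared deviations from the observed mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical waste volumes (arbitrary units) and model predictions.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
r2 = r_squared(observed, predicted)
```

A value close to 1 means the predictions account for most of the variance in the observed values.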


Subject(s)
Chemistry, Organic , Chemistry, Pharmaceutical , Environmental Pollutants/chemistry , Laboratories , Acetanilides/chemical synthesis , Butanes/chemical synthesis , Computer Simulation , Humans , Learning , Machine Learning , Models, Statistical , Pentanols/chemical synthesis , Software , Students , Teaching , Universities
20.
ACS Comb Sci ; 22(3): 129-141, 2020 03 09.
Article in English | MEDLINE | ID: mdl-32011854

ABSTRACT

Determining the biological activity of vitamin derivatives is needed given that organic synthesis of analogs of vitamins is an active field of interest for medicinal chemistry, pharmaceuticals, and food additives. Accordingly, scientists from different disciplines perform preclinical assays (nij) with a considerable combination of assay conditions (cj). Indeed, the ChEMBL platform contains a database that includes results from 36 220 different biological activity bioassays of 21 240 different vitamins and vitamin derivatives. These assays are heterogeneous in terms of the combinations of assay conditions cj. They are focused on >500 different biological activity parameters (c0), >340 different targets (c1), >6200 types of cell (c2), >120 organisms of assay (c3), and >60 assay strains (c4). The database includes a total of >1850 niacin assays, >1580 tretinoin assays, >1580 retinol assays, 857 ascorbic acid assays, etc. Given how difficult it is for researchers to assimilate such complex combinatorial data, we propose to build a model by combining perturbation theory (PT) and machine learning (ML). Through this study, we propose a PTML (PT + ML) combinatorial model for ChEMBL results on the biological activity of vitamins and vitamin derivatives. The linear discriminant analysis (LDA) model presented the following results for the training subset: specificity (%) = 90.38, sensitivity (%) = 87.51, and accuracy (%) = 89.89. The model showed the following results for the external validation subset: specificity (%) = 90.58, sensitivity (%) = 87.72, and accuracy (%) = 90.09. Different types of linear and nonlinear PTML models, such as logistic regression (LR), classification tree (CT), naïve Bayes (NB), and random forest (RF), were applied to compare their predictive capacity. The PTML-LDA model predicts with more accuracy by applying combinatorial descriptors. 
In addition, a PCA experiment with chemical structure descriptors allowed us to characterize the high structural diversity of the chemical space studied. In any case, PTML models using chemical structure descriptors do not improve the performance of the PTML-LDA model based on ALOGP and PSA. We can conclude that the three-variable PTML-LDA model is a simplified and adaptable tool for predicting the biological activity of vitamin derivatives across different combinations of assay conditions.
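
The specificity, sensitivity, and accuracy reported above are standard statistics of a binary classifier. As a minimal sketch of how they follow from confusion-matrix counts, the snippet below uses hypothetical counts, not the study's data, and the function name `classification_metrics` is an illustrative choice.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return sensitivity, specificity, and accuracy as percentages."""
    return {
        # Share of truly positive cases that were predicted positive.
        "sensitivity": 100.0 * tp / (tp + fn),
        # Share of truly negative cases that were predicted negative.
        "specificity": 100.0 * tn / (tn + fp),
        # Share of all cases that were classified correctly.
        "accuracy": 100.0 * (tp + tn) / (tp + tn + fp + fn),
    }


# Hypothetical counts: 100 positive and 100 negative cases.
metrics = classification_metrics(tp=80, tn=90, fp=10, fn=20)
```

Reporting all three figures is informative when the classes are imbalanced, because accuracy alone is then dominated by the larger class.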


Subject(s)
Bayes Theorem , Combinatorial Chemistry Techniques , Machine Learning , Models, Statistical , Vitamins/chemistry , Databases, Factual , Molecular Structure , Vitamins/chemical synthesis