Results 1 - 5 of 5
1.
J Chem Inf Model ; 62(16): 3928-3940, 2022 08 22.
Article in English | MEDLINE | ID: mdl-35946598

ABSTRACT

In this work, the SOFT.PTML tool has been used to pre-process a ChEMBL dataset of pre-clinical assays of antileishmanial compound candidates. A comparative study of different ML algorithms, such as logistic regression (LOGR), support vector machine (SVM), and random forests (RF), has shown that the IFPTML-LOGR model presents excellent values of specificity and sensitivity (81-98%) in training and validation series. The use of this software has been illustrated with a practical case study focused on a series of 28 derivatives of 2-acylpyrroles 5a,b, obtained through a Pd(II)-catalyzed C-H radical acylation of pyrroles. Their in vitro leishmanicidal activity against visceral (L. donovani) and cutaneous (L. amazonensis) leishmaniasis was evaluated, finding that compounds 5bc (IC50 = 30.87 µM, SI > 10.17) and 5bd (IC50 = 16.87 µM, SI > 10.67) were approximately 6-fold more selective than the reference drug (miltefosine) in in vitro assays against L. amazonensis promastigotes. In addition, most of the compounds showed low cytotoxicity, CC50 > 100 µg/mL in J774 cells. Interestingly, the IFPTML-LOGR model correctly predicts the relative biological activity of this series of acylpyrroles. A computational high-throughput screening (cHTS) study of 2-acylpyrroles 5a,b has been performed, calculating >20,700 activity scores vs. a large space of 647 assays involving multiple Leishmania species, cell lines, and potential target proteins. Overall, the study demonstrates that the SOFT.PTML all-in-one strategy is useful to obtain IFPTML models in a user-friendly interface, making the work easier and faster than before. The present work also points to 2-acylpyrroles as new lead compounds worthy of further optimization as antileishmanial hits.


Subject(s)
Antiprotozoal Agents , Leishmania , Antiprotozoal Agents/pharmacology , Cell Line
2.
Int J Mol Sci ; 22(23), 2021 Dec 02.
Article in English | MEDLINE | ID: mdl-34884870

ABSTRACT

Parasite species of the genus Plasmodium cause malaria, which remains a major global health problem due to parasite resistance to available antimalarial drugs and increasing treatment costs. Consequently, computational prediction of new antimalarial compounds with novel targets in the proteome of Plasmodium sp. is a very important goal for the pharmaceutical industry. We can expect that the success of a pre-clinical assay depends on the conditions of the assay per se, the chemical structure of the drug, the structure of the target protein, as well as on factors governing the expression of this protein in the proteome, such as gene (deoxyribonucleic acid, DNA) sequence and/or chromosome structure. However, there are no reports of computational models that consider all these factors simultaneously. Some of the difficulties for this kind of analysis are the dispersion of data across different datasets, the high heterogeneity of data, etc. In this work, we analyzed three databases, ChEMBL (Chemical database of the European Molecular Biology Laboratory), UniProt (Universal Protein Resource), and NCBI-GDV (National Center for Biotechnology Information-Genome Data Viewer), to achieve this goal. The ChEMBL dataset contains outcomes for 17,758 unique assays of potential antimalarial compounds, including numeric descriptors (variables) for the structure of compounds as well as a huge amount of information about the conditions of assays. The NCBI-GDV and UniProt datasets include the sequences of genes, proteins, and their functions. In addition, we also created two partitions (cassayj = caj and cdataj = cdj) of categorical variables from the ChEMBL dataset. These partitions contain variables that encode information about experimental conditions of preclinical assays (caj) or about the nature and quality of data (cdj). These categorical variables include information about 22 parameters of biological activity (ca0), 28 target proteins (ca1), and 9 organisms of assay (ca2), etc. We also created another partition (cprotj = cpj) including categorical variables with biological information about the target proteins, genes, and chromosomes. These variables cover 32 genes (cp0), 10 chromosomes (cp1), gene orientation (cp2), and 31 protein functions (cp3). We used a Perturbation-Theory Machine Learning Information Fusion (IFPTML) algorithm to fuse all this information (from three databases) and train a predictive model. Shannon's entropy measure Shk (numerical variables) was used to quantify the information about the structure of drugs, protein sequences, gene sequences, and chromosomes on the same information scale. Perturbation Theory Operators (PTOs) with the form of Moving Average (MA) operators have been used to quantify perturbations (deviations) in the structural variables with respect to their expected values for different subsets (partitions) of categorical variables. We obtained three IFPTML models using General Discriminant Analysis (GDA), Classification Tree with Univariate Splits (CTUS), and Classification Tree with Linear Combinations (CTLC). The IFPTML-CTLC model presented the best performance, with Sensitivity Sn(%) = 83.6/85.1 and Specificity Sp(%) = 89.8/89.7 for training/validation sets, respectively. This model could become a useful tool for the optimization of preclinical assays of new antimalarial compounds vs. different proteins in the proteome of Plasmodium.
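
The Moving Average (MA) operator mentioned in this abstract can be illustrated with a minimal sketch: each structural variable is replaced by its deviation from the mean of the subset sharing the same categorical label. This is an assumption about the general form of such operators, not the authors' implementation; the function name `ma_deviation`, the descriptor values, and the organism labels below are all hypothetical.

```python
from collections import defaultdict

def ma_deviation(values, categories):
    """Return each value minus the mean of its category subset.

    Illustrative moving-average deviation; an assumed form of the
    operator, not the code used in the cited work.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, category in zip(values, categories):
        sums[category] += value
        counts[category] += 1
    means = {c: sums[c] / counts[c] for c in sums}
    return [v - means[c] for v, c in zip(values, categories)]

# Hypothetical descriptor values and assay-organism labels
descriptor = [2.0, 4.0, 10.0, 14.0]
organism = ["P. falciparum", "P. falciparum", "P. berghei", "P. berghei"]
deviations = ma_deviation(descriptor, organism)
print(deviations)
```

With these toy numbers, each compound's descriptor is expressed relative to the average for its own assay organism, so values from different assay conditions become comparable on a single scale.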


Subject(s)
Antimalarials/pharmacology , Drug Discovery/methods , Machine Learning , Plasmodium falciparum/genetics , Algorithms , Antimalarials/chemistry , Databases, Pharmaceutical , Drug Evaluation, Preclinical , Genome, Protozoan , Markov Chains , Models, Theoretical , Protozoan Proteins/chemistry , Protozoan Proteins/genetics , Protozoan Proteins/metabolism , Reproducibility of Results
3.
Nanoscale ; 13(2): 1318-1330, 2021 Jan 21.
Article in English | MEDLINE | ID: mdl-33410431

ABSTRACT

Nanoparticles are useful antimicrobial drug-release systems, but some nanoparticles also exhibit antibacterial activity. However, investigation of their antibacterial activity is a difficult and slow process due to the numerous combinations of nanoparticle size, shape, and composition vs. biological tests, assay organisms, and multiple activity parameters to be measured. Additionally, the overuse of antibiotics has led to the emergence of resistant bacterial strains with different metabolic networks. Computational models may speed up this process, but the models reported to date do not consider all of these factors, and the data sources are dispersed and not curated. Thus, herein, we used an information fusion, perturbation-theory machine learning (IFPTML) approach, which is introduced by us for the first time, to fit a model for the discovery of antibacterial nanoparticles. The dataset studied had 15 classes of nanoparticles (1-100 nm), with most cases in the range of 1-50 nm, vs. >20 pathogenic bacteria species with different metabolic networks. The nanoparticles studied included metal nanoparticles of Au, Ag, and Cu; oxide nanoparticles of Zn, Cu, La, Al, Fe, Sn, Ti, Cd, and Si; and metal salt nanoparticles of CuI and CdS. We used the SOFT.PTML software (our own application) with a user-friendly interface for the IFPTML calculations and a control statistics package. Using SOFT.PTML, we found a linear logistic regression equation that could model 4 biological activity parameters using only 8 variables with χ² = 2265.75, p-level < 0.05, sensitivity, Sn = 79.4, and specificity, Sp = 99.3, for 3213 cases (nanoparticle-bacteria pairs) in the training series. The model had Sn = 80.8 and Sp = 99.3 for 2114 cases in the external validation series. We also developed a random forest non-linear model with higher values of Sn and Sp = 98-99% in the training/validation series, although it was more complicated to use. SOFT.PTML has been demonstrated to be a useful tool for the analysis of complex data in nanotechnology. We also introduced a new anabolism-catabolism unbalance index of metabolic networks to reveal the biological connotation of the IFPTML predictions for antibacterial nanoparticles. These new models open a new door for the discovery of NPs vs. new bacterial species and strains with different topological structures of their metabolic networks.
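
The sensitivity (Sn) and specificity (Sp) figures quoted in these abstracts follow the standard confusion-matrix definitions, Sn = TP / (TP + FN) and Sp = TN / (TN + FP). The sketch below computes both as percentages; the labels are toy data, not values from any of the cited studies, and the function name is illustrative.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (Sn, Sp) in percent for binary labels (1 = active, 0 = inactive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # actives predicted active
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # actives missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # inactives predicted inactive
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # inactives flagged active
    sn = 100.0 * tp / (tp + fn)
    sp = 100.0 * tn / (tn + fp)
    return sn, sp

# Toy labels only: 4 active and 6 inactive cases
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sn, sp = sensitivity_specificity(y_true, y_pred)
print(f"Sn = {sn:.1f}%, Sp = {sp:.1f}%")
```

Reporting the pair separately for training and external validation series, as the abstracts do, shows whether a model recovers the active class as well as it rejects the inactive one.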


Subject(s)
Metal Nanoparticles , Nanoparticles , Anti-Bacterial Agents/pharmacology , Drug Liberation , Machine Learning , Metabolic Networks and Pathways , Microbial Sensitivity Tests
4.
Curr Top Med Chem ; 20(25): 2326-2337, 2020.
Article in English | MEDLINE | ID: mdl-32938352

ABSTRACT

By combining Machine Learning (ML) methods with Perturbation Theory (PT), it is possible to develop predictive models for a variety of response targets. Such a combination, often known as Perturbation Theory Machine Learning (PTML) modeling, comprises a set of techniques that can handle various physical and chemical properties of different organisms and complex biological or material systems under multiple input conditions. In so doing, these techniques effectively integrate a manifold of diverse chemical and biological data into a single computational framework that can then be applied to screen lead chemicals as well as to find clues for improving the targeted response(s). PTML models have thus been extremely helpful in drug or material design efforts and have been found to be predictive and applicable across a broad space of systems. After a brief outline of the applied methodology, this work reviews the different uses of PTML in Medicinal Chemistry, as well as in other applications. Finally, we cover the development of software available nowadays for setting up PTML models from large datasets.


Subject(s)
Databases, Chemical , Machine Learning , Software , Chemistry, Pharmaceutical , Models, Molecular
5.
J Proteome Res ; 17(3): 1258-1268, 2018 03 02.
Article in English | MEDLINE | ID: mdl-29336158

ABSTRACT

The spatial distribution of genes in chromosomes seems not to be random. For instance, only 10% of genes are transcribed from bidirectional promoters in humans, and many more are organized into larger clusters. This raises intriguing questions previously asked by different authors. We would like to add a few more questions in this context, related to gene orientation inversions. Does gene orientation (inversion) follow a random pattern? Is it relevant to biological activity somehow? We define a new kind of network coined as the gene orientation inversion network (GOIN). GOIN's complex network encodes short- and long-range patterns of inversion of the orientation of pairs of genes in the chromosome. We selected Plasmodium falciparum as a case study due to the high relevance of this parasite to public health (causal agent of malaria). We constructed here for the first time all of the GOINs for the genome of this parasite. These networks have an average of 383 nodes (genes in one chromosome) and 1314 links (pairs of genes with inverse orientation). We calculated node centralities and other parameters of these networks. These numerical parameters were used to study different properties of gene inversion patterns, for example, distribution, local communities, similarity to Erdős-Rényi random networks, randomness, and so on. We find clues that seem to indicate that gene orientation inversion does not follow a random pattern. We noted that some gene communities in the GOINs tend to group genes encoding RIFIN-related proteins in the proteome of the parasite. RIFIN-like proteins are a second family of clonally variant proteins expressed on the surface of red cells infected with Plasmodium falciparum. Consequently, we used these centralities as inputs to machine learning (ML) models to predict the RIFIN-like activity of 5365 proteins in the proteome of Plasmodium sp. The best linear ML model found discriminates RIFIN-like from other proteins with sensitivity and specificity of 70-80% in training and external validation series. All of these results may point to a possible biological relevance of gene orientation inversion not directly dependent on genetic sequence information. This work opens the gate to the use of GOINs as a tool for the study of the structure of chromosomes and the study of protein function in proteome research.
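
As a rough illustration of the network idea in this abstract, the sketch below links pairs of genes on one chromosome when their orientations are opposite and then counts each node's degree, one simple centrality. The abstract does not give the exact linking rule, so the `window` cutoff, the '+'/'-' strand encoding, and the function name are assumptions made for illustration only; this is not the authors' GOIN construction.

```python
def build_inversion_network(orientations, window):
    """Link genes i < j when their orientations differ and j - i <= window.

    The window cutoff is an assumed rule for this toy example,
    not the definition used in the cited work.
    """
    n = len(orientations)
    edges = []
    for i in range(n):
        for j in range(i + 1, min(n, i + window + 1)):
            if orientations[i] != orientations[j]:
                edges.append((i, j))
    return edges

def degree_centrality(n_nodes, edges):
    """Count the number of links attached to each node."""
    degree = [0] * n_nodes
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    return degree

# Hypothetical strand orientations of five consecutive genes
orientations = ["+", "+", "-", "+", "-"]
edges = build_inversion_network(orientations, window=2)
degree = degree_centrality(len(orientations), edges)
print(edges)
print(degree)
```

Per-gene values such as these degrees are the kind of numerical network parameters that could then be fed to a classifier, in the spirit of the centrality-based ML models the abstract describes.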


Subject(s)
Chromosomes/chemistry , Gene Regulatory Networks , Genes, Protozoan , Membrane Proteins/genetics , Plasmodium falciparum/genetics , Proteome/genetics , Protozoan Proteins/genetics , Sequence Inversion , Erythrocytes/parasitology , Gene Expression Regulation , Humans , Machine Learning , Membrane Proteins/metabolism , Multigene Family , Plasmodium falciparum/metabolism , Protein Isoforms/genetics , Protein Isoforms/metabolism , Proteome/metabolism , Protozoan Proteins/metabolism , Software