ABSTRACT
Leishmaniasis is a neglected tropical disease with a wide variety of clinical signs, ranging from visceral to cutaneous symptoms, that results in millions of new cases and thousands of fatalities reported annually. This article provides a bibliometric analysis of the contributions of the main authors, institutions, and nations, in terms of productivity, citations, and bibliographic linkages, to the application of nanoparticles (NPs) for the treatment of leishmaniasis. The study is based on a sample of 524 Scopus documents from 1991 to 2022. The analysis was carried out with the Bibliometrix R-Tool, version 4.0, and VOSviewer, version 1.6.17. We identified crucial subjects associated with the application of NPs in the field of antileishmanial development (NPs and drug formulation for leishmaniasis treatment, animal models, and experiments). We also identified research topics that were out of date and oversaturated. In parallel, we proposed emerging subjects based on multiple analyses of the corpus of published scientific literature (title, abstract, and keywords). Finally, the technique used contributed to a broader and more specific "big picture" of nanomedicine research in antileishmanial studies for future projects.
ABSTRACT
Globally, pesticides are toxic substances with wide applications. However, the widespread use of pesticides has received increasing attention from regulatory agencies due to their various acute and chronic effects on multiple organisms. In this study, Quantitative Structure-Toxicity Relationship (QSTR) models were established using Multiple Linear Regression (MLR) and five Machine Learning (ML) algorithms to predict pesticide toxicity in Americamysis bahia. The most influential descriptors included in the MLR model are RBF, JGI2, nCbH, nRCOOR, nRSR, nPO4 and 'Cl-090', with positive contributions to the dependent variable (negative decimal logarithm of median lethal concentration at 96 h). The Random Forest (RF) regression model was superior amongst the five ML models. We observed higher values of R² (0.812) and lower values of RMSE (0.595) and MAE (0.462) in the cross-validation training set and external validation set. The model also showed a high goodness of fit and was internally robust and externally predictive compared to models presented in similar studies. The results suggest that the developed QSTR models are suitable for reliably predicting the aquatic toxicity of structurally diverse pesticides and can be used for screening, prioritising new pesticides, filling data gaps and overcoming the limitations of in vivo and in vitro tests.
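The three fit statistics quoted in this abstract have standard definitions. As a minimal sketch (the function name and the toy values are illustrative assumptions, not data from the study), they can be computed as follows:

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (R2, RMSE, MAE) for paired observed and predicted values."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return r2, rmse, mae

# Hypothetical toxicity values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
r2, rmse, mae = regression_metrics(observed, predicted)
```

A higher R² together with lower RMSE and MAE on held-out data is what the abstract reports as evidence of predictive quality.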
Subject(s)
Pesticides , Brazil , Linear Models , Nonlinear Dynamics , Pesticides/toxicity , Quantitative Structure-Activity Relationship
ABSTRACT
Osteosarcoma is the most common type of primary malignant bone tumor. Although nowadays 5-year survival rates can reach up to 60-70%, acute complications and late effects of osteosarcoma therapy are two of the limiting factors in treatments. We developed a multi-objective algorithm for the repurposing of new anti-osteosarcoma drugs, based on the modeling of molecules with described activity for HOS, MG63, SAOS2, and U2OS cell lines in the ChEMBL database. Several predictive models were obtained for each cell line and those with accuracy greater than 0.8 were integrated into a desirability function for the final multi-objective model. An exhaustive exploration of model combinations was carried out to obtain the best multi-objective model in virtual screening. For the top 1% of the screened list, the final model showed a BEDROC = 0.562, EF = 27.6, and AUC = 0.653. The repositioning was performed on 2218 molecules described in DrugBank. Within the top-ranked drugs, we found: temsirolimus, paclitaxel, sirolimus, everolimus, and cabazitaxel, which are antineoplastic drugs described in clinical trials for cancer in general. Interestingly, we found several broad-spectrum antibiotics and antiretroviral agents. This powerful model predicts several drugs that should be studied in depth to find new chemotherapy regimens and to propose new strategies for osteosarcoma treatment.
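The abstract does not state the form of the desirability function. A minimal sketch, assuming a Derringer-style geometric mean of per-cell-line scores (the function name, drug labels, and probabilities are hypothetical), shows how several single-target predictions can be merged into one ranking value:

```python
import math

def overall_desirability(scores):
    """Geometric mean of per-cell-line desirabilities, each in [0, 1]."""
    if any(s < 0.0 or s > 1.0 for s in scores):
        raise ValueError("each desirability must lie in [0, 1]")
    if any(s == 0.0 for s in scores):
        return 0.0  # one unacceptable response vetoes the candidate
    return math.exp(sum(math.log(s) for s in scores) / len(scores))

# Hypothetical predicted activity probabilities for HOS, MG63, SAOS2, U2OS.
candidates = {
    "drug_A": [0.9, 0.8, 0.9, 0.8],
    "drug_B": [0.99, 0.99, 0.99, 0.1],
}
ranking = sorted(candidates,
                 key=lambda d: overall_desirability(candidates[d]),
                 reverse=True)
```

With a geometric mean, a candidate that scores poorly on one cell line is penalised more heavily than it would be by an arithmetic mean, which is one reason this form is often chosen for multi-objective ranking.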
ABSTRACT
Breast cancer (BC) is the leading cause of cancer-related death among women and the most commonly diagnosed cancer worldwide. Although in recent years large-scale efforts have focused on identifying new therapeutic targets, a better understanding of BC molecular processes is required. Here we focused on elucidating the molecular hallmarks of BC heterogeneity and the oncogenic mutations involved in precision medicine that remain poorly defined. To fill this gap, we established an OncoOmics strategy that consists of analyzing genomic alterations, signaling pathways, protein-protein interactome network, protein expression, dependency maps in cell lines and patient-derived xenografts in 230 previously prioritized genes to reveal essential genes in breast cancer. As a result, the OncoOmics BC essential genes were rationally filtered to 140. mRNA up-regulation was the most prevalent genomic alteration. The most altered signaling pathways were associated with basal-like and Her2-enriched molecular subtypes. RAC1, AKT1, CCND1, PIK3CA, ERBB2, CDH1, MAPK14, TP53, MAPK1, SRC, RAC3, BCL2, CTNNB1, EGFR, CDK2, GRB2, MED1 and GATA3 were essential genes in at least three OncoOmics approaches. Drugs with the highest number of clinical trials in phases 3 and 4 were paclitaxel, docetaxel, trastuzumab, tamoxifen and doxorubicin. Lastly, we collected ~3,500 somatic and germline oncogenic variants associated with 50 essential genes, which in turn had therapeutic connectivity with 73 drugs. In conclusion, the OncoOmics strategy reveals essential genes capable of accelerating the development of targeted therapies for precision oncology.
Subject(s)
Biomarkers, Tumor/genetics , Breast Neoplasms/genetics , Breast Neoplasms/pathology , Gene Expression Regulation, Neoplastic , Genes, Essential , Mutation , Precision Medicine , Animals , Biomarkers, Tumor/metabolism , Breast Neoplasms/metabolism , Female , Gene Regulatory Networks , High-Throughput Nucleotide Sequencing , Humans , Mice , Prognosis , Protein Interaction Maps , Proteome , Tumor Cells, Cultured , Xenograft Model Antitumor Assays
ABSTRACT
The consensus strategy has proved to be highly efficient in the recognition of gene-disease associations. Therefore, the main objective of this study was to apply theoretical approaches to explore genes and communities directly involved in breast cancer (BC) pathogenesis. We evaluated the consensus between 8 prioritization strategies for the early recognition of pathogenic genes. A communality analysis in the protein-protein interaction (PPi) network of previously selected genes was enriched with gene ontology, metabolic pathways, as well as oncogenomics validation with the OncoPPi and DRIVE projects. The consensus genes were rationally filtered to 1842 genes. The communality analysis showed an enrichment of 14 communities especially connected with ERBB, PI3K-AKT, mTOR, FOXO, p53, HIF-1, VEGF, MAPK and prolactin signaling pathways. The genes with the highest ranking were TP53, ESR1, BRCA2, BRCA1 and ERBB2. The genes with the highest connectivity degree were TP53, AKT1, SRC, CREBBP and EP300. The connectivity degree allowed us to establish a significant correlation between the OncoPPi network and our BC integrated network, comprising 51 genes and 62 PPi. In addition, CCND1, RAD51, CDC42, YAP1 and RPA1 were functional genes with a significant sensitivity score in BC cell lines. In conclusion, the consensus strategy identifies both well-known pathogenic genes and prioritized genes that need to be further explored.
Subject(s)
Algorithms , Breast Neoplasms/metabolism , Female , Gene Expression Regulation, Neoplastic/genetics , Gene Expression Regulation, Neoplastic/physiology , Gene Regulatory Networks/genetics , Gene Regulatory Networks/physiology , Humans , Metabolic Networks and Pathways/genetics , Metabolic Networks and Pathways/physiology , Protein Binding , Signal Transduction/genetics , Signal Transduction/physiology
ABSTRACT
This study presents the impact of carbon nanotubes (CNTs) on mitochondrial oxygen mass flux (Jm) under three experimental conditions. New experimental results and a new methodology are reported for the first time and they are based on CNT Raman spectra star graph transform (spectral moments) and perturbation theory. The experimental measures of Jm showed that no tested CNT family can inhibit the oxygen consumption profiles of mitochondria. The best model for the prediction of Jm for other CNTs was provided by random forest using eight features, obtaining test R-squared (R²) of 0.863 and test root-mean-square error (RMSE) of 0.0461. The results demonstrate the capability of encoding CNT information into spectral moments of the Raman star graphs (SG) transform with a potential applicability as predictive tools in nanotechnology and material risk assessments.
ABSTRACT
The current molecular docking study provided the Free Energy of Binding (FEB) for the interaction (nanotoxicity) between VDAC mitochondrial channels of three species (VDAC1-Mus musculus, VDAC1-Homo sapiens, VDAC2-Danio rerio) with SWCNT-H, SWCNT-OH, SWCNT-COOH carbon nanotubes. The general results showed that the FEB values were statistically more negative (p < 0.05) in the following order: (SWCNT-VDAC2-Danio rerio) > (SWCNT-VDAC1-Mus musculus) > (SWCNT-VDAC1-Homo sapiens) > (ATP-VDAC). More negative FEB values for SWCNT-COOH and SWCNT-OH were found in VDAC2-Danio rerio when compared with VDAC1-Mus musculus and VDAC1-Homo sapiens (p < 0.05). In addition, a significant correlation (0.66 < r² < 0.97) was observed between the n-Hamada index and VDAC nanotoxicity (or FEB) for the zigzag topologies of SWCNT-COOH and SWCNT-OH. Predictive Nanoparticles-Quantitative-Structure Binding-Relationship models (nano-QSBR) for strong and weak SWCNT-VDAC docking interactions were developed using Perturbation Theory, regression and classification models. Thus, 405 SWCNT-VDAC interactions were predicted using a nano-PT-QSBR classification model with high accuracy, specificity, and sensitivity (73-98%) in training and validation series, and a maximum AUROC value of 0.978. In addition, the best regression model was obtained with Random Forest (R² of 0.833, RMSE of 0.0844), suggesting an excellent potential to predict SWCNT-VDAC channel nanotoxicity. All study data are available at https://doi.org/10.6084/m9.figshare.4802320.v2 .
Subject(s)
Nanotubes, Carbon/chemistry , Humans , Mitochondria/chemistry , Mitochondria/metabolism , Molecular Docking Simulation , Voltage-Dependent Anion Channel 1/chemistry , Voltage-Dependent Anion Channel 1/metabolism , Voltage-Dependent Anion Channel 2/chemistry , Voltage-Dependent Anion Channel 2/metabolism , Voltage-Dependent Anion Channels/chemistry , Voltage-Dependent Anion Channels/metabolism
ABSTRACT
In recent years, the encoding of system structure information with different network topological indices has been a very active field of research. In the present study, we assembled for the first time a complex network using data obtained from the Immune Epitope Database for fungal species, and we then considered the general topology, the node degree distribution, and the local structure of this network. We also calculated eight node centrality measures for the observed network and compared it with three theoretical models. In view of the results obtained, we may expect that the present approach can become a valuable tool to explore the complexity of this database, as well as for the storage, manipulation, comparison, and retrieval of information contained therein.
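Node degree is the simplest of the topological measures this abstract mentions. The sketch below is an illustrative assumption of how a degree distribution might be computed from an undirected edge list (the node labels are placeholders, not entries from the database):

```python
from collections import Counter, defaultdict

def degree_distribution(edges):
    """Return ({node: degree}, {degree: number of nodes}) for an undirected edge list."""
    neighbours = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        neighbours[u].add(v)
        neighbours[v].add(u)
    degree = {node: len(adj) for node, adj in neighbours.items()}
    return degree, dict(Counter(degree.values()))

# Hypothetical toy network with four nodes.
edges = [("e1", "e2"), ("e1", "e3"), ("e1", "e4"), ("e2", "e3")]
degree, distribution = degree_distribution(edges)
```

Comparing such a distribution with that of random or scale-free reference graphs is one common way to carry out the kind of comparison against theoretical models described here.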
Subject(s)
Epitopes/immunology , Fungi/immunology , Data Mining , Databases, Factual , Humans , Models, Theoretical , Neural Networks, Computer
ABSTRACT
Studies of the self-aggregation of binary systems are of both theoretical and practical importance. They provide an opportunity to investigate the influence of the molecular structure of the hydrophobe on the nonideality of mixing. On the other hand, linear free energy relationship (LFER) models, such as Hansch's equations, may be used to predict the properties of chemical compounds such as drugs or surfactants. However, the task becomes more difficult once we want to predict simultaneously the effect of perturbations on multiple output properties of binary systems under multiple input experimental boundary conditions (b(j)). As a consequence, we need computational chemistry or chemoinformatics models that may help us to predict different properties of the autoaggregation process of mixed surfactants under multiple conditions. In this work, we have developed the first model that combines perturbation theory (PT) and LFER ideas. The model uses covariance PT operators (CPTOs) as input. CPTOs are calculated as the difference between covariance ΔCov((i)µ(k)) functions before and after multiple perturbations in the binary system. In turn, covariances are calculated as the product of two Box-Jenkins operators (BJOs). BJOs are used to measure the deviation of the structure of different chemical compounds from a set of molecules measured under a given subset of experimental conditions. The best CPT-LFER model found predicted the effects of 25,000 perturbations over 9 different properties of binary systems. We also reported experimental studies of different experimental properties of the binary system formed by sodium glycodeoxycholate and didodecyldimethylammonium bromide (NaGDC-DDAB). Last, we used our CPT-LFER model to carry out a 1000 data point simulation of the properties of the NaGDC-DDAB system under different conditions not studied experimentally.
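The chain of operators described in this abstract (Box-Jenkins deviation, product of two deviations, difference across a perturbation) can be sketched in a few lines. Everything below is an illustrative assumption: the function names, the use of a plain arithmetic mean as the reference, the "after minus before" sign convention, and the toy descriptor values are not taken from the paper.

```python
def box_jenkins(value, reference_values):
    """Deviation of a descriptor from the mean of a reference set
    measured under one experimental condition."""
    return value - sum(reference_values) / len(reference_values)

def covariance_term(bjo_1, bjo_2):
    """Covariance-like term: product of the Box-Jenkins operators
    of the two components of the binary system."""
    return bjo_1 * bjo_2

def cpto(cov_before, cov_after):
    """Covariance perturbation operator (assumed: after minus before)."""
    return cov_after - cov_before

# Hypothetical descriptor values for the two surfactants.
mu_1, mu_2 = 3.0, 1.5
# Hypothetical reference sets under the initial and the perturbed condition.
ref_initial, ref_final = [1.0, 3.0], [0.5, 1.5]

cov_initial = covariance_term(box_jenkins(mu_1, ref_initial),
                              box_jenkins(mu_2, ref_initial))
cov_final = covariance_term(box_jenkins(mu_1, ref_final),
                            box_jenkins(mu_2, ref_final))
delta_cov = cpto(cov_initial, cov_final)
```

In a model of this kind, values such as `delta_cov` would serve as input variables of the linear equation, one per descriptor and condition.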
ABSTRACT
Unbalanced uptake of Omega 6/Omega 3 (ω-6/ω-3) ratios could increase chronic disease occurrences, such as inflammation, atherosclerosis, or tumor proliferation, and methylation methods for measuring the ruminal microbiome fatty acid (FA) composition/distribution play a vital role in discovering the contribution of food components to ruminant products (e.g., meat and milk) when pursuing a healthy diet. Hansch's models based on Linear Free Energy Relationships (LFERs) using physicochemical parameters, such as partition coefficients, molar refractivity, and polarizability, as input variables (Vk) are advocated. In this work, a new combined experimental and theoretical strategy was proposed to study the effect of ω-6/ω-3 ratios, FA chemical structure, and other factors over FA distribution networks in the ruminal microbiome. In step 1, experiments were carried out to measure long chain fatty acid (LCFA) profiles in the rumen microbiome (bacterial and protozoan), and volatile fatty acids (VFAs) in fermentation media. In step 2, the proportions and physicochemical parameter values of LCFAs and VFAs were calculated under different boundary conditions (cj) like c1 = acid and/or base methylation treatments, c2 = with/without fermentation, c3 = FA distribution phase (media, bacterial, or protozoan microbiome), etc. In step 3, Perturbation Theory (PT) and LFER ideas were combined to develop a PT-LFER model of a FA distribution network using physicochemical parameters (V(k)), the corresponding Box-Jenkins (ΔV(kj)) and PT operators (ΔΔV(kj)) in statistical analysis. The best PT-LFER model found predicted the effects of perturbations over the FA distribution network with sensitivity, specificity, and accuracy > 80% for 407 655 cases in training + external validation series. In step 4, alternative PT-LFER and PT-NLFER models were tested for training Linear and Non-Linear Artificial Neural Networks (ANNs). 
PT-NLFER models based on ANNs presented better performance but are more complicated than the PT-LFER model. Last, in step 5, the PT-LFER model based on LDA was used to reconstruct the complex networks of perturbations in the FA distribution and to compare the giant components of the observed and predicted networks with random Erdős-Rényi network models. In short, our new PT-LFER model is a useful tool for predicting a distribution network in terms of specific fatty acid distribution.
Subject(s)
Computer Simulation , Fatty Acids/metabolism , Animals , Bacteria/metabolism , Catalysis , Fatty Acids, Omega-3/metabolism , Fatty Acids, Volatile/analysis , Male , Methylation , Microbiota , Rumen/microbiology , Sheep
ABSTRACT
Quantitative Structure-Activity Relationship (mt-QSAR) techniques may become an important tool for the prediction of cytotoxicity and High-throughput Screening (HTS) of drugs to rationalize the drug discovery process. In this work, we train and validate for the first time an mt-QSAR model using the TOPS-MODE approach to calculate drug molecular descriptors and a Linear Discriminant Analysis (LDA) function. This model correctly classifies 8258 out of 9000 (Accuracy = 91.76%) multiplexing assay endpoints of 7903 drugs (including both training and validation series). Each endpoint corresponds to one out of 1418 assays, 36 molecular and cellular targets, 46 standard type measures, in two possible organisms (human and mouse). After that, we determined experimentally, for the first time, the values of EC50 = 21.58 µg/mL and Cytotoxicity = 23.6% for the anti-microbial/anti-parasite drug G1 over Balb/C mouse peritoneal macrophages using flow cytometry. In addition, the model predicts for G1 only 7 positive endpoints out of 1251 cytotoxicity assays (0.56% probability of cytotoxicity in multiple assays). The results obtained complement the toxicological studies of this important drug. This work adds a new tool to the existing pool of few methods useful for multi-target HTS of ChEMBL and other libraries of compounds towards drug discovery.
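The two percentages quoted in this abstract follow directly from the counts it reports. A short check, using only those counts (the helper name is an illustrative assumption):

```python
def percent(part, whole, digits=2):
    """Percentage of `part` in `whole`, rounded to `digits` decimals."""
    return round(100.0 * part / whole, digits)

accuracy = percent(8258, 9000)      # correctly classified endpoints
positive_rate = percent(7, 1251)    # positive cytotoxicity endpoints for G1
```

Both values agree with the figures stated in the text (91.76% and 0.56%).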
Subject(s)
Anti-Infective Agents/toxicity , Flow Cytometry , High-Throughput Screening Assays , Macrophages/drug effects , Animals , Anti-Infective Agents/chemistry , Cell Survival/drug effects , Cells, Cultured , Discriminant Analysis , Humans , Macrophages/cytology , Mice , Mice, Inbred BALB C , Models, Molecular , Quantitative Structure-Activity Relationship
ABSTRACT
Leishmaniasis is a growing health problem worldwide. As there are certain drawbacks with the drugs currently used to treat human leishmaniasis and resistance to these drugs is emerging, there is a need to develop novel antileishmanial compounds, among which isoquinoline alkaloids are promising candidates. In this study, 18 novel oxoisoaporphine derivatives were synthesized and their possible antileishmanial activity was evaluated. The in vitro activity of these derivatives against Leishmania amazonensis axenic amastigotes was first evaluated, and the selected compounds were then tested in an inhibition assay with promastigotes of L. infantum, L. braziliensis, L. amazonensis and L. guyanensis, and with intracellular amastigotes of L. infantum and L. amazonensis. Finally, the most active compounds, OXO 1 (2,3-dihydro-7H-dibenzo[de,h]quinolin-7-one) and OXO 13 (2,3,8,9,10,11-hexahydro-7H-dibenzo[de,h]quinolin-7-one), were tested in BALB/c mice infected with L. infantum. Treatment of mice at a dose of 10 mg/kg with OXO 1 yielded significant reductions (p < 0.05) in parasite burden in liver and spleen (99% and 78%, respectively), whereas the reductions with OXO 13 were not significant. Although previous reports suggest that this family of molecules displays inhibitory activity against monoamine oxidase A and acetylcholinesterase, these enzymes were not confirmed as targets for antileishmanial activity on the basis of the present results. However, after development of a new bioinformatics model to analyze the Leishmania proteome, we were able to identify other putative targets for these molecules. The most promising candidates were four proteins: two putative pteridine reductase 2 (1MXF and 1MXH), one N-myristoyltransferase (2WUU) and one type I topoisomerase (2B9S).
Subject(s)
Alkaloids/pharmacology , Aporphines/pharmacology , Leishmania/drug effects , Leishmaniasis/drug therapy , Acyltransferases/metabolism , Animals , DNA Topoisomerases, Type I/metabolism , Isoquinolines/pharmacology , Mice , Mice, Inbred BALB C
ABSTRACT
Several pathogenic parasite species show different susceptibilities to different antiparasite drugs. Unfortunately, almost all structure-based methods are one-task or one-target Quantitative Structure-Activity Relationships (ot-QSAR) that predict the biological activity of drugs against only one parasite species. Consequently, multi-tasking learning to predict drug activity against different species with a single model (mt-QSAR) is vitally important. In the two previous works of the present series we reported two single mt-QSAR models to predict the antimicrobial activity against different fungal (Bioorg. Med. Chem. 2006, 14, 5973-5980) or bacterial species (Bioorg. Med. Chem. 2007, 15, 897-902). These mt-QSARs offer a good opportunity (impractical with ot-QSAR) to construct drug-drug similarity Complex Networks and to map the contribution of sub-structures to function for multiple species. These possibilities were not addressed in our previous works. In the present work, we continue this series toward another important direction of chemotherapy (antiparasite drugs) with the development of an mt-QSAR for more than 500 drugs tested in the literature against different parasites. The data were processed by Linear Discriminant Analysis (LDA), classifying drugs as active or non-active against the different tested parasite species. The model correctly classifies 212 out of 244 (87.0%) cases in the training series and 207 out of 243 compounds (85.4%) in the external validation series. To illustrate the performance of the QSAR for the selection of active drugs, we carried out an additional virtual screening of antiparasite compounds not used in the training or predicting series; the model recognized 97 out of 114 (85.1%) of them. We also give the procedures to construct back-projection maps and to calculate the contribution of sub-structures to the biological activity. 
Finally, we used the outputs of the QSAR to construct, for the first time, a multi-species Complex Network of antiparasite drugs. The predicted network has 380 nodes (compounds) and 634 edges (pairs of compounds with similar activity). This network allows us to cluster different compounds and identify on average three known compounds similar to a new query compound according to their profile of biological activity. This is the first attempt to calculate probabilities of antiparasitic action of drugs against different parasites.
Subject(s)
Antiprotozoal Agents/chemistry , Antiprotozoal Agents/therapeutic use , Computer Simulation , Drug Design , Models, Chemical , Quantitative Structure-Activity Relationship , Animals , Databases, Factual , Drug Resistance , Leishmania donovani/drug effects , Leishmania mexicana/drug effects , Markov Chains , Plasmodium falciparum/drug effects , Predictive Value of Tests , Species Specificity , Systems Integration , Trypanosoma brucei brucei/drug effects
ABSTRACT
The MARCH-INSIDE methodology, combined with a statistical classification method, linear discriminant analysis (LDA), is proposed as an alternative to the Draize eye irritation test. This methodology has been successfully applied to a set of 46 neutral organic chemicals, which have been defined as ocular irritant or nonirritant. The model correctly categorizes 37 out of 46 compounds, showing an accuracy of 80.46%. Specifically, this model demonstrates a good categorization average of 91.67 and 76.47% for irritant and nonirritant compounds, respectively. Validation of the model was carried out using two cross-validation tools, leave-one-out (LOO) and leave-group-out (LGO), showing a global predictability of the model of 71.7 and 70%, respectively. The average coincidence of the predictions between the leave-one-out/leave-group-out studies and the training set was 91.3% (42 out of 46 cases)/89.1% (41 out of 46 cases), proving the robustness of the model obtained. An ocular irritancy distribution diagram is constructed in order to determine the intervals of the property where the probability of finding an irritant compound is maximal relative to the chance of finding a false nonirritant one. To date, the present model appears to be the first predictive linear discriminant equation able to discriminate between eye irritant and nonirritant chemicals.
Subject(s)
Eye Injuries/chemically induced , Irritants/classification , Markov Chains , Models, Biological , Organic Chemicals/classification , Algorithms , Animal Testing Alternatives/methods , Animals , False Positive Reactions , Humans , Irritants/chemistry , Irritants/toxicity , Organic Chemicals/toxicity , Probability , Quantitative Structure-Activity Relationship , ROC Curve , Toxicity Tests, Acute
ABSTRACT
Most present mathematical models for biological activity consider just the molecular structure. In the present article we aim to extend the use of Markov chain models to define novel molecular descriptors, which also consider other parameters such as target site or biological effect. Specifically, this mathematical model takes into consideration not only the molecular structure but also the specific biological system the drug affects. Herein, a general Markov model is developed that describes 19 different drug side effects grouped in eight affected biological systems for 178 drugs, giving 270 cases in total. The data were processed by linear discriminant analysis (LDA), classifying drugs according to their specific side effects; forward stepwise selection was the strategy for variable selection. The average percentage of good classification and number of compounds used in the training/predicting sets were 100/95.8% for endocrine manifestations, (18 out of 18)/(13 out of 14); 90.5/92.3% for gastrointestinal manifestations, (38 out of 42)/(30 out of 32); 88.5/86.5% for systemic phenomena, (23 out of 26)/(17 out of 20); 81.8/77.3% for neurological manifestations, (27 out of 33)/(19 out of 25); 81.6/86.2% for dermal manifestations, (31 out of 38)/(25 out of 29); 78.4/85.1% for cardiovascular manifestation, (29 out of 37)/(24 out of 28); 77.1/75.7% for breathing manifestations, (27 out of 35)/(20 out of 26) and 75.6/75% for psychiatric manifestations, (31 out of 41)/(23 out of 31). Additionally, a back-projection analysis (BPA) was carried out for two ulcerogenic drugs to demonstrate in structural terms the physical interpretation of the models obtained. This article develops, for the first time, a mathematical model that encompasses a large number of drug side effects grouped in specific biological systems using stochastic absolute probabilities of interaction ((A)pi(k)(j)).
Subject(s)
Drug-Related Side Effects and Adverse Reactions , Markov Chains , Models, Biological , Humans , Models, Molecular , Pharmaceutical Preparations/chemistry , Piroxicam/adverse effects , Piroxicam/chemistry , Pyridines/adverse effects , Pyridines/chemistry , Quantitative Structure-Activity Relationship , Thermodynamics
ABSTRACT
The development of 2D graph-theoretic representations for DNA sequences has been very important for the qualitative and quantitative comparison of sequences. Calculation of numeric features for these representations is useful for DNA-QSAR studies. Most graph-theoretic representations identify each of the four bases with a unit step along one axis direction in 2D space. In the case of proteins, twenty amino acids instead of four bases have to be considered. This fact has limited the introduction of useful 2D Cartesian representations and the corresponding sequence descriptors to encode protein sequence information. In this study, we overcome this problem by grouping amino acids into four groups: acid, basic, polar and non-polar amino acids. The identification of each group with one of the four axis directions determines a novel 2D representation and numeric descriptors for protein sequences. Afterwards, a Markov model has been used to calculate new numeric descriptors of the protein sequence. These descriptors are called herein the sequence 2D coupling numbers (ζ(k)). In this work, we calculated the ζ(k) values for 108 sequences of different polygalacturonases (PGs) and for 100 sequences of other proteins. A Linear Discriminant Analysis model derived here (PG = 5.36·ζ1 − 3.98·ζ3 − 42.21) successfully discriminates between PGs and other proteins. The model correctly classified 100% of a subset of 81 PG and 75 non-PG protein sequences used to train the model. The model also correctly classified 51 out of 52 (98.07%) of the protein sequences used as external validation series. The use of different groups of amino acids and/or axis orientations gives different results, so it is suggested that these be explored for other databases. Finally, to illustrate the use of the model we report the isolation and prediction of the PG action for a novel sequence (AY908988) isolated by our group from Psidium guajava L. 
This prediction coincides very well with sequence alignment results found by the BLAST methodology. These findings illustrate the possibilities of the sequence descriptors derived for this novel 2D sequence representation in proteins sequence QSAR studies.
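A minimal sketch of the four-group 2D walk and of the discriminant function quoted in this abstract is given below. The assignment of individual residues to the four groups, the mapping of groups to axis directions, and the example inputs are illustrative assumptions; the abstract does not specify them, nor the Markov formula that yields the coupling numbers, so the sketch takes them as given inputs.

```python
# Assumed residue grouping (textbook classes; the paper's grouping may differ).
GROUP_OF = {}
for residues, group in (("DE", "acid"), ("KRH", "basic"),
                        ("STNQCY", "polar"), ("AVLIMFWPG", "nonpolar")):
    for aa in residues:
        GROUP_OF[aa] = group

# Assumed mapping of each group to one of the four axis directions.
STEP = {"acid": (0, -1), "basic": (0, 1), "polar": (-1, 0), "nonpolar": (1, 0)}

def walk_2d(sequence):
    """Return the list of 2D lattice points visited by the sequence."""
    x, y = 0, 0
    path = [(x, y)]
    for aa in sequence:
        dx, dy = STEP[GROUP_OF[aa]]
        x, y = x + dx, y + dy
        path.append((x, y))
    return path

def pg_score(zeta1, zeta3):
    """Discriminant function as quoted in the abstract."""
    return 5.36 * zeta1 - 3.98 * zeta3 - 42.21

path = walk_2d("AKDS")           # hypothetical four-residue fragment
score = pg_score(10.0, 2.0)      # hypothetical coupling numbers
```

As the abstract notes, a different grouping or axis orientation changes the walk and therefore the descriptors, so any such choice should be stated alongside the model.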
Subject(s)
Algorithms , Plant Proteins/genetics , Psidium/genetics , Sequence Analysis, DNA , Software , Amino Acid Sequence , Image Processing, Computer-Assisted , Molecular Sequence Data , Psidium/enzymology
ABSTRACT
A Markov model-based QSAR is introduced for the rational selection of anticancer compounds. The model correctly discriminates 90.3% of 226 structurally heterogeneous anticancer/non-anticancer compounds in the training series. An external validation series of 85 compounds, not used to fit the model, was used to validate it; 91.8% of these compounds were correctly classified. The model developed is afterwards used in a simulation of a virtual search for anticancer compounds never considered either in the training or in the predicting series. Of the 213 anticancer compounds used in this simulated search, 87.7% were correctly classified. The model also shows high values for specificity (0.89), sensitivity (0.91), and Matthews correlation coefficient (0.79). In addition, the present model performs similarly to or better than four other models reported elsewhere when 26 comparison parameters are taken into consideration. Finally, we exemplify the use of the model in practice with the design of a new series of carbanucleosides. The compounds evaluated with the model were synthesized and experimentally assayed for their antitumor effects on the proliferation of murine leukemia cells (L1210/0) and human T-lymphocyte cells (CEM/0 and Molt4/C8). The most interesting activity was detected for compound 5a, with a predicted probability of 80.2% and IC50 = 27.0, 27.2, and 29.4 µM, respectively, against the above-mentioned cell lines. These values are comparable to those for the control compound Ara-A.
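Sensitivity, specificity, and the Matthews correlation coefficient are all derived from the same confusion matrix. The sketch below uses the standard definitions; the counts are hypothetical and are not the ones behind the values reported in this abstract.

```python
import math

def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, MCC) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # fraction of actives recognised
    specificity = tn / (tn + fp)   # fraction of inactives recognised
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc

# Hypothetical counts, for illustration only.
sens, spec, mcc = classification_metrics(tp=90, fn=10, tn=80, fp=20)
```

Unlike plain accuracy, the Matthews coefficient stays informative when the active and inactive classes differ in size, which is why it is often reported alongside sensitivity and specificity.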
Subject(s)
Antineoplastic Agents/chemical synthesis , Antineoplastic Agents/pharmacology , Computational Biology , Drug Evaluation, Preclinical/methods , Entropy , Nucleosides/chemistry , Nucleosides/pharmacology , Purines/chemistry , Animals , Antineoplastic Agents/chemistry , Antineoplastic Agents/toxicity , Cell Line, Tumor , Humans , Mice , Molecular Structure , Nucleosides/chemical synthesis , Nucleosides/toxicity , Quantitative Structure-Activity Relationship , Stochastic Processes
ABSTRACT
Most of present molecular descriptors consider just the molecular structure. In the present article we pretend extending the use of Markov chain (MC) models to define novel molecular descriptors, which consider in addition other parameters like target site or toxic effect. Specifically, this molecular descriptor takes into consideration not only the molecular structure but the specific system the drug affects too. Herein, it is developed a general Markov model that describes 21 different drugs side effects grouped in 10 affected biological systems for 193 drugs, being 311 cases finally. The data were processed by linear discriminant analysis (LDA) classifying drugs according to their specific side effects, forward stepwise was fixed as strategy for variables selection. The average percentage of good classification and number of compounds used in the training/predicting sets were 92.6/91.7% for cardiovascular manifestation (25 out of 27)/(18 out of 20); 89.3/83.9% for dermal manifestations (25 out of 18)/(18 out of 21); 88.9/88.9% for endocrine manifestations (16 out of 18)/(12 out of 14); 88.9/88.2% for psychiatric manifestations (32 out of 36)/(24 out of 27); 88.5/85.6% for systemic phenomena (23 out of 26)/(17 out of 20); 85.7/91.7% for gastrointestinal manifestations (36 out of 42)/(29 out of 32); 83.3/79.2% for metabolic manifestations (15 out of 18)/(11 out of 14); 81.8/78.0% for neurological manifestations (27 out of 33)/(20 out of 25); 75.0/74.0% for hematological manifestations (36 out of 48)/(27 out of 36) and 74.3/72.8% for breathing manifestations (26 out of 35)/(19 out of 26). Finally, application of back-projection analysis (BPA) provides physic interpretation in structural terms through molecular graphics of the toxic effects predicted with these QSTR models. 
For the first time, this article develops a mathematical model that encompasses a large number of drug side effects grouped by specific biological system, using stochastic entropies of interaction (Θk(j)).
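As a rough illustration of what an entropy derived from a Markov chain over a molecular graph can look like, the sketch below builds a row-stochastic matrix from a bonded-atom adjacency matrix and reports the Shannon entropy of the probability distribution after each step. It is a generic sketch and not the article's definition of the stochastic entropies of interaction: the function name `markov_entropies`, the weighting scheme (transition probability proportional to the weight of the neighbouring atom) and the example weights are all assumptions made for illustration.

```python
import numpy as np

def markov_entropies(adjacency, weights, k_max=3):
    """Shannon entropies of the 0..k_max step distributions of a Markov chain.

    Hypothetical scheme: an atom moves "interaction" to itself or to a bonded
    neighbour with probability proportional to that atom's weight.
    """
    w = np.asarray(weights, dtype=float)
    # Add self-loops so every row has at least one allowed transition.
    allowed = np.asarray(adjacency, dtype=float) + np.eye(len(w))
    # Scale column j by the weight of atom j, then normalise each row to 1.
    raw = allowed * w
    P = raw / raw.sum(axis=1, keepdims=True)
    # Assumed initial distribution: proportional to the atom weights.
    p = w / w.sum()
    entropies = []
    for _ in range(k_max + 1):
        nonzero = p[p > 0]
        entropies.append(float(-(nonzero * np.log(nonzero)).sum()))
        p = p @ P  # advance the chain by one step
    return entropies

# Three-atom chain (1-2-3) with made-up, equal atom weights.
example = markov_entropies([[0, 1, 0], [1, 0, 1], [0, 1, 0]], [1.0, 1.0, 1.0])
```

Each entry of `example` could then serve as one descriptor value for step k; in the article's setting a separate weighting per biological system would presumably yield system-specific descriptors, but that mapping is not described in the abstract.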
Subject(s)
Drug Interactions , Entropy , Markov Chains , Models, Chemical , Molecular Structure , Stochastic Processes
ABSTRACT
Protein 3D-QSAR is an emerging field of bioorganic chemistry. However, the large size of the structures to be handled may become a bottleneck when scaling up classic QSAR problems to proteins. A truncation approach, as used in molecular dynamics, could therefore be applied to keep calculations tractable. Spherical truncation of the electrostatic field with different functions cuts off long-range interactions at a given cutoff distance (r(off)), leaving only short-range ones. Consequently, a Markov chain (MC) model may approximate the average electrostatic potentials of the spatial distribution of charges within the protein backbone. These average electrostatic potentials can be used to predict protein properties. Herein, we explore the effect of abrupt, shifting, force shifting, and switching truncation functions on 3D-QSAR models classifying 26 proteins with different functions: lysozymes, dihydrofolate reductases, and alcohol dehydrogenases. Almost all methods showed overall accuracies higher than 73%. This result points to an acceptable robustness of the MC model across different truncation schemes and r(off) values. The best accuracy, 92% with abrupt truncation, coincides with our recent communication. We also developed models with the same accuracy for other truncation functions; however, these functions are more complex. Principal component analysis (PCA) of 152 non-homologous proteins showed five main eigenvalues, which explain more than 87% of the variance of the studied properties. The present molecular descriptors may encode structural information not fully accounted for by previous ones, so they could be expected to succeed where classic descriptors fail. The present result confirms the utility of our Markov models, combined with a truncation approach, for generating protein structure molecular descriptors for QSAR.
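The four truncation schemes named in the abstract can be sketched as scaling factors applied to a Coulomb-like term. The expressions below are the forms commonly used in molecular dynamics packages; the abstract does not state the exact expressions the authors used, so treat these as assumptions. The function names and the inner switching radius `r_on` are also introduced here for illustration only.

```python
import numpy as np

def abrupt(r, r_off):
    """Keep the interaction unchanged up to r_off, then drop it to zero."""
    r = np.asarray(r, dtype=float)
    return np.where(r <= r_off, 1.0, 0.0)

def shift(r, r_off):
    """Potential-shift factor: decays smoothly to zero at r_off."""
    r = np.asarray(r, dtype=float)
    return np.where(r <= r_off, (1.0 - (r / r_off) ** 2) ** 2, 0.0)

def force_shift(r, r_off):
    """Force-shift factor: (1 - r/r_off)^2 inside the cutoff."""
    r = np.asarray(r, dtype=float)
    return np.where(r <= r_off, (1.0 - r / r_off) ** 2, 0.0)

def switch(r, r_on, r_off):
    """Switching factor: 1 below r_on, smooth decay to 0 between r_on and r_off."""
    r = np.asarray(r, dtype=float)
    middle = ((r_off**2 - r**2) ** 2
              * (r_off**2 + 2.0 * r**2 - 3.0 * r_on**2)
              / (r_off**2 - r_on**2) ** 3)
    return np.where(r <= r_on, 1.0, np.where(r <= r_off, middle, 0.0))

def truncated_potential(charge, r, factor):
    """Coulomb-like term charge / r scaled by a truncation factor (arbitrary units)."""
    return charge / np.asarray(r, dtype=float) * factor

# Example with made-up values: one unit charge seen at several distances.
distances = np.array([2.0, 6.0, 9.0, 12.0])
potentials = truncated_potential(1.0, distances, shift(distances, r_off=10.0))
```

The abrupt scheme is the simplest of the four, which is consistent with the abstract's remark that the alternatives are more complex functions.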
Subject(s)
Markov Chains , Proteins/chemistry , Static Electricity , Quantitative Structure-Activity Relationship
ABSTRACT
Carcinogenic activity was investigated using the topological substructural molecular design (TOPS-MODE) approach. A discriminant model was developed to classify compounds as carcinogenic or noncarcinogenic on a data set of 189 compounds. The percentage of correct classification was 76.32%. The predictive power of the model was validated by three tests: an external test set (compounds not used in the development of the model, with 72.97% good classification), a leave-group-out cross-validation procedure (4-fold full cross-validation, removing 20% of compounds in each cycle, with a good prediction of 76.31%) and two external prediction sets (the first and second exercises of the National Toxicology Program). This methodology showed that hydrophobicity increases carcinogenic activity and that the dipole moment of the molecule decreases it, suggesting that TOPS-MODE descriptors can estimate this property for new drug candidates. Finally, the positive and negative fragment contributions to carcinogenic activity (structural alerts) were identified, and their potential in the lead generation process and in the design of 'safer' chemicals was evaluated.
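The leave-group-out procedure mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' protocol: the groups are drawn at random in each cycle, the classifier is a nearest-centroid stand-in rather than a discriminant model on TOPS-MODE descriptors, and the data are synthetic. All function names are hypothetical.

```python
import random

def leave_group_out(n_items, n_cycles=4, fraction=0.2, seed=0):
    """Yield (train, test) index lists; each cycle holds out a random fraction."""
    rng = random.Random(seed)
    n_out = max(1, round(n_items * fraction))
    indices = list(range(n_items))
    for _ in range(n_cycles):
        held_out = set(rng.sample(indices, n_out))
        train = [i for i in indices if i not in held_out]
        yield train, sorted(held_out)

def percent_correct(y_true, y_pred):
    """Percentage of predictions that match the true labels."""
    hits = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * hits / len(y_true)

def fit_centroids(X, y):
    """Stand-in model: mean of a single descriptor value per class."""
    return {label: sum(x for x, t in zip(X, y) if t == label) / y.count(label)
            for label in set(y)}

def predict_centroid(means, x):
    """Assign x to the class whose mean is closest."""
    return min(means, key=lambda label: abs(x - means[label]))

def cross_validate(X, y, fit, predict, **split_kwargs):
    """Average percentage of correct classification over all cycles."""
    scores = []
    for train, test in leave_group_out(len(y), **split_kwargs):
        model = fit([X[i] for i in train], [y[i] for i in train])
        preds = [predict(model, X[i]) for i in test]
        scores.append(percent_correct([y[i] for i in test], preds))
    return sum(scores) / len(scores)

# Synthetic, well-separated one-descriptor data (not from the article).
X = [0.1 * i for i in range(10)] + [5.0 + 0.1 * i for i in range(10)]
y = ["noncarcinogenic"] * 10 + ["carcinogenic"] * 10
average_score = cross_validate(X, y, fit_centroids, predict_centroid)
```

Averaging the per-cycle percentages is one way to arrive at a single cross-validated figure of the kind quoted in the abstract; whether the authors averaged or pooled predictions is not stated.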