Results 1 - 20 of 110
1.
Chem Res Toxicol ; 37(6): 825-826, 2024 Jun 17.
Article in English | MEDLINE | ID: mdl-38769907
2.
J Cheminform ; 16(1): 39, 2024 Apr 04.
Article in English | MEDLINE | ID: mdl-38576047

ABSTRACT

Stakeholders of machine learning models desire explainable artificial intelligence (XAI) to produce human-understandable and consistent interpretations. In computational toxicity, augmentation of text-based molecular representations has been used successfully for transfer learning on downstream tasks. Augmentations of molecular representations can also be used at inference to compare differences between multiple representations of the same ground truth. In this study, we investigate the robustness of eight XAI methods using test-time augmentation for a molecular-representation model in the field of computational toxicity prediction. We report significant differences between explanations for different representations of the same ground truth, and show that randomized models have similar variance. We hypothesize that text-based molecular representations in this and past research reflect tokenization more than learned parameters. Furthermore, we see a greater variance between in-domain predictions than out-of-domain predictions, indicating that XAI measures something other than learned parameters. Finally, we investigate the relative importance given to expert-derived structural alerts and find similar importance given regardless of applicability domain, randomization and varying training procedures. We therefore caution future research against validating methods through a similar comparison to human intuition without further investigation. SCIENTIFIC CONTRIBUTION: In this research we critically investigate XAI through test-time augmentation, contrasting previous assumptions about using expert validation and showing inconsistencies within models for identical representations. SMILES augmentation has been used to increase model accuracy, but was here adapted from the field of image test-time augmentation to be used as an independent indication of the consistency within SMILES-based molecular representation models.
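
The consistency check described in this abstract can be illustrated with a minimal sketch. It assumes that attribution scores from several SMILES variants of one molecule have already been mapped back to a shared atom order. The function name, the input layout and the example scores are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def attribution_spread(per_variant_scores):
    """Summarize per-atom attribution scores across SMILES variants.

    per_variant_scores: one list per augmented SMILES string; each inner
    list holds one attribution score per atom, in the same atom order
    (an assumption of this sketch).
    Returns (per-atom means, per-atom population standard deviations).
    """
    if not per_variant_scores:
        raise ValueError("need at least one variant")
    n_atoms = len(per_variant_scores[0])
    if any(len(scores) != n_atoms for scores in per_variant_scores):
        raise ValueError("all variants must cover the same atoms")
    columns = list(zip(*per_variant_scores))  # regroup scores atom by atom
    means = [mean(col) for col in columns]
    spreads = [pstdev(col) for col in columns]
    return means, spreads


# Hypothetical scores for a two-atom molecule under two SMILES variants.
means, spreads = attribution_spread([[0.2, 0.8], [0.4, 0.8]])
```

A large spread for an atom would mean that its explanation depends on how the molecule was written rather than on the molecule itself.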

3.
SLAS Discov ; 29(2): 100144, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38316342

ABSTRACT

The EUOS/SLAS challenge aimed to facilitate the development of reliable algorithms to predict the aqueous solubility of small molecules using experimental data from 100 K compounds. In total, one hundred teams took part in the challenge to predict low, medium and highly soluble compounds as measured by the nephelometry assay. This article describes the winning model, which was developed using the publicly available Online CHEmical database and Modeling environment (OCHEM) available on the website https://ochem.eu/article/27. We describe in detail the assumptions and steps used to select the methods, descriptors and strategy that contributed to the winning solution. In particular, we show that a consensus based on 28 models calculated using descriptor-based and representation learning methods allowed us to obtain the best score, which was higher than those based on individual approaches or on consensus models developed using each individual approach. A combination of diverse models allowed us to decrease both the bias and the variance of individual models and to achieve the highest score. The model based on Transformer CNN contributed the best individual score, thus highlighting the power of Natural Language Processing (NLP) methods. Including information about aleatoric uncertainty would help contestants to better understand and use the challenge data.


Subjects
Algorithms, Computer Neural Networks, Solubility, Consensus, Chemical Compound Databases
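
The consensus idea in the abstract above can be sketched as an average of class probabilities over several models. This is only one common way to build a consensus and is assumed here for illustration; the function name, the label names and the probabilities are hypothetical and do not reproduce the winning OCHEM model.

```python
def consensus_class(prob_vectors, labels=("low", "medium", "high")):
    """Average class probabilities from several models and pick the top class.

    prob_vectors: one probability vector per model, all in the same class
    order as `labels` (an assumption of this sketch).
    Returns (consensus label, averaged probability vector).
    """
    if not prob_vectors:
        raise ValueError("need at least one model")
    n_models = len(prob_vectors)
    # Average each class column across the models.
    averaged = [sum(col) / n_models for col in zip(*prob_vectors)]
    best = max(range(len(averaged)), key=averaged.__getitem__)
    return labels[best], averaged


# Three hypothetical models disagree; the consensus favours "medium".
label, averaged = consensus_class([
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.6, 0.3],
])
```

Averaging dampens the errors of any single model, which is the bias and variance reduction that the abstract attributes to combining diverse models.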
4.
J Chem Inf Model ; 64(1): 42-56, 2024 Jan 08.
Article in English | MEDLINE | ID: mdl-38116926

ABSTRACT

Machine Learning (ML) techniques face significant challenges when predicting advanced chemical properties, such as yield, feasibility of chemical synthesis, and optimal reaction conditions. These challenges stem from the high-dimensional nature of the prediction task and the myriad essential variables involved, ranging from reactants and reagents to catalysts, temperature, and purification processes. Successfully developing a reliable predictive model not only holds the potential for optimizing high-throughput experiments but can also elevate existing retrosynthetic predictive approaches and bolster a plethora of applications within the field. In this review, we systematically evaluate the efficacy of current ML methodologies in chemoinformatics, shedding light on their milestones and inherent limitations. Additionally, a detailed examination of a representative case study provides insights into the prevailing issues related to data availability and transferability in the discipline.


Subjects
Cheminformatics, Machine Learning
5.
Pharmacol Rev ; 75(6): 1167-1199, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37684054

ABSTRACT

The prokineticins (PKs) were discovered approximately 20 years ago as small peptides inducing gut contractility. Today, they are established as angiogenic, anorectic, and proinflammatory cytokines, chemokines, hormones, and neuropeptides involved in a variety of physiologic and pathophysiological pathways. Their altered expression or mutations, implicated in several diseases, make them potential biomarkers. Their G-protein coupled receptors, PKR1 and PKR2, have divergent roles and can be therapeutic targets for the treatment of cardiovascular, metabolic, and neural diseases as well as pain and cancer. This article reviews and summarizes our current knowledge of PK family functions, from development of the heart and brain to regulation of homeostasis in health and disease. Finally, the review summarizes the established roles of the endogenous peptides, synthetic peptides and selective ligands of PKR1 and PKR2, and of nonpeptide orthosteric and allosteric modulators of the receptors in preclinical disease models. The present review emphasizes the ambiguous aspects and gaps in our knowledge of the functions of PKR ligands and elucidates future perspectives for PK research. SIGNIFICANCE STATEMENT: This review provides an in-depth view of the prokineticin family and of PK receptors that can be active without their endogenous ligand and exhibit "constitutive" activity in diseases. Their nonpeptide ligands display promising effects in several preclinical disease models. PKs can be diagnostic biomarkers of several diseases. A thorough understanding of the role of the prokineticin family and their receptor types in health and disease is critical for developing novel therapeutic strategies that take safety concerns into account.


Subjects
Neoplasms, Neuropeptides, Humans, G-Protein-Coupled Receptors/metabolism, Neuropeptides/metabolism, Peptides, Neoplasms/drug therapy, Biomarkers
7.
Expert Opin Drug Discov ; 18(8): 821-833, 2023.
Article in English | MEDLINE | ID: mdl-37424369

ABSTRACT

INTRODUCTION: Collaborative computing has attracted great interest because it makes it possible to join the efforts of researchers worldwide. Its relevance further increased during the pandemic crisis since it allows scientific collaborations to be strengthened while avoiding physical interactions. Thus, the E4C consortium presents the MEDIATE initiative, which invited researchers to contribute their virtual screening simulations, which will be combined with AI-based consensus approaches to provide robust and method-independent predictions. The best compounds will be tested, and the biological results will be shared with the scientific community. AREAS COVERED: In this paper, the MEDIATE initiative is described. It shares compound libraries and protein structures prepared for standardized virtual screenings. Preliminary analyses are also reported; they provide encouraging results that emphasize the MEDIATE initiative's capacity to identify active compounds. EXPERT OPINION: Structure-based virtual screening is well-suited for collaborative projects provided that the participating researchers work on the same input files. Until now, such a strategy was rarely pursued and most initiatives in the field were organized as challenges. The MEDIATE platform is focused on SARS-CoV-2 targets but can be seen as a prototype that can be used to perform collaborative virtual screening campaigns in any therapeutic field by sharing the appropriate input files.


Subjects
COVID-19, SARS-CoV-2, Humans, Molecular Docking Simulation, Proteins, Antiviral Agents
9.
Antibiotics (Basel) ; 11(4)2022 Apr 06.
Article in English | MEDLINE | ID: mdl-35453241

ABSTRACT

A previously developed model for predicting the antibacterial activity of ionic liquids against a resistant A. baumannii strain was used to assess the activity of phosphonium ionic liquids. Their antioxidant potential was additionally evaluated with newly developed models, which were based on public data. The accuracy of the models was rigorously evaluated using cross-validation as well as test set prediction. Six alkyl triphenylphosphonium and alkyl tributylphosphonium bromides with C8, C10, and C12 alkyl chain lengths were synthesized and tested in vitro. Experimental studies confirmed their activity against A. baumannii and showed pronounced antioxidant properties. These results suggest that phosphonium ionic liquids could be promising lead structures against A. baumannii.

10.
Int J Mol Sci ; 23(3)2022 Jan 21.
Article in English | MEDLINE | ID: mdl-35163123

ABSTRACT

The development of new functional materials based on porphyrins requires fast and accurate prediction of their spectral properties. The models available in the literature for the absorption wavelength and extinction coefficient of the Soret band have low accuracy for this class of compounds. We collected spectral data for porphyrins to extend the literature set and compared the performance of global and local models built with different machine learning methods. Interestingly, extending the public database yielded models with lower accuracy than the models that we built using porphyrins only. The latter model gave an acceptable RMSE of 2.61 for prediction of the absorption band of 335 porphyrins synthesized in our laboratory, but had low accuracy (RMSE = 0.52) for the extinction coefficient. Developing models using only compounds from our laboratory significantly decreased errors for these compounds (RMSE = 0.5 and 0.042 for absorption band and extinction coefficient, respectively), but limited their applicability to these homologous series. When developing models, one should keep their potential use clearly in mind and select a strategy that provides the most accurate predictions for the target application. The models and data are publicly available.


Subjects
Computer Simulation, Porphyrins/chemistry, Spectrophotometry/methods, Molecular Models, Molecular Structure
11.
Mol Inform ; 41(3): e2100151, 2022 03.
Article in English | MEDLINE | ID: mdl-34676998

ABSTRACT

AlphaScreen is one of the most widely used assay technologies in drug discovery due to its versatility, dynamic range and sensitivity. However, the presence of false positives and frequent hitters makes measured HTS data difficult to interpret. Although filters do exist to identify frequent hitters for AlphaScreen, they are frequently based on privileged scaffolds. The development of such filters is time consuming and requires deep domain knowledge. Recently, machine learning and artificial intelligence methods have been emerging as important tools to advance drug discovery and chemoinformatics, including their application to the identification of frequent hitters in screening assays. However, the relative performance and complementarity of machine learning and scaffold-based techniques have not yet been comprehensively compared. In this study, we compared filters based on privileged scaffolds with filters built using machine learning. Our results demonstrate that machine-learning methods provide more accurate filters for identification of frequent hitters in AlphaScreen assays than scaffold-based methods and can be easily redeveloped once new data are measured. We present highly accurate models to identify frequent hitters in AlphaScreen assays.


Subjects
High-Throughput Screening Assays, Small Molecule Libraries, Artificial Intelligence, Biological Assay, Drug Discovery/methods, High-Throughput Screening Assays/methods
12.
Spectrochim Acta A Mol Biomol Spectrosc ; 267(Pt 2): 120577, 2022 Feb 15.
Article in English | MEDLINE | ID: mdl-34776377

ABSTRACT

We investigated whether the absorption maximum wavelength of BODIPYs can be predicted accurately. We found that previously reported models had low accuracy (40-57 nm) for BODIPYs due to the limited dataset sizes and/or number of BODIPYs (a few hundred). The new models developed in this study were based on data for 6000-plus fluorescent dyes (including 4000-plus BODIPYs) and a deep neural network architecture. High prediction accuracy (five-fold cross-validation root mean squared error (RMSE) of 18.4 nm) was obtained using a consensus model, which was more accurate than the individual models. This model provided excellent accuracy (RMSE of 8 nm) for molecules previously synthesized in our laboratory as well as for prospective validation of three new BODIPYs. We found that solvent properties did not significantly influence the model accuracy since only a few BODIPYs exhibited solvatochromism. The analysis of large prediction errors suggested that compounds able to have intermolecular interactions with solvent or salts were likely to be incorrectly predicted. The consensus model is freely available at https://ochem.eu/article/134921 and can help other researchers to accelerate the design of new dyes with desired properties.


Subjects
Boron Compounds, Fluorescent Dyes, X-Ray Crystallography, Computer Neural Networks
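
The RMSE values quoted in the abstract above follow the standard definition, the square root of the mean squared difference between observed and predicted values. A minimal sketch with hypothetical wavelengths (not data from the paper) could look like this:

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equally long sequences."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical absorption maxima in nm; errors are 3, 4 and 0 nm.
observed_nm = [500.0, 520.0, 540.0]
predicted_nm = [503.0, 516.0, 540.0]
error_nm = rmse(observed_nm, predicted_nm)  # sqrt(25 / 3) nm
```

Because the errors are squared before averaging, a few badly predicted compounds weigh heavily on the final value.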
13.
Bioorg Chem ; 114: 105042, 2021 09.
Article in English | MEDLINE | ID: mdl-34120024

ABSTRACT

Methicillin-resistant S. aureus (MRSA) is one of the most concerning multidrug-resistant bacteria due to its role in life-threatening infections. There is an urgent need to develop new antibiotics against MRSA. In this study, we first compiled a data set of 2,3-diaminoquinoxalines by chemical synthesis and antibacterial screening against S. aureus, and then performed cheminformatics modeling and virtual screening. The compound with the Specs ID AG-205/33156020 was discovered as a new antibacterial agent, and was further identified as a Gyrase B (GyrB) inhibitor. In light of the common features, we hypothesized that compound 6c, as a representative of the 2,3-diaminoquinoxalines, also inhibited GyrB, and eventually proved it. Via molecular docking and molecular dynamics simulations, we identified the binding modes of AG-205/33156020 and 6c to the ATPase domain of GyrB. Importantly, these GyrB inhibitors inhibited the MRSA strains and showed selectivity with respect to HepG2 and HUVEC cells. Taken together, this research work provides an effective ligand-based computational workflow for scaffold hopping in anti-MRSA drug discovery, and discovers two new GyrB inhibitors that are worthy of further development.


Subjects
Anti-Bacterial Agents/pharmacology, Methicillin-Resistant Staphylococcus aureus/drug effects, Quinoxalines/pharmacology, Anti-Bacterial Agents/chemical synthesis, Anti-Bacterial Agents/metabolism, Anti-Bacterial Agents/toxicity, DNA Gyrase/metabolism, Preclinical Drug Evaluation, Hep G2 Cells, Human Umbilical Vein Endothelial Cells, Humans, Ligands, Microbial Sensitivity Tests, Molecular Docking Simulation, Molecular Dynamics Simulation, Protein Binding, Quinoxalines/chemical synthesis, Quinoxalines/metabolism, Quinoxalines/toxicity, Topoisomerase II Inhibitors/chemical synthesis, Topoisomerase II Inhibitors/metabolism, Topoisomerase II Inhibitors/pharmacology, Topoisomerase II Inhibitors/toxicity
15.
Int J Mol Sci ; 22(2)2021 Jan 08.
Article in English | MEDLINE | ID: mdl-33429999

ABSTRACT

The Online Chemical Modeling Environment (OCHEM) was used for QSAR analysis of a set of ionic liquids (ILs) tested against multi-drug resistant (MDR) clinical isolates of Acinetobacter baumannii and Staphylococcus aureus strains. The regression models reached a coefficient of determination of q² = 0.66-0.79 under cross-validation and on independent test sets. The models were used to screen a virtual chemical library of ILs, which was designed with targeted activity against MDR Acinetobacter baumannii and Staphylococcus aureus strains. The seven most promising ILs were selected, synthesized, and tested. Three ILs showed high activity against both of these MDR clinical isolates.


Subjects
Acinetobacter baumannii/drug effects, Bacterial Infections/drug therapy, Imidazoles/chemistry, Pyridines/chemistry, Acinetobacter baumannii/pathogenicity, Bacterial Infections/microbiology, Multiple Drug Resistance, Humans, Imidazoles/chemical synthesis, Ionic Liquids/chemical synthesis, Ionic Liquids/chemistry, Pyridines/chemical synthesis, Staphylococcus aureus/drug effects, Staphylococcus aureus/pathogenicity, Structure-Activity Relationship
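
The abstract above reports a coefficient of determination without stating the formula. A common convention, assumed here, is one minus the ratio of the squared prediction errors to the squared deviations of the observations from their mean, computed on held-out predictions. The sketch below uses hypothetical activity values.

```python
def q_squared(observed, predicted):
    """Coefficient of determination for held-out predictions.

    Computes 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2).
    Assumption of this sketch: `predicted` holds values for compounds
    that were not used to fit the model (cross-validation or test set).
    """
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    total = sum((o - mean_observed) ** 2 for o in observed)
    if total == 0:
        raise ValueError("observed values must not all be identical")
    return 1.0 - residual / total


# Hypothetical values: residual sum 0.5, total sum 5.0, so the score is 0.9.
score = q_squared([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 3.0, 3.5])
```

A score of 1 means perfect predictions, while 0 means the model does no better than always predicting the mean of the observations.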
16.
Chem Res Toxicol ; 34(2): 541-549, 2021 02 15.
Article in English | MEDLINE | ID: mdl-33513003

ABSTRACT

Selecting a model in predictive toxicology often involves a trade-off between prediction performance and explainability: should we sacrifice model performance to gain explainability, or vice versa? Here we present a comprehensive study to assess algorithm and feature influences on model performance in chemical toxicity research. We built over 5000 models for a Tox21 bioassay data set of 65 assays and ∼7600 compounds. Seven molecular representations as features and 12 modeling approaches varying in complexity and explainability were employed to systematically investigate the impact of various factors on model performance and explainability. We demonstrated that end points dictated a model's performance, regardless of the chosen modeling approach, including deep learning, and of the chemical features. Overall, more complex models such as (LS-)SVM and Random Forest performed marginally better than simpler models such as linear regression and KNN in the presented Tox21 data analysis. Since a simpler model with acceptable performance is often also easy to interpret for the Tox21 data set, it clearly was the preferred choice due to its better explainability. Given that each data set had its own error structure for both dependent and independent variables, we strongly recommend conducting a systematic study with a broad range of model complexity and feature explainability to identify a model that balances predictivity and explainability.


Subjects
Chemical and Drug Induced Liver Injury, Machine Learning, Pharmaceutical Preparations/chemistry, Factual Databases, Humans, Molecular Models, Quantitative Structure-Activity Relationship
17.
J Cheminform ; 12(1): 74, 2020 Dec 18.
Article in English | MEDLINE | ID: mdl-33339533

ABSTRACT

The increasing volume of biomedical data in chemistry and life sciences requires the development of new methods and approaches for their analysis. Artificial Intelligence and machine learning, especially neural networks, are increasingly used in the chemical industry, in particular with respect to Big Data. This editorial highlights the main results presented during the special session of the International Conference on Neural Networks organized by the "Big Data in Chemistry" project and draws perspectives on the future progress of the field.

18.
Nat Commun ; 11(1): 5575, 2020 11 04.
Article in English | MEDLINE | ID: mdl-33149154

ABSTRACT

We investigated the effect of different training scenarios on predicting the (retro)synthesis of chemical compounds using a text-like representation of chemical reactions (SMILES) and the Natural Language Processing (NLP) neural network Transformer architecture. We showed that data augmentation, which is a powerful method used in image processing, eliminated the effect of data memorization by neural networks and improved their performance for prediction of new sequences. This effect was observed when augmentation was used simultaneously for the input and the target data. The top-5 accuracy was 84.8% for the prediction of the largest fragment (thus identifying the principal transformation for classical retro-synthesis) for the USPTO-50k test dataset, and was achieved by a combination of SMILES augmentation and a beam search algorithm. The same approach provided significantly better results for the prediction of direct reactions from the single-step USPTO-MIT test set. Our model achieved 90.6% top-1 and 96.1% top-5 accuracy for its challenging mixed set and 97% top-5 accuracy for the USPTO-MIT separated set. It also significantly improved results for USPTO-full set single-step retrosynthesis for both top-1 and top-10 accuracies. The appearance frequency of the most abundantly generated SMILES was well correlated with the prediction outcome and can be used as a measure of the quality of reaction prediction.
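
The top-1, top-5 and top-10 figures in this abstract are top-k accuracies: the fraction of test cases whose reference answer appears among the k highest-ranked predictions. The following minimal sketch assumes that predictions and references are already canonicalized SMILES strings so that plain string comparison is valid; the function name and the toy data are hypothetical.

```python
def top_k_accuracy(ranked_predictions, targets, k):
    """Fraction of targets found among the first k ranked predictions.

    ranked_predictions: one list per test case, best prediction first.
    targets: the reference answer for each test case.
    Assumption of this sketch: all strings are canonicalized beforehand.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not targets or len(ranked_predictions) != len(targets):
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1
        for ranked, target in zip(ranked_predictions, targets)
        if target in ranked[:k]
    )
    return hits / len(targets)


# Toy example with three test cases.
ranked = [["CCO", "CCN", "CCC"], ["c1ccccc1", "CC"], ["CN", "CO"]]
targets = ["CCN", "CC", "CCl"]
top1 = top_k_accuracy(ranked, targets, k=1)  # no reference is ranked first
top2 = top_k_accuracy(ranked, targets, k=2)  # two of three are in the top 2
```

Accuracy can only stay the same or grow as k increases, which is why the top-5 figures in the abstract exceed the top-1 figures.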

19.
Chem Soc Rev ; 49(11): 3525-3564, 2020 06 07.
Article in English | MEDLINE | ID: mdl-32356548

ABSTRACT

Prediction of chemical bioactivity and physical properties has been one of the most important applications of statistical and, more recently, machine learning and artificial intelligence methods in chemical sciences. This field of research, broadly known as quantitative structure-activity relationships (QSAR) modeling, has developed many important algorithms and has found a broad range of applications in physical organic and medicinal chemistry in the past 55+ years. This Perspective summarizes recent technological advances in QSAR modeling and also highlights the applicability of algorithms, modeling methods, and validation practices developed in QSAR to a wide range of research areas outside of traditional QSAR boundaries, including synthesis planning, nanotechnology, materials science, biomaterials, and clinical informatics. As modern research methods generate rapidly increasing amounts of data, the knowledge of robust data-driven modelling methods professed within the QSAR field can become essential for scientists working both within and outside of chemical research. We hope that this contribution highlighting the generalizable components of QSAR modeling will serve to address this challenge.


Subjects
Pharmaceutical Chemistry/methods, Drug-Related Side Effects and Adverse Reactions/metabolism, Pharmaceutical Preparations/chemistry, Algorithms, Animals, Artificial Intelligence, Factual Databases, Drug Design, 20th Century History, 21st Century History, Humans, Molecular Models, Quantitative Structure-Activity Relationship, Quantum Theory, Reproducibility of Results
...