1.
Front Pharmacol ; 13: 832120, 2022.
Article in English | MEDLINE | ID: mdl-35359835

ABSTRACT

In drug discovery, molecules are optimized towards desired properties. In this context, machine learning is used for extrapolation in drug discovery projects. The limits of extrapolation for regression models are known. However, a systematic analysis of the effectiveness of extrapolation in drug discovery has not yet been performed. In response, this study examined the capabilities of six machine learning algorithms to extrapolate from 243 datasets. The response values calculated from the molecules in the datasets were molecular weight, cLogP, and the number of sp3-atoms. Three experimental setups were chosen for the response values. Shuffled data were used for interpolation, whereas data for extrapolation were sorted from high to low values, and the reverse. Extrapolation with sorted data resulted in much larger prediction errors than interpolation with shuffled data. Additionally, this study demonstrated that linear machine learning methods are preferable for extrapolation.
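
The contrast between a shuffled split (interpolation) and a split sorted by the response value (extrapolation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy data, the 80/20 split, and the two stand-in models (a least-squares line and a 1-nearest-neighbour regressor). It reproduces neither the six algorithms nor the 243 datasets of the study.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b with a single descriptor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return lambda x: a * x + (my - a * mx)

def fit_1nn(xs, ys):
    """1-nearest-neighbour regressor, used here as a simple non-linear method."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def mean_abs_error(model, rows):
    return sum(abs(model(x) - y) for x, y in rows) / len(rows)

def split(data, sort_by_response, train_fraction=0.8, seed=0):
    """Sorted by response: train on low values, test on high values (extrapolation).
    Shuffled: test values lie inside the training range (interpolation)."""
    if sort_by_response:
        rows = sorted(data, key=lambda r: r[1])
    else:
        rows = random.Random(seed).sample(data, len(data))
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

def evaluate(data, fit, sort_by_response):
    train, test = split(data, sort_by_response)
    model = fit([x for x, _ in train], [y for _, y in train])
    return mean_abs_error(model, test)

# Hypothetical toy data: the response is an exact linear function of one descriptor.
data = [(float(x), 2.0 * x + 1.0) for x in range(100)]

for name, fit in (("linear", fit_line), ("1-NN", fit_1nn)):
    print(name,
          "interpolation:", evaluate(data, fit, sort_by_response=False),
          "extrapolation:", evaluate(data, fit, sort_by_response=True))
```

On this toy problem the nearest-neighbour model cannot predict values outside the range it was trained on, whereas the linear model can. That is only a simplified picture of the behaviour the abstract reports.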

2.
J Cheminform ; 11(1): 53, 2019 Aug 07.
Article in English | MEDLINE | ID: mdl-31392432

ABSTRACT

The Platinum dataset of protein-bound ligand conformations was used to benchmark the ability of the MMFF94s force field to generate bioactive conformations by minimization of randomly generated conformers. Torsion angle parameters that generally caused wrong geometries were reparameterized by conducting dihedral scans using ab initio calculations at the MP2 level. This reparameterization resulted in a systematic improvement of generated conformations.

3.
Sci Rep ; 9(1): 967, 2019 01 30.
Article in English | MEDLINE | ID: mdl-30700728

ABSTRACT

Molecular complexity is an important characteristic of organic molecules for drug discovery. How to calculate molecular complexity has been discussed in the scientific literature for decades. It was known from early on that the numbers of substructures that can be cut out of a molecular graph are of importance for this task. However, it was never realized that the cut-out substructures show self-similarity to the parent structures. A successive removal of one bond and one atom returns a series of fragments with decreasing size. Such a series shows self-similarity similar to fractal objects. Here we used the number of distinct fragments to calculate the fractal dimension of the molecule. The fractal dimension of a molecule is a new matter constant that incorporates all features that are currently known to be important for describing molecular complexity. Furthermore, this is the first work that reveals the fractal nature of organic molecules.

4.
Chimia (Aarau) ; 71(10): 667-677, 2017 Oct 25.
Article in English | MEDLINE | ID: mdl-29070412

ABSTRACT

In this case study on an essential instrument of modern drug discovery, we summarize our successful efforts in the last four years toward enhancing the Actelion screening compound collection. A key organizational step was the establishment of the Compound Library Committee (CLC) in September 2013. This cross-functional team consisting of computational scientists, medicinal chemists and a biologist was endowed with a significant annual budget for regular new compound purchases. Based on an initial library analysis performed in 2013, the CLC developed a New Library Strategy. The established continuous library turnover mode and the screening library size of 300,000 compounds were maintained, while the structural library quality was increased. This was achieved by shifting the selection criteria from 'druglike' to 'leadlike' structures, enriching for non-flat structures, aiming for compound novelty, and increasing the ratio of higher-cost 'Premium Compounds'. Novel chemical space was gained by adding natural compounds, macrocycles, designed and focused libraries to the collection, and through mutual exchanges of proprietary compounds with agrochemical companies. A comparative analysis in 2016 provided evidence for the positive impact of these measures. Screening the improved library has provided several highly promising hits, including a macrocyclic compound, that are currently followed up in different Hit-to-Lead and Lead Optimization programs. It is important to state that the goal of the CLC was not to achieve higher HTS hit rates, but to increase the chances of identified hits to serve as the basis of successful early drug discovery programs. The experience gathered so far legitimizes the New Library Strategy.


Subjects
Drug Discovery, Preclinical Drug Evaluation, Algorithms, High-Throughput Screening Assays, Small Molecule Libraries
5.
Pac Symp Biocomput ; 22: 312-323, 2017.
Article in English | MEDLINE | ID: mdl-27896985

ABSTRACT

A new computational method is presented to extract disease patterns from heterogeneous and text-based data. For this study, 22 million PubMed records were mined for co-occurrences of gene name synonyms and disease MeSH terms. The resulting publication counts were transferred into a matrix M_data. In this matrix, a disease was represented by a row and a gene by a column. Each field in the matrix represented the publication count for a co-occurring disease-gene pair. A second matrix with identical dimensions, M_relevance, was derived from M_data. To create M_relevance, the values from M_data were normalized. The normalized values were multiplied by the column-wise calculated Gini coefficient. This multiplication resulted in a relevance estimator for every gene in relation to a disease. From M_relevance, the similarities between all row vectors were calculated. The resulting similarity matrix S_relevance related 5,000 diseases by the relevance estimators calculated for 15,000 genes. Three diseases were analyzed in detail for the validation of the disease patterns and the relevant genes. Cytoscape was used to visualize and to analyze M_relevance and S_relevance together with the genes and diseases. Summarizing the results, it can be stated that the relevance estimator introduced here was able to detect valid disease patterns and to identify genes that encoded key proteins and potential targets for drug discovery projects.
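
The pipeline described above (count matrix, normalization, column-wise Gini weighting, row similarities) can be sketched as follows. The abstract does not say how the counts were normalized, on which matrix the Gini coefficient was computed, or which similarity measure was used. The row-sum normalization, the Gini coefficient taken over the normalized columns, and the cosine similarity below are therefore assumptions, and the tiny count matrix is invented.

```python
import math

def gini(values):
    """Gini coefficient of non-negative values (0 = evenly spread, near 1 = concentrated)."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if total == 0:
        return 0.0
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

def relevance_matrix(m_data):
    """Rows are diseases, columns are genes, entries are publication counts."""
    # Assumption: each disease row is normalized by its total publication count.
    normalized = []
    for row in m_data:
        row_sum = sum(row)
        normalized.append([v / row_sum if row_sum else 0.0 for v in row])
    # Assumption: one Gini coefficient per gene column of the normalized matrix.
    n_genes = len(m_data[0])
    gene_gini = [gini([row[j] for row in normalized]) for j in range(n_genes)]
    return [[v * gene_gini[j] for j, v in enumerate(row)] for row in normalized]

def cosine(u, v):
    """Assumed similarity measure between two disease row vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Invented counts: gene 0 is mentioned with every disease, gene 1 only with disease 0.
m_data = [
    [10, 10],
    [10, 0],
    [10, 0],
]
m_relevance = relevance_matrix(m_data)
s_relevance = [[cosine(u, v) for v in m_relevance] for u in m_relevance]
```

In this toy matrix the gene that co-occurs with only one disease receives a higher Gini weight than the gene mentioned with every disease. This matches the intuition that a gene reported with everything carries little disease-specific information.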


Subjects
Computational Biology/methods, Disease/genetics, Drug Discovery, Algorithms, Data Mining/methods, Type 2 Diabetes Mellitus/drug therapy, Type 2 Diabetes Mellitus/genetics, Drug Therapy, Gene Regulatory Networks, Humans, Melanoma/drug therapy, Melanoma/genetics, PubMed, Skin Neoplasms/drug therapy, Skin Neoplasms/genetics, Vitiligo/drug therapy, Vitiligo/genetics
6.
J Chem Inf Model ; 55(2): 460-73, 2015 Feb 23.
Article in English | MEDLINE | ID: mdl-25558886

ABSTRACT

Drug discovery projects in the pharmaceutical industry accumulate thousands of chemical structures and tens of thousands of data points from a dozen or more biological and pharmacological assays. A sufficient interpretation of the data requires understanding which molecular families are present, which structural motifs correlate with measured properties, and which tiny structural changes cause large property changes. Data visualization and analysis software with sufficient chemical intelligence to support chemists in this task is rare. In an attempt to contribute to filling the gap, we released our in-house developed, chemistry-aware data analysis program DataWarrior for free public use. This paper gives an overview of DataWarrior's functionality and architecture. As an example, a new unsupervised, 2-dimensional scaling algorithm is presented, which employs vector-based or nonvector-based descriptors to visualize the chemical or pharmacophore space of even large data sets. DataWarrior uses this method to interactively explore chemical space, activity landscapes, and activity cliffs.


Subjects
Drug Discovery/methods, Software, Algorithms, Artificial Intelligence, Combinatorial Chemistry Techniques, Data Display, Data Mining, Chemical Databases, Drug Industry/methods, Molecular Models, Molecular Conformation, Programming Languages, Structure-Activity Relationship, Support Vector Machine
7.
J Chem Inf Model ; 52(2): 380-90, 2012 Feb 27.
Article in English | MEDLINE | ID: mdl-22251316

ABSTRACT

A new subpharmacophore-based virtual screening method is introduced. Subpharmacophores are derived from large active molecules to detect small bioactive molecules as seeds for starting points in medicinal chemistry programs. A large data set was assembled from the ChEMBL database to check the validity of this approach. Molecules for 133 targets with molecular weights between 450 and 850 were selected as queries. For the query molecules, the pharmacophore descriptors were calculated. Up to 56 000 subpharmacophore descriptors with five to seven pharmacophore points were derived from the query pharmacophores. The subpharmacophore descriptors were used as queries to screen 1079 test data sets, containing decoys and spike molecules. A maximum upper molecular weight limit of 400 Da was set for the test molecules. Three different chemical fingerprint descriptors were used for comparison purposes. The subpharmacophore approach detected active molecules for 85 out of 133 targets and outperformed the chemical fingerprints. This ligand-based virtual screening experiment was triggered by the needs of medicinal chemistry. Applying the subpharmacophore method in a medicinal chemistry program, where a lead molecule with a molecular weight of 800 Da was available, resulted in a new series of molecules with molecular weights below 400.
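
The combinatorial core of deriving subpharmacophores with five to seven points from a larger query can be sketched as a subset enumeration. This is only an illustration of why the number of descriptors per query grows quickly. The point labels are hypothetical, and the actual method presumably also carries the geometric or topological relations between the points, which are not modelled here.

```python
from itertools import combinations

def subpharmacophores(points, min_size=5, max_size=7):
    """Yield every subset of the query's pharmacophore points with
    min_size to max_size members (plain subset enumeration, no geometry)."""
    for size in range(min_size, min(max_size, len(points)) + 1):
        yield from combinations(points, size)

# Hypothetical query with eight pharmacophore points (labels are placeholders).
query = ["donor_1", "donor_2", "acceptor_1", "acceptor_2",
         "aromatic_1", "aromatic_2", "hydrophobic_1", "cation_1"]

subsets = list(subpharmacophores(query))
print(len(subsets))  # 56 + 28 + 8 = 92 subsets for eight points
```

Because the count is a sum of binomial coefficients, a few additional points in the query multiply the number of candidate subpharmacophores.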


Subjects
Computer Simulation, Factual Databases, Preclinical Drug Evaluation/methods, Pharmaceutical Chemistry, Molecular Weight
8.
Expert Opin Drug Discov ; 6(2): 103-7, 2011 Feb.
Article in English | MEDLINE | ID: mdl-22647131

ABSTRACT

Grid computing offers an opportunity to gain massive computing power at low cost. We give a short introduction to the drug discovery process and exemplify the use of grid computing for image processing, docking and 3D pharmacophore descriptor calculations. The principle of a grid and its architecture are briefly explained. More emphasis is laid on the issues related to a company-wide grid installation and embedding the grid into the research process. The future of grid computing in drug discovery is discussed in the expert opinion section. Most needed, besides reliable algorithms to predict compound properties, is embedding the grid seamlessly into the discovery process. User-friendly access to powerful algorithms without any restrictions, such as a limited number of licenses, has to be the goal of grid computing in drug discovery.

9.
J Chem Inf Model ; 49(2): 232-46, 2009 Feb.
Article in English | MEDLINE | ID: mdl-19434825

ABSTRACT

We present OSIRIS, an entirely in-house developed drug discovery informatics system. Its components cover all information handling aspects from compound synthesis via biological testing to preclinical development. Its design principles are platform and vendor independence, a consistent look and feel, and complete coverage of the drug discovery process by custom tailored applications. These include electronic laboratory notebook applications for biology and chemistry, tools for high-throughput and secondary screening evaluation, chemistry-aware data visualization, physicochemical property prediction, 3D-pharmacophore comparisons, interactive modeling, computing grid based ligand-protein docking, and more. Most applications are developed in Java and are built on top of a Java library layer that provides reusable cheminformatics functionality and GUI components such as chemical editors, structure canonicalization, substructure search, combinatorial enumeration, enhanced stereo perception, force field minimization, and conformation generation.


Subjects
Database Management Systems, Drug Design, Preclinical Drug Evaluation, Molecular Models
10.
J Chem Inf Model ; 49(2): 209-31, 2009 Feb.
Article in English | MEDLINE | ID: mdl-19434824

ABSTRACT

Several in-house developed descriptors and our in-house docking tool ActDock were compared by virtual screening on the Directory of Useful Decoys (DUD). The results were compared with the chemical fingerprint descriptor from ChemAxon and with the docking results of the original DUD publication. The DUD is the first published data set providing active molecules, decoys, and references for crystal structures of ligand-target complexes. The DUD was designed for the purpose of evaluating docking programs. It contains 2950 active compounds against a total of 40 target proteins. Furthermore, for every ligand the data set contains 36 structurally dissimilar decoy compounds with similar physicochemical properties. We extracted the ligands from the target proteins to extend the applicability of the data set to include ligand-based virtual screening. From the 40 target proteins, 37 contained ligands that we used as query molecules for virtual screening evaluation. With this data set a large comparison was done between four different chemical fingerprints, a topological pharmacophore descriptor, the Flexophore descriptor, and ActDock. The Actelion docking tool relies on an MM2 force field and a pharmacophore point interaction statistic for scoring; the details are described in this publication. In terms of enrichment rates the chemical fingerprint descriptors performed better than the Flexophore and the docking tool. After removing molecules chemically similar to the query molecules, the Flexophore descriptor outperformed the chemical descriptors and the topological pharmacophore descriptors. With the similarity matrix calculations used in this study it was shown that the Flexophore is well suited to find new chemical entities via "scaffold hopping". The Flexophore descriptor can be explored with a Java applet at http://www.cheminformatics.ch in the submenu Tools → Flexophore. Its usage is free of charge and does not require registration.
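
Enrichment in a virtual screening benchmark of this kind is commonly summarized with an enrichment factor: the fraction of actives among the top-ranked molecules divided by the fraction of actives in the whole data set. The sketch below uses this common definition as an assumption, since the abstract does not state how the enrichment rates were computed, and the ranking is invented.

```python
def enrichment_factor(ranked_is_active, n_top):
    """ranked_is_active: booleans ordered from best to worst score.
    Returns (actives in top n_top / n_top) / (all actives / all molecules)."""
    n_total = len(ranked_is_active)
    n_actives = sum(ranked_is_active)
    if n_total == 0 or n_actives == 0 or n_top <= 0:
        raise ValueError("need a non-empty ranking with at least one active")
    n_top = min(n_top, n_total)
    hits_in_top = sum(ranked_is_active[:n_top])
    return (hits_in_top / n_top) / (n_actives / n_total)

# Invented ranking of 100 molecules with 10 actives, 5 of them in the top 10.
ranking = [True] * 5 + [False] * 5 + [True] * 5 + [False] * 85
print(enrichment_factor(ranking, n_top=10))  # 5.0
```

A value of 1.0 corresponds to random selection. With 36 decoys per active, as in the DUD, the largest possible enrichment factor is 37.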


Subjects
Molecular Structure, Adenosine Deaminase/chemistry, Adenosine Deaminase/metabolism, Ligands, Estrogen Receptors/chemistry, Estrogen Receptors/metabolism, Thymidine Kinase/chemistry, Thymidine Kinase/metabolism, p38 Mitogen-Activated Protein Kinases/chemistry, p38 Mitogen-Activated Protein Kinases/metabolism
11.
J Chem Inf Model ; 48(4): 797-810, 2008 Apr.
Article in English | MEDLINE | ID: mdl-18393490

ABSTRACT

A novel pharmacophore descriptor, Flexophore, is presented, which considers molecular flexibility when comparing descriptor similarities. The descriptor is a complete reduced graph of the underlying molecule. Its nodes are represented by enhanced MM2 atom types, while the edge descriptions encode the molecular flexibility by means of a histogram of node distances in a diverse conformer distribution. For comparing two descriptor nodes, a statistical function derived from the Cambridge Crystallographic Database is implemented. To assess the capability of the descriptor to describe the bioactivity space, 350 test data sets with 1000 molecules each were compiled. The data sets were spiked with molecules active on one of 18 different targets. In 175 of the 350 data sets, all molecules chemically similar to the query molecules were removed. Virtual screening on these data sets showed that the Flexophore descriptor detects active molecules despite chemical dissimilarity, whereas the results for the screening of the complete data sets show enrichments comparable to chemical fingerprint descriptors. The diversity analysis of the enriched compounds demonstrates that the Flexophore descriptor describes the chemical space orthogonal to chemical fingerprint descriptors.
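
The idea of encoding flexibility as a histogram of node distances over a set of conformers can be sketched as follows. The bin width, the number of bins, the coordinates, and the overlap measure used to compare two histograms are all assumptions for illustration. The abstract does not give these details, and this is not the Flexophore implementation.

```python
import math

def distance_histogram(conformers, i, j, bin_width=1.0, n_bins=10):
    """Histogram of the distance between nodes i and j over all conformers.
    Each conformer is a list of (x, y, z) node coordinates in angstroms."""
    hist = [0] * n_bins
    for coords in conformers:
        d = math.dist(coords[i], coords[j])
        # Distances beyond the last bin are collected in the last bin.
        hist[min(int(d / bin_width), n_bins - 1)] += 1
    return hist

def histogram_overlap(h1, h2):
    """Assumed edge similarity: shared counts divided by the larger total."""
    total = max(sum(h1), sum(h2))
    return sum(min(a, b) for a, b in zip(h1, h2)) / total if total else 0.0

# Hypothetical molecule with two nodes whose distance varies across three conformers.
flexible = [
    [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 1.7, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 0.0, 4.2)],
]
# Hypothetical rigid molecule: the same node distance in every conformer.
rigid = [[(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]] * 3

h_flexible = distance_histogram(flexible, 0, 1)
h_rigid = distance_histogram(rigid, 0, 1)
print(h_flexible, h_rigid, histogram_overlap(h_flexible, h_rigid))
```

A rigid pair of nodes fills a single bin, whereas a flexible pair spreads over several bins, so the histogram itself carries the flexibility information.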


Subjects
Pharmaceutical Preparations/chemistry, Molecular Conformation
12.
J Chem Inf Model ; 46(4): 1580-7, 2006.
Article in English | MEDLINE | ID: mdl-16859289

ABSTRACT

Principal component analysis and self-organizing maps (SOMs) were compared to cluster and visualize the chemical space of a large and diverse data set. The data set comprised about 3000 G-protein-coupled receptor (GPCR) ligands for about 130 receptors and 3000 non-GPCR ligands from the World Drug Index. The molecules were described with a topological pharmacophore point histogram descriptor and a chemical fingerprint descriptor. To assess the predictive power of the clustering, a leave-multiple-out cross validation with k nearest neighbor classification was performed. The results of the classification tests and the visualization showed a clear superiority of the SOM method. SOM correctly divided the data set into two main clusters, one for the GPCR and the other for the non-GPCR ligands. Our results suggest that a continuous GPCR-ligand space exists.
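
A leave-multiple-out cross-validation with k-nearest-neighbor classification can be sketched as follows. The sketch assumes that each molecule has already been projected to map coordinates (for example by PCA or a SOM). The number of folds, the value of k, the Euclidean distance, and the two-dimensional toy coordinates are assumptions, not details taken from the study.

```python
import math
import random
from collections import Counter

def knn_predict(train, query, k=3):
    """Majority label among the k training points closest to the query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def leave_multiple_out_accuracy(data, n_folds=5, k=3, seed=0):
    """Hold out one fold (several molecules) at a time and classify it
    from the remaining molecules; return the overall accuracy."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    correct = 0
    for fold in range(n_folds):
        test = rows[fold::n_folds]
        train = [r for idx, r in enumerate(rows) if idx % n_folds != fold]
        correct += sum(knn_predict(train, x, k) == label for x, label in test)
    return correct / len(rows)

# Invented map coordinates: two well-separated groups of ten molecules each.
data = ([((0.1 * i, 0.0), "GPCR") for i in range(10)]
        + [((10.0 + 0.1 * i, 10.0), "non-GPCR") for i in range(10)])

print(leave_multiple_out_accuracy(data))
```

The better a projection separates the two ligand classes, the higher this accuracy becomes, which is how such a test can rank visualization methods.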


Subjects
G-Protein-Coupled Receptors/drug effects, Drug Design
13.
J Chem Inf Model ; 46(2): 536-44, 2006.
Article in English | MEDLINE | ID: mdl-16562981

ABSTRACT

We describe a toxicity alerting system for uncharacterized compounds, which is based upon comprehensive tables of substructure fragments that are indicative of toxicity risk. These tables were derived computationally by analyzing the RTECS database and the World Drug Index. We provide, free of charge, a Java applet for structure drawing and toxicity risk assessment. In an independent investigation, we compared the toxicity classification performance of naive Bayesian clustering, k-nearest-neighbor classification, and support vector machines. To visualize the chemical space of both toxic and druglike molecules, we trained a large self-organizing map (SOM) with all compounds from the RTECS database and the IDDB. In summary, we found that a support vector machine performed best at classifying compounds of defined toxicity into appropriate toxicity classes. Also, SOMs performed excellently in separating toxic from nontoxic substances. Although these two methods are limited to compounds that are structurally similar to known toxic substances, our fragment-based approach extends predictions to compounds that are structurally dissimilar to compounds used in the training set.


Subjects
Drug Design, Theoretical Models, Structure-Activity Relationship, Toxicity Tests, Bayes Theorem, Databases as Topic/statistics & numerical data, Molecular Structure, Risk Assessment, Software, Toxicity Tests/statistics & numerical data
14.
J Chem Inf Comput Sci ; 44(3): 1137-47, 2004.
Article in English | MEDLINE | ID: mdl-15154783

ABSTRACT

The goal of our work was to differentiate between patterns that are responsible for the activity of small molecular ligands binding to G-protein coupled receptors (GPCRs) and molecules that are pharmacologically active on other target classes. Second, we aimed to go one step further and analyze the chemical space occupied by GPCR-active ligands itself, to distinguish between the actives of different subclasses or even to cluster ligands for single receptors. To achieve these objectives, we built a database of small organic molecules that bind to GPCRs. Once this crucial foundation for pattern recognition had been laid, we needed to find a descriptor that is able to detect the compulsory features responsible for activity within a molecule. Here we found that the well-accepted pharmacophore descriptor served us well. Finally, we needed to find a method to display the clustering or separation of the specific ligands. We found that self-organizing maps (SOMs) perform excellently in this task. We herein present the analysis of the chemical space of active compounds, depending on their biological target, the GPCRs. We also discuss the techniques used to create the chemical spaces. The findings can be applied and have an impact at various stages of the drug discovery process.


Subjects
G-Protein-Coupled Receptors/chemistry, Ligands, Automated Pattern Recognition