Results 1 - 14 of 14
1.
Mol Inform ; 42(4): e2200208, 2023 04.
Article in English | MEDLINE | ID: mdl-36604304

ABSTRACT

To analyze the Chimiothèque Nationale (CN), the French National Compound Library, in the context of screening and biologically relevant compounds, the library was compared with the ZINC in-stock collection and ChEMBL. The comparison covers chemical space coverage, physicochemical properties and Bemis-Murcko (BM) scaffold populations. More than 5K CN-unique scaffolds (relative to the ZINC and ChEMBL collections) were identified. Generative Topographic Maps (GTMs) accommodating those libraries were generated and used to compare the compound populations. Hierarchical GTM ("zooming") was applied to generate an ensemble of maps at various resolution levels, from global overview to precise mapping of individual structures. The respective maps were added to the ChemSpace Atlas website. The analysis of synthetic accessibility in the context of combinatorial chemistry showed that only 29.7% of CN compounds can be fully synthesized using commercially available building blocks.
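Once every library has been reduced to a set of canonical scaffold identifiers, counting CN-unique scaffolds amounts to a set difference. The following is a minimal sketch under that assumption: the scaffolds are taken to be precomputed canonical SMILES strings, and the function name and toy scaffold strings are hypothetical illustrations, not data or code from the paper.

```python
def unique_scaffolds(library, *references):
    """Return scaffolds present in `library` but in none of the reference sets."""
    seen_elsewhere = set().union(*references)
    return set(library) - seen_elsewhere


# Toy canonical scaffold SMILES (illustrative only, not taken from CN, ZINC or ChEMBL).
cn = {"c1ccccc1", "c1ccc2ccccc2c1", "C1CCC2(CC1)OCCO2"}
zinc = {"c1ccccc1", "c1ccncc1"}
chembl = {"c1ccc2ccccc2c1"}

cn_unique = unique_scaffolds(cn, zinc, chembl)
print(sorted(cn_unique))
```

In practice the scaffold strings would come from a cheminformatics toolkit, and the result depends on how that toolkit canonicalizes structures.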


Subject(s)
Databases, Chemical
2.
J Chem Inf Model ; 62(22): 5471-5484, 2022 11 28.
Article in English | MEDLINE | ID: mdl-36332178

ABSTRACT

In order to better formalize it, the notorious inverse-QSAR problem (finding structures of given QSAR-predicted properties) is considered in this paper as a two-step process including (i) finding "seed" descriptor vectors corresponding to user-constrained QSAR model output values and (ii) identifying the chemical structures best matching the "seed" vectors. The main development effort here was focused on the latter stage, proposing a new attention-based conditional variational autoencoder neural-network architecture based on recent developments in attention-based methods. The obtained results show that this workflow was capable of generating compounds predicted to display desired activity while being completely novel compared to the training database (ChEMBL). Moreover, the generated compounds show acceptable druglikeness and synthetic accessibility. Both pharmacophore and docking studies were carried out as "orthogonal" in silico validation methods, proving that some of the de novo structures are, beyond being predicted active by 2D-QSAR models, clearly able to match binding 3D pharmacophores and bind the protein pocket.


Subject(s)
Quantitative Structure-Activity Relationship , Molecular Docking Simulation
3.
J Chem Inf Model ; 62(18): 4537-4548, 2022 09 26.
Article in English | MEDLINE | ID: mdl-36103300

ABSTRACT

Nowadays, drug discovery is inevitably intertwined with the usage of large compound collections. Understanding their chemotype composition and physicochemical property profiles is of the highest importance for successful hit identification. Efficient polyfunctional tools allowing multifaceted analysis of constantly growing chemical libraries must be Big Data-compatible. Here, we present the freely accessible ChemSpace Atlas (https://chematlas.chimie.unistra.fr), which includes almost 40K hierarchically organized Generative Topographic Maps (GTM) accommodating up to 500 M compounds covering fragment-like, lead-like, drug-like, PPI-like, and NP-like chemical subspaces. They allow users to navigate and analyze ZINC, ChEMBL, and COCONUT from multiple perspectives on different scales: from a bird's eye view of the entire library to structural pattern detection in small clusters. Around 20 physicochemical properties and almost 750 biological activities can be visualized (associated with map zones), supporting activity profiling and analogue search. Moreover, ChemSpace Atlas will be extended toward new chemical subspaces (e.g., DNA-encoded libraries and synthons) and functionalities (ADMETox profiling and property-guided de novo compound generation).


Subject(s)
Drug Discovery , Small Molecule Libraries , DNA/chemistry , Gene Library , Small Molecule Libraries/chemistry , Small Molecule Libraries/pharmacology , Zinc
4.
Int J Mol Sci ; 23(11), 2022 May 30.
Article in English | MEDLINE | ID: mdl-35682792

ABSTRACT

Molecular similarity is an impressively broad topic with many implications in several areas of chemistry. Its roots lie in the paradigm that 'similar molecules have similar properties'. For this reason, methods for determining molecular similarity find wide application in pharmaceutical companies, e.g., in the context of structure-activity relationships. The similarity evaluation is also used in the field of chemical legislation, specifically in the procedure to judge if a new molecule can obtain the status of orphan drug with the consequent financial benefits. For this procedure, the European Medicines Agency uses experts' judgments. It is clear that the perception of the similarity depends on the observer, so the development of models to reproduce the human perception is useful. In this paper, we built models using both 2D fingerprints and 3D descriptors, i.e., molecular shape and pharmacophore descriptors. The proposed models were also evaluated by constructing a dataset of pairs of molecules which was submitted to a group of experts for the similarity judgment. The proposed machine-learning models can be useful to reduce or assist human efforts in future evaluations. For this reason, the new molecules dataset and an online tool for molecular similarity estimation have been made freely available.
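The abstract does not state which similarity measure was applied to the 2D fingerprints. As an illustration only, the sketch below uses the Tanimoto coefficient, a common choice for binary fingerprints; the fingerprints are hypothetical sets of on-bit indices, and treating two empty fingerprints as identical is a convention chosen here.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty fingerprints count as identical
    return len(a & b) / len(a | b)


# Hypothetical on-bit indices for two molecules.
fp_1 = {1, 4, 7, 9}
fp_2 = {1, 4, 8, 9, 12}

# 3 shared bits out of 6 distinct bits.
print(tanimoto(fp_1, fp_2))
```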


Subject(s)
Machine Learning , Receptors, Drug , Humans , Perception , Structure-Activity Relationship
5.
ACS Cent Sci ; 8(6): 804-813, 2022 Jun 22.
Article in English | MEDLINE | ID: mdl-35756377

ABSTRACT

Dynamic combinatorial libraries (DCLs) display adaptive behavior, enabled by the reversible generation of their molecular constituents from building blocks, in response to external effectors, e.g., protein receptors. So far, chemoinformatics has not been used for the design of DCLs, which poses a radically different set of challenges compared to classical library design. Here, we propose a chemoinformatic model for theoretically assessing the composition of DCLs in the presence and the absence of an effector. An imine-based DCL in interaction with the effector human carbonic anhydrase II (CA II) served as a case study. Support vector regression models for the imine formation constants and imine-CA II binding were derived from, respectively, a set of 276 imines synthesized and experimentally studied in this work and 4350 inhibitors of CA II from ChEMBL. These models predict constants for all DCL constituents, to feed software assessing equilibrium concentrations. They are publicly available on the dedicated website. The models rationally selected two amines and two aldehydes predicted to yield stable imines with high affinity for CA II and provided a virtual illustration of how effector affinity regulates DCL members.
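To show how a predicted formation constant translates into an equilibrium concentration, here is a minimal sketch for the simplest possible case: one amine A and one aldehyde B forming one imine AB, with K = [AB]/([A][B]). The assumptions are loud ones: water activity is folded into K, no effector and no competing building blocks are present, and the function name is hypothetical. A real DCL involves coupled equilibria that need a numerical solver, which this sketch does not attempt.

```python
import math


def imine_equilibrium(a0, b0, k):
    """Equilibrium [AB] for A + B <=> AB with K = [AB] / ([A][B]).

    a0, b0: total concentrations of amine and aldehyde (mol/L).
    k: formation constant (L/mol), assumed to absorb the water activity.
    """
    if k <= 0 or a0 < 0 or b0 < 0:
        raise ValueError("k must be positive and concentrations non-negative")
    # Mass balance gives x^2 - (a0 + b0 + 1/K) x + a0*b0 = 0 with x = [AB];
    # the smaller root is the physical one because x cannot exceed min(a0, b0).
    s = a0 + b0 + 1.0 / k
    return (s - math.sqrt(s * s - 4.0 * a0 * b0)) / 2.0


x = imine_equilibrium(1.0, 1.0, 2.0)
print(x, x / ((1.0 - x) * (1.0 - x)))  # second value recovers K
```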

6.
Mol Inform ; 40(9): e2100068, 2021 09.
Article in English | MEDLINE | ID: mdl-34170632

ABSTRACT

Natural products (NPs), having been evolutionarily selected over millions of years to bind to biological macromolecules, remained an important source of inspiration for medicinal chemists even after the advent of efficient drug discovery technologies such as combinatorial chemistry and high-throughput screening. Thus, there is a strong demand for efficient and user-friendly computational tools for analyzing large libraries of NPs. In this context, we introduce NP Navigator, a freely available, intuitive online tool for visualization and navigation through the chemical space of NPs and NP-like molecules. It is based on the hierarchical ensemble of generative topographic maps, featuring NPs from the COlleCtion of Open NatUral producTs (COCONUT), bioactive compounds from ChEMBL and commercially available molecules from ZINC. NP Navigator enables efficient analysis of different aspects of NPs: chemotype distribution, physicochemical properties, biological activity and commercial availability. The latter concerns not only purchasable NPs but also their close analogs that can be considered as synthetic mimetics of NPs or pseudo-NPs.


Subject(s)
Biological Products , Combinatorial Chemistry Techniques , Macromolecular Substances/analysis , Zinc/chemistry
7.
Chemistry ; 21(33): 11681-6, 2015 Aug 10.
Article in English | MEDLINE | ID: mdl-26179867

ABSTRACT

In the context of designing novel amino acid nanostructures, the capacity of tyrosine alone to form well-ordered structures under different conditions was explored. It was observed that Tyr can self-assemble into well-defined morphologies when deposited onto surfaces for transmission electron microscopy, atomic force microscopy, and scanning electron microscopy. The influence of various parameters that can modulate the self-assembly process, including concentration of the amino acid, aging time, and solvent, was studied. Different supramolecular architectures, including nanoribbons, branched structures, and fern-like arrangements were also observed.


Subject(s)
Amino Acids/chemistry , Nanostructures/chemistry , Tyrosine/chemistry , Microscopy, Electron, Scanning , Solvents/chemistry
8.
Bioorg Med Chem ; 20(18): 5396-409, 2012 Sep 15.
Article in English | MEDLINE | ID: mdl-22595424

ABSTRACT

While self-organizing maps (SOM) have often been used to map and describe chemical space, this paper focuses on their use to accelerate similarity searches based on vectors of high-dimensional real-value descriptors for which classical, binary fingerprint-based similarity speed-up procedures do not apply. Fuzzy tricentric pharmacophore (FPT) and ISIDA substructure counts are the examples explored herein. Similarity search speed-up was achieved by positioning compounds on a SOM, then searching for analogues only in the neurons neighbouring the ones in which the query compounds reside. A smaller neighbourhood means shorter virtual screening (VS) time, but lower analogue retrieval rates. An enhancement criterion reconciling these opposing trends is defined. It depends on map definition and build-up protocol (training set choice, map size, convergence criteria, etc.). The main goal is to discover and validate SOMs of optimal quality with respect to this criterion. Increasing the size of the training set beyond a certain limit is shown to be unnecessary and even detrimental, suggesting that one SOM built on a relatively small but diverse training set may be an effective VS enhancer of a much larger database. Also, using an excessively large number of training iterations may lead to over-fitting. Gradual training with en-route checking of VS enhancement propensity is the best strategy to follow. Maps were successfully challenged to accelerate the large-scale VS of 12,000 queries against 160,000 compounds, and shown to provide a meaningful mapping of activity-annotated compounds in chemical space.
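The neighbourhood-restricted search described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the map is taken as already trained, neuron weights and compound descriptors are hypothetical toy values, the neighbourhood is a square of given radius on the grid, and all function names are invented for the example.

```python
import math
from collections import defaultdict


def euclidean(u, v):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def best_matching_unit(vec, neurons):
    """Grid position of the neuron whose weight vector is closest to vec."""
    return min(neurons, key=lambda pos: euclidean(vec, neurons[pos]))


def build_index(compounds, neurons):
    """Map each neuron position to the names of the compounds residing in it."""
    index = defaultdict(list)
    for name, vec in compounds.items():
        index[best_matching_unit(vec, neurons)].append(name)
    return index


def som_search(query, compounds, neurons, index, radius=1):
    """Rank only the compounds found within `radius` grid steps of the query's neuron."""
    qr, qc = best_matching_unit(query, neurons)
    candidates = [
        name
        for (r, c), names in index.items()
        if max(abs(r - qr), abs(c - qc)) <= radius
        for name in names
    ]
    return sorted(candidates, key=lambda name: euclidean(query, compounds[name]))


# Hypothetical, already-trained 1 x 4 map over 2-D descriptors.
neurons = {(0, 0): (0.0, 0.0), (0, 1): (1.0, 1.0), (0, 2): (2.0, 2.0), (0, 3): (3.0, 3.0)}
compounds = {"a": (0.1, 0.0), "b": (1.1, 0.9), "c": (2.0, 2.1), "d": (3.0, 2.9)}
index = build_index(compounds, neurons)

hits = som_search((0.2, 0.1), compounds, neurons, index, radius=1)
print(hits)
```

Shrinking `radius` reduces the number of exact distance evaluations at the price of possibly missing analogues that landed in more distant neurons, which is the trade-off the abstract's enhancement criterion is meant to balance.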


Subject(s)
Databases, Chemical , Drug Discovery , High-Throughput Screening Assays , Reproducibility of Results , Software
9.
Biochemistry ; 49(22): 4679-86, 2010 Jun 08.
Article in English | MEDLINE | ID: mdl-20423153

ABSTRACT

Debio 025 is a cyclosporin A (CsA) analogue that interferes strongly with the hepatitis C viral life cycle. Compared to CsA, Debio 025 has an additional methyl group at position 3 of the cyclic undecapeptide and an N-ethylvaline instead of an N-methylleucine at position 4. Unlike CsA, Debio 025 lacks immunosuppressive activity in vitro and in vivo. We show here that, in vitro, the cyclophilin A (CypA)-Debio 025 complex cannot interact any longer with calcineurin (CaN), a determinant for the immunosuppressive activity of CsA. We further use NMR spectroscopy to investigate at the molecular level the interaction of Debio 025 with CypA and thereby understand the basis for this loss of CaN interaction. NMR data and molecular modeling indicate that Debio 025 optimally interacts with CypA, which underlies the anti-HCV properties of Debio 025. However, the interaction between CaN and the CypA-Debio 025 complex is impeded by sterical hindrance of the CaN with the side chain of its Val4 residue. This is in sharp contrast with the case for the CypA-CsA-CaN ternary complex, where the Leu4 side chain can enter a hydrophobic cavity at the CaN interface. The structure of the CypA-Debio 025 complex thus provides a rational explanation for the non-immunosuppressive character of Debio 025.


Subject(s)
Cyclosporine/chemistry , Immunosuppressive Agents/chemistry , Antiviral Agents/chemistry , Antiviral Agents/metabolism , Calcineurin/metabolism , Cyclosporine/metabolism , Drug Interactions , Hepacivirus/drug effects , Hepacivirus/immunology , Humans , Immunosuppressive Agents/metabolism , Leucine/analogs & derivatives , Leucine/chemistry , Leucine/metabolism , Magnetic Resonance Spectroscopy , Protein Binding , Sequence Homology, Amino Acid , Valine/analogs & derivatives , Valine/chemistry , Valine/metabolism , Virus Replication/drug effects , Virus Replication/immunology
10.
J Biomol NMR ; 43(4): 219-27, 2009 Apr.
Article in English | MEDLINE | ID: mdl-19288066

ABSTRACT

Adding the 13C labelled 2-keto-isovalerate and 2-oxobutanoate precursors to a minimal medium composed of 12C labelled glucose instead of the commonly used (2D, 13C) glucose leads not only to the 13C labelling of (I, L, V) methyls but also to the selective 13C labelling of the backbone C(alpha) and CO carbons of the Ile and Val residues. As a result, the backbone (1H, 15N) correlations of the Ile and Val residues and their next neighbours in the (i + 1) position can be selectively identified in HN(CA) and HN(CO) planes. The availability of a selective HSQC spectrum corresponding to the sole amide resonances of the Ile and Val residues allows connecting them to their corresponding methyls by the intra-residue NOE effect, and should therefore be applicable to larger systems.


Subject(s)
Isoleucine/chemistry , Isotope Labeling/methods , Leucine/chemistry , Nuclear Magnetic Resonance, Biomolecular/methods , Proteins/chemistry , Valine/chemistry , Carbon Isotopes/chemistry , Cyclophilins/chemistry , Methylation
11.
Carbohydr Res ; 344(3): 322-30, 2009 Feb 17.
Article in English | MEDLINE | ID: mdl-19084822

ABSTRACT

¹H NMR is now a standard method to determine de novo the primary sequence of all sorts of glycans. Over the last 30 years, tens of thousands of oligosaccharide sequences have been elucidated by NMR spectroscopy in conjunction with other physico-chemical methods including mass spectrometry and gas chromatography. Most of these sequences are now compiled and available in several web databases recently unified in the publicly available GlycomeDB, along with sets of experimental data. However, because the search for an exact sequence exclusively based on proton chemical shifts is sometimes delicate for NMR non-specialists, we worked out a new type of query, named SOACS, which allows the easy retrieval of existing sequences. This query is based on the readily distinguished ¹H chemical shifts from any ¹H NMR spectrum, and was designed to be usable by the widest scientific community.


Subject(s)
Information Storage and Retrieval , Polysaccharides/chemistry , Animals , Carbohydrate Sequence , Database Management Systems , Databases, Factual , Magnetic Resonance Spectroscopy , Molecular Sequence Data
12.
J Chem Inf Model ; 48(2): 409-25, 2008 Feb.
Article in English | MEDLINE | ID: mdl-18254617

ABSTRACT

Topological fuzzy pharmacophore triplets (2D-FPT), using the number of interposed bonds to measure separation between the atoms representing pharmacophore types, were employed to establish and validate quantitative structure-activity relationships (QSAR). Thirteen data sets for which state-of-the-art QSAR models were reported in literature were revisited in order to benchmark 2D-FPT biological activity-explaining propensities. Linear and nonlinear QSAR models were constructed for each compound series (following the original author's splitting into training/validation subsets) with three different 2D-FPT versions, using the genetic algorithm-driven Stochastic QSAR sampler (SQS) to pick relevant triplets and fit their coefficients. 2D-FPT QSARs are computationally cheap, interpretable, and perform well in benchmarking. In a majority of cases (10/13), default 2D-FPT models validated better than or as well as the best among those reported, including 3D overlay-dependent approaches. Most of the analogues series, either unaffected by protonation equilibria or unambiguously adopting expected protonation states, were equally well described by rule- or pKa-based pharmacophore flagging. Thermolysin inhibitors represent a notable exception: pKa-based flagging boosts model quality, although--surprisingly--not due to proteolytic equilibrium effects. The optimal degree of 2D-FPT fuzziness is compound set dependent. This work further confirmed the higher robustness of nonlinear over linear SQS models. In spite of the wealth of studied sets, benchmarking is nevertheless flawed by low intraset diversity: a whole series of thereby caused artifacts were evidenced, implicitly raising questions about the way QSAR studies are conducted nowadays. An in-depth investigation of thrombin inhibition models revealed that some of the selected triplets make sense (one of these stands for a topological pharmacophore covering the P1 and P2 binding pockets). 
Nevertheless, equations were either unable to predict the activity of the structurally different ligands or tended to indiscriminately predict any compound outside the training family to be active. 2D-FPT QSARs do however not depend on any common scaffold required for molecule superimposition and may in principle be trained on hand of diverse sets, which is a must in order to obtain widely applicable models. Adding (assumed) inactives of various families for training enabled discovery of models that specifically recognize the structurally different actives.
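The topological separation used by 2D-FPT, the number of bonds between two atoms, can be obtained by breadth-first search on the molecular graph. The sketch below assumes that "interposed bonds" means the shortest-path bond count, uses a hypothetical toy heavy-atom graph, and leaves out pharmacophore typing and fuzzy mapping entirely; the function names are invented for the example.

```python
from collections import deque


def bond_distances(adjacency, start):
    """Shortest path lengths (number of bonds) from start to every reachable atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for nb in adjacency[atom]:
            if nb not in dist:
                dist[nb] = dist[atom] + 1
                queue.append(nb)
    return dist


def triplet_edges(adjacency, i, j, k):
    """Sorted topological edge lengths of the atom triplet (i, j, k)."""
    di = bond_distances(adjacency, i)
    dj = bond_distances(adjacency, j)
    return tuple(sorted((di[j], di[k], dj[k])))


# Toy heavy-atom graph: a six-membered ring (atoms 0-5) carrying a two-atom chain (6, 7) on atom 0.
adjacency = {
    0: [1, 5, 6], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 0],
    6: [0, 7], 7: [6],
}

print(triplet_edges(adjacency, 7, 3, 1))
```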


Subject(s)
Quantitative Structure-Activity Relationship , Binding Sites , Chemistry, Pharmaceutical , Models, Chemical , Thermolysin/antagonists & inhibitors , Thrombin/antagonists & inhibitors
13.
J Chem Inf Model ; 47(3): 927-39, 2007.
Article in English | MEDLINE | ID: mdl-17480052

ABSTRACT

Descriptor selection in QSAR typically relies on a set of upfront working hypotheses in order to boil down the initial descriptor set to a tractable size. Stepwise regression, computationally cheap and therefore widely used in spite of its potential caveats, is most aggressive in reducing the effectively explored problem space by adopting a greedy variable pick strategy. This work explores an antipodal approach, incarnated by an original Genetic Algorithm (GA)-based Stochastic QSAR Sampler (SQS) that favors unbiased model search over computational cost. Independent of a priori descriptor filtering and, most important, not limited to linear models only, it was benchmarked against the ISIDA Stepwise Regression (SR) tool. SQS was run under various premises, varying the training/validation set splitting scheme, the nonlinearity policy, and the used descriptors. With the considered three anti-HIV compound sets, repeated SQS runs generate sometimes poorly overlapping but nevertheless equally well validating model sets. Enabling SQS to apply nonlinear descriptor transformations increases the problem space: nevertheless, nonlinear models tend to be more robust validators. Model validation benchmarking showed SQS to match the performance of SR or outperform it in cases when the upfront simplifications of SR "backfire", even though the robust SR got trapped in local minima only once in six cases. Consensus models from large SQS model sets validate well--but not outstandingly better than SR consensus equations. SQS is thus a robust QSAR building tool according to standard validation tests against external sets of compounds (of same families as used for training), but many of its benefits/drawbacks may yet not be revealed by such tests. 
SQS results challenge the traditional way of interpreting and exploiting QSAR: how should one deal with thousands of well-validating models that nonetheless provide potentially diverging applicability ranges and predicted values for external compounds? SR does not impose such a burden on the user, but is "betting" on a single equation or a narrow consensus model to behave properly in virtual screening a sound strategy? By posing these questions, this article will hopefully act as an incentive for the long-haul studies needed to get them answered.


Subject(s)
Models, Biological , Quantitative Structure-Activity Relationship , Stochastic Processes , Algorithms , Computer Simulation , Reproducibility of Results
14.
J Chem Inf Model ; 46(6): 2457-77, 2006.
Article in English | MEDLINE | ID: mdl-17125187

ABSTRACT

This paper introduces a novel molecular description--topological (2D) fuzzy pharmacophore triplets, 2D-FPT--using the number of interposed bonds as the measure of separation between the atoms representing pharmacophore types (hydrophobic, aromatic, hydrogen-bond donor and acceptor, cation, and anion). 2D-FPT features three key improvements with respect to the state-of-the-art pharmacophore fingerprints: (1) The first key novelty is fuzzy mapping of molecular triplets onto the basis set of pharmacophore triplets: unlike in the binary scheme where an atom triplet is set to highlight the bit of a single, best-matching basis triplet, the herein-defined fuzzy approach allows for gradual mapping of each atom triplet onto several related basis triplets, thus minimizing binary classification artifacts. (2) The second innovation is proteolytic equilibrium dependence, by explicitly considering all of the conjugated acids and bases (microspecies). 2D-FPTs are concentration-weighted (as predicted at pH=7.4) averages of microspecies fingerprints. Therefore, small structural modifications, not affecting the overall pharmacophore pattern (in the sense of classical rule-based assignment), but nevertheless triggering a pKa shift, will have a major impact on 2D-FPT. Pairs of almost identical compounds with significantly differing activities ("activity cliffs" in classical descriptor spaces) were in many cases predictable by 2D-FPT. (3) The third innovation is a new similarity scoring formula, acknowledging that the simultaneous absence of a triplet in two molecules is a less-constraining indicator of similarity than its simultaneous presence. It displays excellent neighborhood behavior, outperforming 2D or 3D two-point pharmacophore descriptors or chemical fingerprints. The 2D-FPT calculator was developed using the chemoinformatics toolkit of ChemAxon (www.chemaxon.com).
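The concentration-weighted averaging over microspecies can be illustrated for the simplest case, a monoprotic acid with two microspecies, where the Henderson-Hasselbalch relation gives the deprotonated fraction at pH 7.4. This is a sketch under stated assumptions: the fingerprints are hypothetical two-element triplet counts, the fuzzy triplet mapping is omitted, and the function names are not those of the 2D-FPT calculator.

```python
def deprotonated_fraction(pka, ph=7.4):
    """Henderson-Hasselbalch fraction of the conjugate base of a monoprotic acid."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def weighted_fingerprint(microspecies):
    """Population-weighted average of microspecies fingerprints.

    microspecies: list of (fraction, fingerprint) pairs; fractions should sum to 1.
    """
    length = len(microspecies[0][1])
    return [sum(w * fp[i] for w, fp in microspecies) for i in range(length)]


# Hypothetical triplet counts: [donor-containing triplet, anion-containing triplet].
neutral_fp = [2, 0]
anion_fp = [0, 2]

# A pKa shift across the pH moves the weight from one microspecies to the other.
for pka in (5.4, 7.4, 9.4):
    f = deprotonated_fraction(pka)
    print(pka, weighted_fingerprint([(1 - f, neutral_fp), (f, anion_fp)]))
```

The loop shows why a small structural change that shifts the pKa can alter the averaged descriptor even when rule-based pharmacophore typing would stay the same.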


Subject(s)
Chemistry, Pharmaceutical/methods , Drug Industry/methods , Algorithms , Combinatorial Chemistry Techniques , Drug Design , Drug Evaluation, Preclinical , Hydrogen-Ion Concentration , Informatics , Internet , Ligands , Models, Chemical , Models, Molecular , Models, Statistical , Models, Theoretical , Molecular Conformation , Pharmaceutical Preparations