1.
Sci Data ; 11(1): 742, 2024 Jul 07.
Article in English | MEDLINE | ID: mdl-38972891

ABSTRACT

Here we introduce the Aquamarine (AQM) dataset, an extensive quantum-mechanical (QM) dataset that contains the structural and electronic information of 59,783 low- and high-energy conformers of 1,653 molecules with a total number of atoms ranging from 2 to 92 (mean: 50.9), and containing up to 54 (mean: 28.2) non-hydrogen atoms. To gain insight into solvent effects as well as collective dispersion interactions for drug-like molecules, we performed QM calculations, supplemented with a treatment of many-body dispersion (MBD) interactions, of structures and properties in the gas phase and in implicit water. AQM thus contains over 40 global and local physicochemical properties (including ground-state and response properties) per conformer, computed at the tightly converged PBE0+MBD level of theory for gas-phase molecules and at the PBE0+MBD level with the modified Poisson-Boltzmann (MPB) model of water for solvated molecules. By addressing both molecule-solvent and dispersion interactions, the AQM dataset can serve as a challenging benchmark for state-of-the-art machine learning methods for property modeling and de novo generation of large (solvated) molecules with pharmaceutical and biological relevance.


Subject(s)
Quantum Theory , Solvents , Solvents/chemistry , Pharmaceutical Preparations/chemistry , Water/chemistry , Molecular Conformation
3.
J Cheminform ; 15(1): 20, 2023 Feb 11.
Article in English | MEDLINE | ID: mdl-36774523

ABSTRACT

Artificial intelligence is revolutionizing many aspects of the pharmaceutical industry. Deep learning models are now routinely applied to guide drug discovery projects, leading to faster and improved findings, but there are still many tasks with enormous unrealized potential. One such task is reaction yield prediction. Every year, more than one fifth of all synthesis attempts result in product yields that are either zero or too low. This equates to chemical and human resources being spent on activities that ultimately do not progress the programs, leading to a triple loss when accounting for the opportunity cost of the time wasted. In this work, we pre-train a BERT model on more than 16 million reactions from 4 different data sources and fine-tune it to achieve an uncertainty-calibrated global yield prediction model. This model improves upon the state of the art not just through the increase in pre-training data but also by introducing a new embedding layer, which solves a few limitations of SMILES and enables integration of additional information, such as equivalents and molecule role, into the reaction encoding. The model is called BERT Enriched Embedding (BEE). It is benchmarked on an open-source dataset against a state-of-the-art synthesis-focused BERT, showing a near 20-point improvement in r2 score. The model is then fine-tuned and tested on an internal company data benchmark, and a prospective study shows that applying the model can reduce the total number of negative reactions (yield under 5%) run at Janssen by at least 34%. Lastly, we corroborate the previous results through experimental validation, by directly deploying the model in an ongoing drug discovery project and showing that it can also be used successfully as a reagent recommender thanks to its fast inference speed and reliable confidence estimation, a critical feature for industry application.
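The two quantities this abstract reports, the r2 score and the share of negative reactions (yield under 5%), can be illustrated with a minimal Python sketch. It assumes the standard coefficient-of-determination definition for r2; the function names and the sample yields are hypothetical and are not taken from the paper or its data.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("r2 is undefined when all true values are equal")
    return 1 - ss_res / ss_tot


def negative_fraction(yields_percent, threshold=5.0):
    """Fraction of reactions whose yield is strictly below the threshold (in %)."""
    if not yields_percent:
        raise ValueError("need at least one yield")
    negatives = sum(1 for y in yields_percent if y < threshold)
    return negatives / len(yields_percent)


# Hypothetical measured and predicted yields in percent.
measured = [0.0, 4.9, 5.0, 80.0]
predicted = [2.0, 6.0, 10.0, 75.0]
print(r2_score(measured, predicted))
print(negative_fraction(measured))
```

A "20-point improvement" would then correspond to a difference of about 0.20 between two such r2 values, if the score is reported on a 0 to 100 scale; that reading is an interpretation, not something the abstract states.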

4.
J Chem Inf Model ; 62(9): 2111-2120, 2022 05 09.
Article in English | MEDLINE | ID: mdl-35034452

ABSTRACT

Finding synthesis routes for molecules of interest is essential in the discovery of new drugs and materials. To find such routes, computer-assisted synthesis planning (CASP) methods are employed, which rely on a single-step model of chemical reactivity. In this study, we introduce a template-based single-step retrosynthesis model based on Modern Hopfield Networks, which learn an encoding of both molecules and reaction templates in order to predict the relevance of templates for a given molecule. The template representation allows generalization across different reactions and significantly improves the performance of template relevance prediction, especially for templates with few or zero training examples. With inference speed up to orders of magnitude faster than baseline methods, we improve or match the state-of-the-art performance for top-k exact match accuracy for k ≥ 3 in the retrosynthesis benchmark USPTO-50k. Code to reproduce the results is available at github.com/ml-jku/mhn-react.
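The top-k exact match accuracy mentioned above can be sketched as follows. This is a minimal illustration, not the evaluation code from the cited repository: it assumes each prediction and each reference is already a canonicalized reactant string, so exact match reduces to string equality, and the function name and example strings are hypothetical.

```python
def top_k_accuracy(ranked_predictions, references, k):
    """Share of targets whose reference appears among the first k ranked predictions.

    ranked_predictions: list of lists, best candidate first (assumed canonicalized).
    references: list of ground-truth strings, one per target.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if len(ranked_predictions) != len(references) or not references:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1
        for candidates, reference in zip(ranked_predictions, references)
        if reference in candidates[:k]
    )
    return hits / len(references)


# Hypothetical ranked reactant predictions for two target molecules.
predictions = [
    ["CCO.CC(=O)O", "CCO.CC(=O)Cl", "CCBr.CC(=O)O"],
    ["c1ccccc1Br.OB(O)c1ccccc1", "c1ccccc1I.OB(O)c1ccccc1"],
]
truth = ["CCO.CC(=O)Cl", "c1ccccc1Cl.OB(O)c1ccccc1"]

for k in (1, 3):
    print(k, top_k_accuracy(predictions, truth, k))
```

In this toy case the first reference is ranked second and the second reference is never predicted, so accuracy rises from k = 1 to k = 3 but does not reach 1.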

6.
Mol Inform ; 41(4): e2100138, 2022 04.
Article in English | MEDLINE | ID: mdl-34726834

ABSTRACT

In this paper, we compare the most popular Atom-to-Atom Mapping (AAM) tools: ChemAxon,[1] Indigo,[2] RDTool,[3] NameRXN (NextMove),[4] and RXNMapper,[5] which implement different AAM algorithms. The open-source RDTool program was optimized, and its modified version ("new RDTool") was considered together with several consensus mapping strategies. The Condensed Graph of Reaction approach was used to calculate chemical distances and to develop the "AAM fixer" algorithm for automated correction of erroneous mappings. The benchmarking calculations were performed on a Golden dataset containing 1851 manually mapped and curated reactions. The best-performing program, RXNMapper, was applied together with the AAM fixer to map the USPTO database. The Golden dataset, the mapped USPTO database, and the optimized RDTool are available in the GitHub repository https://github.com/Laboratoire-de-Chemoinformatique.


Subject(s)
Benchmarking , Biochemical Phenomena , Algorithms , Databases, Factual
7.
J Org Chem ; 86(23): 17344-17361, 2021 12 03.
Article in English | MEDLINE | ID: mdl-34748342

ABSTRACT

Cyclopropane fusion of the only rotatable carbon-carbon bond in furanosyl nucleosides (i.e., exocyclic 4'-5') is a powerful design strategy to arrive at conformationally constrained analogues. Herein, we report a direct stereodivergent route toward the synthesis of the four possible configurations of 4-spirocyclopropane furanoses, which have been transformed into the corresponding 4'-spirocyclic adenosine analogues. The latter showed differential inhibition of the protein methyltransferase PRMT5-MEP50 complex, with one analogue inhibiting more effectively than adenosine itself, demonstrating the utility of rationally probing 4'-5' side chain orientations.


Subject(s)
Adenosine , Nucleosides , Catalysis
8.
Org Lett ; 23(22): 8828-8833, 2021 11 19.
Article in English | MEDLINE | ID: mdl-34730365

ABSTRACT

Novel C-4',C-5' cyclobutane-fused spirocyclic ribonucleoside analogues were prepared. Thermal [2 + 2] cycloaddition between dichloroketene and readily derived 4'-exo-methylene furanoses afforded a first entry to the required constrained ribofuranoses, relying on a carbonyl transposition sequence. Alternatively, an unusual stereoselective ionic [2 + 2] cycloaddition using methyl propiolate promoted by methylaluminoxane gave a complementary, more direct approach to such ribofuranoses. Further conversion to the constrained adenosine analogues revealed promising structure-dependent inhibition of the protein methyltransferase PRMT5:MEP50 complex in the (sub)micromolar range.


Subject(s)
Adenosine
9.
Mol Inform ; 40(12): e2100119, 2021 12.
Article in English | MEDLINE | ID: mdl-34427989

ABSTRACT

The quality of experimental data for chemical reactions is a critical consideration for any reaction-driven study. However, the curation of reaction data has not been extensively discussed in the literature so far. Here, we suggest a four-step protocol that covers the curation of individual structures (reactants and products), chemical transformations, reaction conditions, and endpoints. Its implementation in Python3 using the CGRTools toolkit has been used to clean three popular reaction databases: Reaxys, USPTO, and Pistachio. The curated USPTO database is available in the GitHub repository (Laboratoire-de-Chemoinformatique/Reaction_Data_Cleaning).


Subject(s)
Data Curation , Databases, Factual , Reference Standards
10.
J Org Chem ; 85(23): 14989-15005, 2020 12 04.
Article in English | MEDLINE | ID: mdl-33196210

ABSTRACT

A novel class of substituted spiro[3.4]octanes can be accessed via a [2 + 2]-cycloaddition of dichloroketene on a readily prepared exo-methylene cyclopentane building block. This reaction sequence was found to be robust on a multigram scale and afforded a central spirocyclobutanone scaffold for carbocyclic nucleosides. The reactivity of this constrained building block was evaluated and compared to the corresponding 4'-spirocyclic furanose analogues. Density functional theory calculations were performed to support the observed selectivity in the carbonyl reduction of spirocyclobutanone building blocks. Starting from novel spirocyclic intermediates, we exemplified the preparation of an undescribed class of carbocyclic nucleoside analogues and provided a proof of concept for application as inhibitors for the protein methyltransferase target PRMT5.


Subject(s)
Cyclopentanes , Nucleosides , Cycloaddition Reaction
11.
Chemistry ; 25(67): 15419-15423, 2019 Dec 02.
Article in English | MEDLINE | ID: mdl-31609050

ABSTRACT

Despite the large variety of modified nucleosides that have been reported, the preparation of constrained 4'-spirocyclic adenosine analogues has received very little attention. We discovered that the [2+2]-cycloaddition of dichloroketene on readily available 4'-exo-methylene furanose sugars efficiently results in the diastereoselective formation of novel 4'-spirocyclobutanones. The reaction mechanism was investigated via density functional theory (DFT) and found to proceed via either a non-synchronous or a stepwise reaction sequence, controlled by the stereochemistry at the 3'-position of the sugar substrate. The obtained dichlorocyclobutanones were converted into nucleoside analogues, providing access to a novel class of chiral 4'-spirocyclobutyl adenosine mimetics in eight steps from commercially available sugars. Assessment of the biological activity of the designed 4'-spirocyclic adenosine analogues identified potent inhibitors of the protein methyltransferase target PRMT5.


Subject(s)
Adenosine/chemistry , Nucleosides/analogs & derivatives , Nucleosides/chemical synthesis , Carbohydrates/chemistry , Cycloaddition Reaction , Density Functional Theory , Dichloroethylenes/chemistry , Glycosylation , Metals/chemistry , Molecular Structure , Oxidation-Reduction , Stereoisomerism , Thermodynamics