Search | VHL Regional Portal

Advances in machine learning with chemical language models in molecular property and reaction outcome predictions.

Das, Manajit; Ghosh, Ankit; Sunoj, Raghavan B.

J Comput Chem ; 45(14): 1160-1176, 2024 May 30.

Article in English | MEDLINE | ID: mdl-38299229

ABSTRACT

Molecular properties and reactions form the foundation of chemical space. Over the years, innumerable molecules have been synthesized; a smaller fraction of them found immediate applications, while a larger proportion served as a testimony to the creative and empirical nature of chemical science. With increasing emphasis on sustainable practices, it is desirable that a target set of molecules is synthesized, preferably through fewer empirical attempts rather than a larger library, to realize an active candidate. On this front, predictive endeavors using machine learning (ML) models built on available data acquire timely significance. Prediction of molecular properties and reaction outcomes remains one of the burgeoning applications of ML in chemical science. Among several methods of encoding molecular samples for ML models, the ones that employ language-like representations are gaining steady popularity. Such representations additionally help adopt well-developed natural language processing (NLP) models for chemical applications. Given this advantageous background, herein we describe several successful chemical applications of NLP, focusing on molecular property and reaction outcome predictions. From relatively simple recurrent neural networks (RNNs) to complex models like transformers, different network architectures have been leveraged for tasks such as de novo drug design, catalyst generation, and forward and retro-synthesis predictions. The chemical language model (CLM) provides promising avenues toward a broad range of applications in a time- and cost-effective manner. While we showcase an optimistic outlook for CLMs, attention is also given to the persisting challenges in the reaction domain, which would hopefully be addressed by advanced algorithms tailored to chemical language and by the increased availability of high-quality datasets.
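
As a rough illustration of what a "language-like representation" can look like in practice, the sketch below splits SMILES strings into tokens and maps them to integer IDs, the kind of input an RNN or transformer would consume. The abstract does not name SMILES or any tokenizer; the choice of SMILES, the simplified regular expression, and the helper names `tokenize`, `build_vocab`, and `encode` are assumptions made for this example, not the authors' method.

```python
import re

# Simplified SMILES token pattern. This is an illustrative assumption,
# not a complete SMILES grammar.
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]"            # bracket atoms such as [NH4+] or [C@@H]
    r"|Br|Cl"                # two-letter atoms, tried before one-letter ones
    r"|%\d{2}"               # two-digit ring-closure labels such as %10
    r"|[BCNOSPFI]"           # one-letter aliphatic atoms
    r"|[bcnosp]"             # aromatic atoms
    r"|[()=#\-+\\/:.~@*>]"   # branches, bonds, dots, and the reaction arrow
    r"|\d"                   # single-digit ring closures
)


def tokenize(smiles):
    """Split a SMILES string into tokens; fail loudly on unknown characters."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"could not tokenize: {smiles!r}")
    return tokens


def build_vocab(smiles_list):
    """Assign an integer ID to every token seen in the given strings."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for smiles in smiles_list:
        for token in tokenize(smiles):
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(smiles, vocab):
    """Map a SMILES string to token IDs, using <unk> for unseen tokens."""
    return [vocab.get(token, vocab["<unk>"]) for token in tokenize(smiles)]
```

For example, `tokenize("CC(=O)Cl")` keeps `Cl` as a single token instead of splitting it into `C` and `l`, which is the main reason a chemistry-aware tokenizer is preferred over plain character splitting.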

Machine learning studies on asymmetric relay Heck reaction-Potential avenues for reaction development.

Das, Manajit; Sharma, Pooja; Sunoj, Raghavan B.

J Chem Phys ; 156(11): 114303, 2022 Mar 21.

Article in English | MEDLINE | ID: mdl-35317601

ABSTRACT

The integration of machine learning (ML) methods into chemical catalysis is evolving as a new paradigm for cost- and time-economical reaction development. Although there have been several successful applications of ML in catalysis, the prediction of enantioselectivity (ee) remains challenging. Herein, we describe an ML workflow to predict the ee of an important class of catalytic asymmetric transformations, namely, the relay Heck (RH) reaction. A random forest ML model, built using quantum chemically derived, mechanistically relevant physical organic descriptors as features, is found to predict the ee remarkably well, with a low root mean square error of 8.0 ± 1.3. Importantly, the model is effective in predicting the unseen variants of an asymmetric RH reaction. Furthermore, we predicted the ee for thousands of unexplored complementary reactions, including those leading to a good number of bioactive frameworks, by engaging different combinations of catalysts and substrates drawn from the original dataset. Our ML model, developed on the available examples, would be able to assist in exploiting the fuller potential of asymmetric RH reactions through a priori predictions before the actual experimentation, which would thus help surpass the trial-and-error loop to a larger degree.
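
The sketch below shows one common way an error figure of the form "RMSE = mean ± spread" can be obtained: the data are split at random several times, a model is fitted on each training portion, and the test RMSE values are summarized by their mean and standard deviation. The abstract does not state how the ± value was computed, so the repeated-split protocol is an assumption. The function names are hypothetical, and `mean_baseline` is only a stand-in predictor that keeps the example dependency-free; a random forest regressor (for instance scikit-learn's `RandomForestRegressor`) could be plugged in through the same `fit_predict` interface.

```python
import math
import random
import statistics


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def repeated_split_rmse(X, y, fit_predict, n_runs=10, test_fraction=0.2, seed=0):
    """Mean and standard deviation of the test RMSE over random splits.

    fit_predict(X_train, y_train, X_test) must return predictions for X_test.
    Requires n_runs >= 2 so that a standard deviation can be computed.
    """
    rng = random.Random(seed)
    indices = list(range(len(y)))
    n_test = max(1, round(test_fraction * len(y)))
    scores = []
    for _ in range(n_runs):
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        predictions = fit_predict(
            [X[i] for i in train_idx],
            [y[i] for i in train_idx],
            [X[i] for i in test_idx],
        )
        scores.append(rmse([y[i] for i in test_idx], predictions))
    return statistics.mean(scores), statistics.stdev(scores)


def mean_baseline(X_train, y_train, X_test):
    """Stand-in model (assumption): always predict the training-set mean."""
    mean_value = statistics.mean(y_train)
    return [mean_value] * len(X_test)
```

A call such as `repeated_split_rmse(X, y, mean_baseline)` returns a `(mean, std)` pair; comparing a real model against this baseline on the same splits indicates how much of the reported accuracy comes from the descriptors rather than from the spread of the ee values alone.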

Molecular Insights on Solvent Effects in Organic Reactions as Obtained through Computational Chemistry Tools.

Das, Manajit; Gogoi, Achyut Ranjan; Sunoj, Raghavan B.

J Org Chem ; 87(3): 1630-1640, 2022 Feb 04.

Article in English | MEDLINE | ID: mdl-34752092

ABSTRACT

Molecular understanding of the role of protic solvents in a gamut of organic transformations can be developed using density functional and ab initio computational studies focused on the reaction mechanism. Inclusion of explicit solvent molecules in the key transition states (TSs) has proven valuable for improving the energetic estimates of organocatalytic as well as transition-metal-catalyzed organic reactions. Herein, we provide an overview of the importance of an explicit-implicit solvation model using a number of interesting examples.
