ABSTRACT
Over the past decade, pharmaceutical companies have seen a decline in the number of drug candidates successfully passing through clinical trials, even though billions are still spent on drug development. Poor aqueous solubility leads to low bioavailability, reducing pharmaceutical effectiveness. Inefficient drug candidate testing carries a human cost of great medical concern: fewer drugs reach the production line, slowing the development of new treatments. In biochemistry and biophysics, water-mediated reactions and interactions within active sites and protein pockets are an active area of research, in which methods for modelling solvated systems are continually pushed to their limits. Here, we discuss a range of methods for solvent modelling and solubility prediction, aiming to inform the reader of the available options and outlining the advantages and disadvantages of each approach.
Subjects
Chemical Models, Solutions/chemistry, Thermodynamics, Solubility
ABSTRACT
In this work we predict several molecular properties of academic and industrial importance in order to answer two questions: 1) Can we apply efficient machine learning techniques, using inexpensive descriptors, to predict melting points to a reasonable level of accuracy? 2) Can values of this level of accuracy be usefully applied to predicting aqueous solubility? We present predictions of melting points made by several novel machine learning models, previously applied to solubility prediction. Additionally, we make predictions of solubility via the General Solubility Equation (GSE) and monitor the impact of varying the logP prediction model (AlogP and XlogP) on the GSE. We note that the machine learning models presented, using a modest number of 2D descriptors, can make melting point predictions in line with the current state-of-the-art prediction methods (RMSE ≥ 40 °C). We also find that predicted melting points, with an RMSE of tens of degrees Celsius, can be usefully applied to the GSE to yield accurate solubility predictions (log10 S RMSE < 1) over a small dataset of drug-like molecules.
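To illustrate why melting points with errors of tens of degrees can still be useful, consider the commonly cited Jain-Yalkowsky form of the GSE, log10 S = 0.5 - 0.01 (MP - 25) - logP, with MP in °C and the melting term set to zero for compounds that melt at or below 25 °C. The sketch below assumes this form of the equation; the function name and the example values (logP, melting point, error size) are hypothetical and are not taken from the work described here.

```python
def gse_log_solubility(log_p: float, melting_point_c: float) -> float:
    """Estimate log10 of molar aqueous solubility with the GSE.

    Assumes the Jain-Yalkowsky form:
        log10 S = 0.5 - 0.01 * (MP - 25) - logP
    where MP is the melting point in degrees Celsius. The melting
    term is taken as zero for compounds melting at or below 25 °C.
    """
    melting_term = max(melting_point_c - 25.0, 0.0)
    return 0.5 - 0.01 * melting_term - log_p


# Hypothetical compound, chosen only for illustration.
log_p = 2.0        # octanol-water partition coefficient (e.g. from AlogP or XlogP)
mp_true = 150.0    # "true" melting point in °C
mp_error = 40.0    # a melting point error of 40 °C

log_s_true = gse_log_solubility(log_p, mp_true)
log_s_shifted = gse_log_solubility(log_p, mp_true + mp_error)

# The shift in log10 S caused by the melting point error alone.
log_s_shift = abs(log_s_shifted - log_s_true)

print(f"log10 S with true MP:    {log_s_true:.2f}")
print(f"log10 S with shifted MP: {log_s_shifted:.2f}")
print(f"shift in log10 S:        {log_s_shift:.2f}")
```

Because the melting point enters with a coefficient of 0.01, a 40 °C error shifts log10 S by only about 0.4 log units in this form of the equation, whereas an error in logP carries through one-to-one.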