1.
J Chem Inf Model ; 64(1): 9-17, 2024 Jan 08.
Article in English | MEDLINE | ID: mdl-38147829

ABSTRACT

Deep learning has become a powerful and frequently employed tool for the prediction of molecular properties, thus creating a need for open-source and versatile software solutions that can be operated by nonexperts. Among the current approaches, directed message-passing neural networks (D-MPNNs) have proven to perform well on a variety of property prediction tasks. The software package Chemprop implements the D-MPNN architecture and offers simple, easy, and fast access to machine-learned molecular properties. Compared to its initial version, we present a multitude of new Chemprop functionalities such as the support of multimolecule properties, reactions, atom/bond-level properties, and spectra. Further, we incorporate various uncertainty quantification and calibration methods along with related metrics as well as pretraining and transfer learning workflows, improved hyperparameter optimization, and other customization options concerning loss functions or atom/bond features. We benchmark D-MPNN models trained using Chemprop with the new reaction, atom-level, and spectra functionality on a variety of property prediction data sets, including MoleculeNet and SAMPL, and observe state-of-the-art performance on the prediction of water-octanol partition coefficients, reaction barrier heights, atomic partial charges, and absorption spectra. Chemprop enables out-of-the-box training of D-MPNN models for a variety of problem settings in fast, user-friendly, and open-source software.


Subjects
Machine Learning; Software; Neural Networks, Computer; Chemical Phenomena; Water
2.
Science ; 382(6677): eadi1407, 2023 Dec 22.
Article in English | MEDLINE | ID: mdl-38127734

ABSTRACT

A closed-loop, autonomous molecular discovery platform driven by integrated machine learning tools was developed to accelerate the design of molecules with desired properties. We demonstrated two case studies on dye-like molecules, targeting absorption wavelength, lipophilicity, and photooxidative stability. In the first study, the platform experimentally realized 294 unreported molecules across three automatic iterations of molecular design-make-test-analyze cycles while exploring the structure-function space of four rarely reported scaffolds. In each iteration, the property prediction models that guided exploration learned the structure-property space of diverse scaffold derivatives, which were realized with multistep syntheses and a variety of reactions. The second study exploited property models trained on the explored chemical space and previously reported molecules to discover nine top-performing molecules within a lightly explored structure-property space.

3.
J Chem Inf Model ; 63(13): 4012-4029, 2023 Jul 10.
Article in English | MEDLINE | ID: mdl-37338239

ABSTRACT

Characterizing uncertainty in machine learning models has recently gained interest in the context of machine learning reliability, robustness, safety, and active learning. Here, we separate the total uncertainty into contributions from noise in the data (aleatoric) and shortcomings of the model (epistemic), further dividing epistemic uncertainty into model bias and variance contributions. We systematically address the influence of noise, model bias, and model variance in the context of chemical property predictions, where the diverse nature of target properties and the vast chemical space give rise to many distinct sources of prediction error. We demonstrate that different sources of error can each be significant in different contexts and must be individually addressed during model development. Through controlled experiments on data sets of molecular properties, we show important trends in model performance associated with the level of noise in the data set, size of the data set, model architecture, molecule representation, ensemble size, and data set splitting. In particular, we show that 1) noise in the test set can limit a model's observed performance when the actual performance is much better, 2) using size-extensive model aggregation structures is crucial for extensive property prediction, and 3) ensembling is a reliable tool for uncertainty quantification and improvement specifically for the contribution of model variance. We develop general guidelines on how to improve an underperforming model when falling into different uncertainty contexts.


Subjects
Machine Learning; Uncertainty; Reproducibility of Results
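
The split between data noise and model variance described in the abstract above can be illustrated with a small sketch. It assumes the common ensemble-based decomposition from the law of total variance: the average of the noise variances predicted by the ensemble members estimates the aleatoric part, and the variance of the members' mean predictions estimates the model-variance part. This is an illustrative assumption, not necessarily the procedure used in the paper; the function name and the toy numbers are hypothetical, and model bias is not captured by this decomposition.

```python
from statistics import fmean, pvariance


def decompose_ensemble_uncertainty(means, variances):
    """Split the predictive variance for one input across an ensemble.

    means:     mean prediction of each ensemble member
    variances: noise variance predicted by each ensemble member
    Returns (aleatoric, model_variance, total).
    """
    if not means or len(means) != len(variances):
        raise ValueError("need one mean and one variance per ensemble member")
    aleatoric = fmean(variances)       # average predicted data noise
    model_variance = pvariance(means)  # disagreement between members
    return aleatoric, model_variance, aleatoric + model_variance


# Hypothetical 4-member ensemble predicting one molecular property.
member_means = [1.0, 1.2, 0.8, 1.0]
member_variances = [0.04, 0.05, 0.03, 0.04]

aleatoric, model_var, total = decompose_ensemble_uncertainty(
    member_means, member_variances
)
print(f"aleatoric={aleatoric:.3f} model_variance={model_var:.3f} total={total:.3f}")
```

Adding more ensemble members can only shrink the error that comes from the second term, which is consistent with the abstract's point that ensembling addresses the model-variance contribution specifically.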
4.
Article in English | MEDLINE | ID: mdl-36166183

ABSTRACT

The Inadequate Boundaries Questionnaire (IBQ) was created as a multi-dimensional measure of boundary violations in parent-child relationships. Use of the IBQ has been increasing; however, its psychometric properties, including its proposed five-factor structure, have yet to be comprehensively evaluated. The current study examined the factor structure, reliability, mother-adolescent agreement, and convergent and discriminant validity of the IBQ-Parent and -Youth English versions among community and clinical adolescents and their mothers. Confirmatory factor analysis most strongly supported four factors: Guilt Induction-Psychological Control, Parentification, No Boundaries (Enmeshment), and Triangulation. The scales showed acceptable to excellent reliability. Mother-adolescent agreement was moderate in the healthy community sample and weaker in the clinical sample. Convergent and discriminant associations supported the validity of the Guilt Induction-Psychological Control scale, with a more complex picture emerging for other scales. Implications of these findings and directions for future research with the IBQ are discussed.

5.
J Phys Chem A ; 125(23): 4943-4956, 2021 Jun 17.
Article in English | MEDLINE | ID: mdl-34101445

ABSTRACT

Polyesters synthesized from 2,2,4,4-tetramethyl-1,3-cyclobutanediol (TMCD) and terephthalic acid (TPA) are improved alternatives to toxic polycarbonates based on bisphenol A. In this work, we use ωB97X-D/LANL2DZdp calculations, in the presence of a benzaldehyde polarizable continuum model solvent, to show that esterification of TMCD and TPA will reduce and subsequently dehydrate a dimethyl tin oxide catalyst, becoming ligands on the now four-coordinate complex. This reaction then proceeds most plausibly by an intramolecular acyl-transfer mechanism from the tin complex, aided by a coordinated proton donor such as hydronium. These findings are a key first step in understanding polyester synthesis and avoiding undesirable side reactions during production.

6.
J Chem Inf Model ; 61(6): 2594-2609, 2021 Jun 28.
Article in English | MEDLINE | ID: mdl-34048221

ABSTRACT

Infrared (IR) spectroscopy remains an important tool for chemical characterization and identification. Chemprop-IR has been developed as a software package for the prediction of IR spectra through the use of machine learning. This work serves the dual purpose of providing a trained general-purpose model for the prediction of IR spectra with ease and providing the Chemprop-IR software framework for the training of new models. In Chemprop-IR, molecules are encoded using a directed message passing neural network, allowing for molecule latent representations to be learned and optimized for the task of spectral predictions. Model training incorporates spectra metrics and normalization techniques that offer better performance with spectral predictions than standard practice in regression models. The model makes use of pretraining using quantum chemistry calculations and ensembling of multiple submodels to improve generalizability and performance. The spectral predictions that result are of high quality, showing capability to capture the extreme diversity of spectral forms over chemical space and represent complex peak structures.


Subjects
Machine Learning; Neural Networks, Computer; Software
7.
J Phys Chem A ; 123(1): 120-131, 2019 Jan 10.
Article in English | MEDLINE | ID: mdl-30484643

ABSTRACT

Quantum-chemical calculations show how low barriers to anomerization and shifting equilibria cause a significant presence of different monosaccharide isomers in high-temperature processes such as pyrolysis. The transition between isomeric forms of monosaccharides is long-studied, but examination has typically been limited to the solution phase and to pyranose isomers. Processes and rates of anomerization by reversible, gas-phase ring-opening and -closing reactions were predicted for the monosaccharides d-glucose, d-mannose, d-galactose, d-xylose, l-arabinose, and d-glucuronic acid. Structures and thermochemistry were computed for stable species and pericyclic transition states using CBS-QB3, and high-pressure-limit Arrhenius reaction parameters were predicted and fitted from 300 to 1000 K. Activation energies for the ring-opening reactions were 162-217 kJ/mol for four-center pericyclic separation of the lactol group but were reduced by catalytic participation of a hydroxyl group within the monosaccharide or an external R-OH group represented by an explicit water molecule, reaching activation energies as low as 97 and 67 kJ/mol, respectively. Equilibrium constants implied increasing fractions of furanose and linear aldehyde anomers with increasing temperature.
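
To make the effect of the reported activation energies concrete, the sketch below fits Arrhenius parameters over 300 to 1000 K and compares rate constants for the barriers quoted in the abstract above (162, 97, and 67 kJ/mol). It assumes the simple two-parameter form k = A exp(-Ea/(RT)); the paper may have used a different functional form. The pre-exponential factor, the synthetic data, and the function names are hypothetical and chosen only for illustration.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol K)


def arrhenius_rate(a_factor, ea_j_per_mol, temperature_k):
    """Rate constant from the two-parameter Arrhenius form."""
    return a_factor * math.exp(-ea_j_per_mol / (R * temperature_k))


def fit_arrhenius(temperatures_k, rate_constants):
    """Least-squares line through ln k versus 1/T; returns (A, Ea in J/mol)."""
    xs = [1.0 / t for t in temperatures_k]
    ys = [math.log(k) for k in rate_constants]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx                  # slope equals -Ea / R
    intercept = y_mean - slope * x_mean  # intercept equals ln A
    return math.exp(intercept), -slope * R


# Synthetic rate constants generated from an assumed A (hypothetical value)
# and the lowest uncatalyzed barrier quoted in the abstract (162 kJ/mol).
A_ASSUMED = 1.0e13
temps = list(range(300, 1001, 100))
ks = [arrhenius_rate(A_ASSUMED, 162e3, t) for t in temps]

a_fit, ea_fit = fit_arrhenius(temps, ks)
print(f"fitted Ea = {ea_fit / 1000:.1f} kJ/mol")

# Relative speed-up at 800 K for the catalyzed barriers, assuming the same A.
for ea in (97e3, 67e3):
    ratio = arrhenius_rate(1.0, ea, 800.0) / arrhenius_rate(1.0, 162e3, 800.0)
    print(f"Ea = {ea / 1000:.0f} kJ/mol: {ratio:.3g} times faster than 162 kJ/mol")
```

Because the fit is linear in 1/T, it recovers the input parameters exactly for noise-free synthetic data; with real computed rate constants the residuals would indicate whether a temperature-dependent pre-exponential term is needed.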

8.
Chem Commun (Camb) ; 46(28): 5136-8, 2010 Jul 28.
Article in English | MEDLINE | ID: mdl-20539881

ABSTRACT

Modular oxacyclophanes featuring m-terphenyl units scaffold inter-pi-system interaction in face-to-face stacked or orthogonal orientations, leading to distinct photophysical properties.
