Results 1 - 20 of 35
1.
ACS Nano ; 18(22): 14514-14522, 2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38776469

ABSTRACT

Ligands play a critical role in the optical properties and chemical stability of colloidal nanocrystals (NCs), but identifying ligands that can enhance NC properties is daunting, given the high dimensionality of chemical space. Here, we use machine learning (ML) and robotic screening to accelerate the discovery of ligands that enhance the photoluminescence quantum yield (PLQY) of CsPbBr3 perovskite NCs. We developed an ML model designed to predict the relative PL enhancement of perovskite NCs when coordinated with a ligand selected from a pool of 29,904 candidate molecules. Ligand candidates were selected using an active learning (AL) approach that accounted for uncertainty quantified by twin regressors. After eight experimental iterations of batch AL (corresponding to 21 initial and 72 model-recommended ligands), the uncertainty of the model decreased, demonstrating an increased confidence in the model predictions. Feature importance and counterfactual analyses of model predictions illustrate the potential use of ligand field strength in designing PL-enhancing ligands. Our versatile AL framework can be readily adapted to screen the effect of ligands on a wide range of colloidal nanomaterials.

2.
Digit Discov ; 3(1): 23-33, 2024 Jan 17.
Article in English | MEDLINE | ID: mdl-38239898

ABSTRACT

In light of the pressing need for practical materials and molecular solutions to renewable energy and health problems, to name just two examples, one wonders how to accelerate research and development in the chemical sciences, so as to address the time it takes to bring materials from initial discovery to commercialization. Artificial intelligence (AI)-based techniques, in particular, are having a transformative and accelerating impact on many, if not most, technological domains. To shed light on these questions, the authors and participants gathered in person for the ASLLA Symposium on the theme of 'Accelerated Chemical Science with AI' at Gangneung, Republic of Korea. We present the findings, ideas, comments, and often contentious opinions expressed during four panel discussions related to the respective general topics: 'Data', 'New applications', 'Machine learning algorithms', and 'Education'. All discussions were recorded, transcribed into text using OpenAI's Whisper, and summarized using LG AI Research's EXAONE LLM, followed by revision by all authors. For the broader benefit of current researchers, educators in higher education, and academic bodies such as associations, publishers, librarians, and companies, we provide chemistry-specific recommendations and summarize the resulting conclusions.

3.
J Phys Chem B ; 127(37): 7964-7973, 2023 Sep 21.
Article in English | MEDLINE | ID: mdl-37682958

ABSTRACT

Aqueous two-phase systems (ATPSs) may form upon mixing two solutions of independently water-soluble compounds. Many separation, purification, and extraction processes rely on ATPSs. Predicting the miscibility of solutions can accelerate and reduce the cost of the discovery of new ATPSs for these applications. Whereas previous machine learning approaches to ATPS prediction used physicochemical properties of each solute as a descriptor, in this work, we show how to impute missing miscibility outcomes directly from an incomplete collection of pairwise miscibility experiments. We use graph-regularized logistic matrix factorization (GR-LMF) to learn a latent vector of each solution from (i) the observed entries in the pairwise miscibility matrix and (ii) a graph where each node is a solution and edges are relationships indicating the general category of the solute (i.e., polymer, surfactant, salt, protein). For an experimental data set of the pairwise miscibility of 68 solutions from Peacock et al. [ACS Appl. Mater. Interfaces 2021, 13, 11449-11460], we find that GR-LMF more accurately predicts missing (im)miscibility outcomes of pairs of solutions than ordinary logistic matrix factorization and random forest classifiers that use physicochemical features of the solutes. GR-LMF obviates the need for features of the solutes and solutions to impute missing miscibility outcomes, but it cannot predict the miscibility of a new solution without some observations of its miscibility with other solutions in the training data set.
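
The abstract above does not give the GR-LMF objective, so the following is only a generic sketch of what a graph-regularized logistic matrix factorization loss often looks like, not the authors' exact formulation. The symbols are assumptions introduced for illustration: u_i is the latent vector of solution i, y_ij is the observed outcome for a pair (1 or 0), Ω is the set of observed pairs, E is the edge set of the solute-category graph, and λ and γ are regularization weights.

```latex
\begin{aligned}
p_{ij} &= \sigma\!\left(\mathbf{u}_i^{\top}\mathbf{u}_j\right),
\qquad \sigma(z) = \frac{1}{1 + e^{-z}},\\
\mathcal{L} &= -\sum_{(i,j)\in\Omega}\Big[\,y_{ij}\log p_{ij} + (1-y_{ij})\log\left(1-p_{ij}\right)\Big]
 + \lambda \sum_{i}\lVert\mathbf{u}_i\rVert^{2}
 + \gamma \sum_{(i,k)\in E}\lVert\mathbf{u}_i-\mathbf{u}_k\rVert^{2}.
\end{aligned}
```

In a loss of this shape, the last term pulls the latent vectors of solutions in the same category toward each other. A solution with no observed pairs contributes nothing to the first sum, which is consistent with the limitation stated in the abstract: some observations are needed before its miscibility can be predicted.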

4.
J Am Chem Soc ; 145(40): 21699-21716, 2023 Oct 11.
Article in English | MEDLINE | ID: mdl-37754929

ABSTRACT

Exceptional molecules and materials with one or more extraordinary properties are both technologically valuable and fundamentally interesting, because they often involve new physical phenomena or new compositions that defy expectations. Historically, exceptionality has been achieved through serendipity, but recently, machine learning (ML) and automated experimentation have been widely proposed to accelerate target identification and synthesis planning. In this Perspective, we argue that the data-driven methods commonly used today are well-suited for optimization but not for the realization of new exceptional materials or molecules. Finding such outliers should be possible using ML, but only by shifting away from using traditional ML approaches that tweak the composition, crystal structure, or reaction pathway. We highlight case studies of high-Tc oxide superconductors and superhard materials to demonstrate the challenges of ML-guided discovery and discuss the limitations of automation for this task. We then provide six recommendations for the development of ML methods capable of exceptional materials discovery: (i) Avoid the tyranny of the middle and focus on extrema; (ii) When data are limited, qualitative predictions that provide direction are more valuable than interpolative accuracy; (iii) Sample what can be made and how to make it and defer optimization; (iv) Create room (and look) for the unexpected while pursuing your goal; (v) Try to fill-in-the-blanks of input and output space; (vi) Do not confuse human understanding with model interpretability. We conclude with a description of how these recommendations can be integrated into automated discovery workflows, which should enable the discovery of exceptional molecules and materials.

5.
Acta Crystallogr C Struct Chem ; 79(Pt 1): 12-17, 2023 Jan 01.
Article in English | MEDLINE | ID: mdl-36602016

ABSTRACT

The title compound, [Al4(CH3)8(C2H7N)2H2], crystallizes as eight-membered rings with -(CH3)2Al-(CH3)2N-(CH3)2Al- moieties connected by single hydride bridges. In the X-ray structure, the ring has a chair conformation, with the hydride H atoms being close to the plane through the four Al atoms. An optimized structure was also calculated by all-electron density functional theory (DFT) methods, which agrees with the X-ray structure but gives a somewhat different geometry for the hydride bridge. Charges on the individual atoms were determined by valence shell occupancy refinements using MoPro and also by DFT calculations analyzed by several different methods. All methods agree in assigning a positive charge to the Al atoms, negative charges to the C, N, and hydride H atoms, and small positive charges to the methyl H atoms.

6.
HardwareX ; 12: e00319, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35677813

ABSTRACT

The Sidekick is a desktop liquid dispenser, compatible with standard SBS microplates and designed for accessible laboratory automation. It features an armature-based motion system and a fully 3D-printed chassis to reduce overall mechanical complexity and accommodate user modification. Liquid dispensing is achieved with four commercially available solenoid-driven positive displacement pumps that deliver liquid in 10 µL increments. A Raspberry Pi Pico RP2040 processor programmed in MicroPython is used for control and exposes a USB serial interface for users to submit commands using either a simple vocabulary of commands or a subset of G-Code. At a total cost of $710 USD, the Sidekick offers laboratories an easy-to-build, easily maintained, open-source liquid dispensing system for both research and pedagogical introductions to lab automation.

7.
J Chem Phys ; 156(6): 064108, 2022 Feb 14.
Article in English | MEDLINE | ID: mdl-35168359

ABSTRACT

Autonomous experimentation systems use algorithms and data from prior experiments to select and perform new experiments in order to meet a specified objective. In most experimental chemistry situations, there is a limited set of prior historical data available, and acquiring new data may be expensive and time consuming, which places constraints on machine learning methods. Active learning methods prioritize new experiment selection by using machine learning model uncertainty and predicted outcomes. Meta-learning methods attempt to construct models that can learn quickly with a limited set of data for a new task. In this paper, we applied the model-agnostic meta-learning (MAML) model and the Probabilistic LATent model for Incorporating Priors and Uncertainty in few-Shot learning (PLATIPUS) approach, which extends MAML to active learning, to the problem of halide perovskite growth by inverse temperature crystallization. Using a dataset of 1870 reactions conducted using 19 different organoammonium lead iodide systems, we determined the optimal strategies for incorporating historical data into active and meta-learning models to predict reaction compositions that result in crystals. We then evaluated the best three algorithms (PLATIPUS and active-learning k-nearest neighbor and decision tree algorithms) with four new chemical systems in experimental laboratory tests. With a fixed budget of 20 experiments, PLATIPUS makes superior predictions of reaction outcomes compared to other active-learning algorithms and a random baseline.

8.
Nat Rev Chem ; 6(5): 357-370, 2022 May.
Article in English | MEDLINE | ID: mdl-37117931

ABSTRACT

The physical sciences community is increasingly taking advantage of the possibilities offered by modern data science to solve problems in experimental chemistry and potentially to change the way we design, conduct and understand results from experiments. Successfully exploiting these opportunities involves considerable challenges. In this Expert Recommendation, we focus on experimental co-design and its importance to experimental chemistry. We provide examples of how data science is changing the way we conduct experiments, and we outline opportunities for further integration of data science and experimental chemistry to advance these fields. Our recommendations include establishing stronger links between chemists and data scientists; developing chemistry-specific data science methods; integrating algorithms, software and hardware to 'co-design' chemistry experiments from inception; and combining diverse and disparate data sources into a data network for chemistry research.

9.
ACS Appl Mater Interfaces ; 13(50): 59892-59903, 2021 Dec 22.
Article in English | MEDLINE | ID: mdl-34890203

ABSTRACT

The electrochemical oxidation of small organic molecules (SOMs) such as methanol and glucose is a critical process and has relevant applications in fuel cells and sensors. A key challenge in SOM oxidation is the poisoning of the surface by carbon monoxide (CO) and other partially oxidized intermediates, which is attributed to the presence of Pt-Pt pair sites. A promising pathway for overcoming this challenge is to develop catalysts that selectively oxidize SOMs via "direct" pathways that do not form CO as a primary intermediate. In this report, we utilize an ambient, template-based approach to prepare PtAu alloy nanowires with tunable compositions. X-ray photoelectron spectroscopy measurements reveal that the surface composition matches that of the bulk composition after synthesis. Monte Carlo method simulations of the surface structure of PtAu alloys with varying coverage of oxygen adsorbates and varying degrees of oxygen adsorption strength reveal that oxygen adsorption under electrochemical conditions enriches the surface with Pt and a large fraction of Pt-Pt sites remain on the surface even with an Au content of up to 50%. Electrochemical properties and the catalytic performance measurements of the PtAu nanowires for the oxidation of methanol and glucose reveal that the mechanistic pathways that produce CO are suppressed by the addition of relatively small quantities of Au (∼10%), and CO formation can be completely suppressed by 50% Au. The suppression of CO formation with small quantities of Au suggests that the presence of Pt-Au pair sites may be more important in determining the mechanism of SOM oxidation than the Pt-Pt pair site density.

10.
Environ Sci Technol ; 55(19): 12741-12754, 2021 Oct 05.
Article in English | MEDLINE | ID: mdl-34403250

ABSTRACT

The rapid increase in both the quantity and complexity of data that are being generated daily in the field of environmental science and engineering (ESE) demands accompanying advances in data analytics. Advanced data analysis approaches, such as machine learning (ML), have become indispensable tools for revealing hidden patterns or deducing correlations for which conventional analytical methods face limitations or challenges. However, ML concepts and practices have not been widely utilized by researchers in ESE. This feature explores the potential of ML to revolutionize data analysis and modeling in the ESE field, and covers the essential knowledge needed for such applications. First, we use five examples to illustrate how ML addresses complex ESE problems. We then summarize four major types of applications of ML in ESE: making predictions; extracting feature importance; detecting anomalies; and discovering new materials or chemicals. Next, we introduce the essential knowledge required and current shortcomings in ML applications in ESE, with a focus on three important but often overlooked components when applying ML: correct model development, proper model interpretation, and sound applicability analysis. Finally, we discuss challenges and future opportunities in the application of ML tools in ESE to highlight the potential of ML in this field.


Subject(s)
Environmental Science , Machine Learning
11.
J Chem Phys ; 154(18): 184708, 2021 May 14.
Article in English | MEDLINE | ID: mdl-34241022

ABSTRACT

Amine-templated metal oxides are a class of hybrid organic-inorganic compounds with great structural diversity; by varying the compositions, 0D, 1D, 2D, and 3D inorganic dimensionalities can be achieved. In this work, we created a dataset of 3725 amine-templated metal oxides (including some metalloid oxides), their composition, amine identity, and dimensionality, extracted from the Cambridge Structure Database (CSD), which spans 71 elements, 25 main group building units, and 349 amines. We characterize the diversity of this dataset over reactants and in time. Artificial neural network models trained on this dataset can predict the most and least probable outcome dimensionalities with 71% and 95% accuracies, respectively, using only information about reactant identities, without stoichiometric information. Surprisingly, the amine identity plays only a minor role in most cases, as omitting this information only reduces the accuracy by <2%. The generality of this model is demonstrated on a time held-out test set of 36 amine-templated lanthanide oxalates, vanadium tellurites, vanadium selenites, vanadates, molybdates, and molybdenum sulfates, whose syntheses and structural characterizations are reported here for the first time, and which contain two new element combinations and four amines that are not present in the CSD.

12.
J Chem Inf Model ; 61(4): 1593-1602, 2021 Apr 26.
Article in English | MEDLINE | ID: mdl-33797887

ABSTRACT

Combinatorial fusion analysis (CFA) is an approach for combining multiple scoring systems using the rank-score characteristic function and cognitive diversity measure. One example is to combine diverse machine learning models to achieve better prediction quality. In this work, we apply CFA to the synthesis of metal halide perovskites containing organic ammonium cations via inverse temperature crystallization. Using a data set generated by high-throughput experimentation, four individual models (support vector machines, random forests, weighted logistic classifier, and gradient boosted trees) were developed. We characterize each of these scoring systems and explore 66 possible combinations of the models. When measured by the precision on predicting crystal formation, the majority of the combination models improve the individual model results. The best combination models outperform the best individual models by 3.9 percentage points in precision. In addition to improving prediction quality, we demonstrate how the fusion models can be used to identify mislabeled input data and address issues of data quality. In particular, we identify example cases where all single models and all fusion models do not give the correct prediction. Experimental replication of these syntheses reveals that these compositions are sensitive to modest temperature variations across the different locations of the heating element that can hinder or enhance the crystallization process. In summary, we demonstrate that model fusion using CFA can not only identify a previously unconsidered influence on reaction outcome but also be used as a form of quality control for high-throughput experimentation.


Subject(s)
Machine Learning , Support Vector Machine , Calcium Compounds , Oxides , Titanium
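
The abstract for entry 12 names the rank-score characteristic function and the cognitive diversity measure without defining them. The sketch below uses the definitions commonly given in the CFA literature, which are assumed here rather than taken from the abstract: f(i) is the normalized score of the item at rank i, and cognitive diversity is the root-mean-square difference between two models' rank-score functions. All function names are hypothetical, and min-max normalization is an assumption.

```python
from math import sqrt


def normalize(scores):
    """Min-max scale raw scores to [0, 1] so different models are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def ranks(scores):
    """Rank 1 is the highest score; ties are broken by original order."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0] * len(scores)
    for position, idx in enumerate(order, start=1):
        result[idx] = position
    return result


def rank_score_function(scores):
    """Return f as a list: f[i - 1] is the normalized score at rank i."""
    return sorted(normalize(scores), reverse=True)


def cognitive_diversity(scores_a, scores_b):
    """RMS difference between the rank-score functions of two models."""
    fa = rank_score_function(scores_a)
    fb = rank_score_function(scores_b)
    return sqrt(sum((x - y) ** 2 for x, y in zip(fa, fb)) / len(fa))


def score_combination(*models):
    """Average the normalized scores of several models, item by item."""
    normalized = [normalize(m) for m in models]
    return [sum(col) / len(col) for col in zip(*normalized)]


def rank_combination(*models):
    """Average the ranks of several models, item by item (lower is better)."""
    ranked = [ranks(m) for m in models]
    return [sum(col) / len(col) for col in zip(*ranked)]
```

Note that the rank-score function depends only on how a model distributes its scores over ranks, not on which item holds each rank, so two models that order the items differently can still have zero cognitive diversity under this definition.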
13.
J Phys Chem B ; 125(12): 3057-3065, 2021 Apr 01.
Article in English | MEDLINE | ID: mdl-33739115

ABSTRACT

Predicting protein stability is a challenge due to the many competing thermodynamic effects. Through de novo protein design, one begins with a target structure and searches for a sequence that will fold into it. Previous work by Rocklin et al. introduced a data set of more than 16,000 miniproteins spanning four structural topologies with information on stability. These structures were characterized with a set of 46 structural descriptors, with no explicit inclusion of configurational entropy (Scnf). Our work focused on creating a set of 17 descriptors intended to capture variations in Scnf and its comparison to an extended set of 113 structural and energy model features that extend the Rocklin et al. feature set (R). The Scnf descriptors statistically discriminate between stable and unstable distributions within topologies and best describe EEHEE topology stability (where E = β sheet and H = α helix). Between 50 and 80% of the variation in each Scnf descriptor is described by linear combinations of R features. Despite containing useful information about minipeptide stability, providing Scnf features as inputs to machine learning models does not improve overall performance when predicting protein stability, as the R features sufficiently capture the implicit variations.


Subject(s)
Proteins , Entropy , Thermodynamics
14.
J Chem Inf Model ; 60(8): 3804-3811, 2020 Aug 24.
Article in English | MEDLINE | ID: mdl-32668151

ABSTRACT

Coulomb matrix eigenvalues (CMEs) are global 3D representations of molecular structure, which have been previously used to predict atomization energies, prioritize geometry searches, and interpret rotational spectra. The properties of the CME representation and its relationship to molecular structure are established using the Gershgorin circle theorem. Numerical bounds are studied using a data set of 309 000 conformational samples of all constitutional isomers of acyclic alkanes, CnH2n+2, from methane (n = 1) to undecane (n = 11), to establish the extent to which the CME preserves chemical intuitions about isomer and conformer similarity and its ability to distinguish constitutional isomers. Neither supervised nor unsupervised machine-learning algorithms can perfectly distinguish constitutional isomers as the molecular size increases, but the misclassification rate can be kept below 1%.


Subject(s)
Algorithms , Unsupervised Machine Learning , Isomerism , Molecular Conformation , Molecular Structure
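
Entry 14 relates Coulomb matrix eigenvalues to molecular structure through the Gershgorin circle theorem. As a minimal sketch of that idea, the code below builds a Coulomb matrix using the commonly used definition (diagonal 0.5 Z^2.4, off-diagonal Z_i Z_j / r_ij in atomic units), which is assumed here because the abstract does not restate it, and then computes the Gershgorin intervals that must contain every eigenvalue. Function names are hypothetical, and the paper's actual bounds may be tighter or differently derived.

```python
from math import dist


def coulomb_matrix(charges, coords):
    """Coulomb matrix: 0.5 * Z_i**2.4 on the diagonal, Z_i * Z_j / r_ij elsewhere.

    Coordinates are assumed to be Cartesian and in atomic units (bohr).
    """
    n = len(charges)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                matrix[i][j] = 0.5 * charges[i] ** 2.4
            else:
                matrix[i][j] = charges[i] * charges[j] / dist(coords[i], coords[j])
    return matrix


def gershgorin_intervals(matrix):
    """One interval per row: center = diagonal entry, radius = sum of |off-diagonals|.

    The Coulomb matrix is real and symmetric, so its eigenvalues are real and
    each lies in at least one of these intervals.
    """
    intervals = []
    for i, row in enumerate(matrix):
        radius = sum(abs(value) for j, value in enumerate(row) if j != i)
        intervals.append((row[i] - radius, row[i] + radius))
    return intervals
```

For a diatomic such as H2 the 2 x 2 matrix has eigenvalues equal to the diagonal entry plus or minus the off-diagonal entry, so they sit exactly on the ends of the Gershgorin interval; for larger molecules the intervals are generally looser.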
15.
J Phys Chem Lett ; 11(12): 4901-4910, 2020 Jun 18.
Article in English | MEDLINE | ID: mdl-32491860

ABSTRACT

The state-to-state intraband relaxation dynamics of charge carriers photogenerated within CdTe quantum wires (QWs) are characterized via transient absorption spectroscopy. Overlapping signals from the energetic-shifting of the quantum-confinement features and the occupancy of carriers in the states associated with these features are separated using the quantum-state renormalization model. Holes generated with an excitation energy of 2.75 eV reach the band edge within the instrument response of the measurement, ∼200 fs. This extremely short relaxation time is consistent with the low photoluminescence quantum yield of the QWs, ∼0.2%, and the presence of alternative relaxation pathways for the holes. The electrons relax through the different energetically available quantum-confinement states, likely via phonon coupling, with an overall rate of ∼0.6 eV ps⁻¹.

16.
J Am Chem Soc ; 142(16): 7555-7566, 2020 Apr 22.
Article in English | MEDLINE | ID: mdl-32233475

ABSTRACT

Racemates have recently received attention as nonlinear optical and piezoelectric materials. Here, a machine-learning-assisted composition space approach was applied to synthesize the missing M = Ti, Zr members of the Δ,Λ-[Cu(bpy)2(H2O)]2[MF6]2·3H2O (M = Ti, Zr, Hf; bpy = 2,2'-bipyridine) family (space group: Pna21). In each (CuO, MO2)/bpy/HF(aq) (M = Ti, Zr, Hf) system, the polar noncentrosymmetric racemate (M-NCS) forms in competition with a centrosymmetric one-dimensional chain compound (M-CS) based on alternating [Cu(bpy)(H2O)2]2+ and [MF6]2- basic building units (space groups: Ti-CS (Pnma), Zr-CS (P1̅), Hf-CS (P2/n)). Machine learning models were trained on reaction parameters to gain unbiased insight into the underlying statistical trends in each composition space. A human-interpretable decision tree shows that phase selection is driven primarily by the bpy:CuO molar ratio for reactions containing Zr or Hf, and predicts that formation of the Ti-NCS compound requires that the amount of HF present be decreased to raise the pH, which we verified experimentally. Predictive leave-one-metal-out (LOO) models further confirm that behavior in the Ti system is distinct from that of the Zr and Hf systems. The chemical origin of this distinction was probed via fluorine K-edge X-ray absorption spectroscopy. Pre-edge features in the F1s X-ray absorption spectra reveal the strong ligand-to-metal π bonding between Ti(3d - t2g) and F(2p) states that distinguishes the [TiF6]2- anion from the [ZrF6]2- and [HfF6]2- anions.

17.
Nature ; 573(7773): 251-255, 2019 Sep.
Article in English | MEDLINE | ID: mdl-31511682

ABSTRACT

Most chemical experiments are planned by human scientists and therefore are subject to a variety of human cognitive biases [1], heuristics [2] and social influences [3]. These anthropogenic chemical reaction data are widely used to train machine-learning models [4] that are used to predict organic [5] and inorganic [6,7] syntheses. However, it is known that societal biases are encoded in datasets and are perpetuated in machine-learning models [8]. Here we identify as-yet-unacknowledged anthropogenic biases in both the reagent choices and reaction conditions of chemical reaction datasets using a combination of data mining and experiments. We find that the amine choices in the reported crystal structures of hydrothermal synthesis of amine-templated metal oxides [9] follow a power-law distribution in which 17% of amine reactants occur in 79% of reported compounds, consistent with distributions in social influence models [10-12]. An analysis of unpublished historical laboratory notebook records shows similarly biased distributions of reaction condition choices. By performing 548 randomly generated experiments, we demonstrate that the popularity of reactants or the choices of reaction conditions are uncorrelated to the success of the reaction. We show that randomly generated experiments better illustrate the range of parameter choices that are compatible with crystal formation. Machine-learning models that we train on a smaller randomized reaction dataset outperform models trained on larger human-selected reaction datasets, demonstrating the importance of identifying and addressing anthropogenic biases in scientific data.


Subject(s)
Bias , Chemistry Techniques, Synthetic/statistics & numerical data , Laboratory Personnel/statistics & numerical data , Machine Learning , Humans , Laboratory Personnel/psychology
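
Entry 17 summarizes reagent popularity with a concentration statistic (17% of amines account for 79% of reported compounds). A minimal sketch of how such a statistic can be computed from a list with one reactant label per compound is shown below. The function name and input layout are hypothetical, and this is not the authors' analysis code.

```python
from collections import Counter


def top_share(reactant_per_compound, top_fraction):
    """Fraction of compounds that use the most popular `top_fraction` of reactants.

    `reactant_per_compound` holds one reactant label per reported compound.
    At least one reactant is always counted as "top".
    """
    if not reactant_per_compound:
        raise ValueError("need at least one compound")
    counts = Counter(reactant_per_compound)
    ordered = sorted(counts.values(), reverse=True)
    k = max(1, round(top_fraction * len(ordered)))
    return sum(ordered[:k]) / len(reactant_per_compound)
```

With real data, calling the function with `top_fraction=0.17` would give the share of compounds covered by the most popular 17% of reactants, which is the kind of figure quoted in the abstract.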
18.
J Phys Chem A ; 123(15): 3239-3240, 2019 Apr 18.
Article in English | MEDLINE | ID: mdl-30995844
19.
J Phys Chem B ; 123(15): 3145-3146, 2019 Apr 18.
Article in English | MEDLINE | ID: mdl-30995845
20.
ACS Appl Mater Interfaces ; 9(49): 43061-43071, 2017 Dec 13.
Article in English | MEDLINE | ID: mdl-29156127

ABSTRACT

Membrane-based gas separation processes can address key challenges in energy and environment, but for many applications the permeance and selectivity of bulk membranes is insufficient for economical use. Theory and experiment indicate that permeance and selectivity can be increased by using two-dimensional materials with subnanometer pores as membranes. Motivated by experiments showing selective permeation of H2/CO mixtures through amorphous silica bilayers, here we perform a theoretical study of gas separation through silica bilayers. Using density functional theory calculations, we obtain geometries of crystalline free-standing silica bilayers (comprised of six-membered rings), as well as the seven-, eight-, and nine-membered rings that are observed in glassy silica bilayers, which arise due to Stone-Wales defects and vacancies. We then compute the potential energy barriers for gas passage through these various pore types for He, Ne, Ar, Kr, H2, N2, CO, and CO2 gases, and use the data to assess their capability for selective gas separation. Our calculations indicate that crystalline bilayer silica, which is less than a nanometer thick, can be a high-selectivity and high-permeance membrane material for 3He/4He, He/natural gas, and H2/CO separations.
