Results 1 - 20 of 93
1.
Commun Chem ; 7(1): 21, 2024 Feb 14.
Article in English | MEDLINE | ID: mdl-38355806

ABSTRACT

Metal-organic frameworks (MOFs) exhibit great promise for CO2 capture. However, finding the best performing materials poses computational and experimental grand challenges in view of the vast chemical space of potential building blocks. Here, we introduce GHP-MOFassemble, a generative artificial intelligence (AI), high performance framework for the rational and accelerated design of MOFs with high CO2 adsorption capacity and synthesizable linkers. GHP-MOFassemble generates novel linkers, assembled with one of three pre-selected metal nodes (Cu paddlewheel, Zn paddlewheel, Zn tetramer) into MOFs in a primitive cubic topology. GHP-MOFassemble screens and validates AI-generated MOFs for uniqueness, synthesizability, and structural validity; it uses molecular dynamics simulations to study their stability and chemical consistency, and crystal graph neural networks and Grand Canonical Monte Carlo simulations to quantify their CO2 adsorption capacities. We present the top six AI-generated MOFs with CO2 capacities greater than 2 mmol g⁻¹, i.e., higher than 96.9% of structures in the hypothetical MOF dataset.
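The screen-and-rank stage described above can be illustrated with a minimal sketch. The `CandidateMOF` fields, the `screen` function, and the use of the 2 mmol g⁻¹ capacity as a filtering threshold are illustrative assumptions, not the actual GHP-MOFassemble API:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CandidateMOF:
    linker_smiles: str            # AI-generated linker (hypothetical field)
    metal_node: str               # e.g. "Cu paddlewheel"
    predicted_co2_mmol_g: float   # surrogate-model capacity estimate

def screen(candidates, threshold=2.0, top_k=6):
    """Keep unique, high-capacity candidates and return the top_k by capacity."""
    seen, unique = set(), []
    for c in candidates:
        key = (c.linker_smiles, c.metal_node)
        if key not in seen:       # uniqueness filter
            seen.add(key)
            unique.append(c)
    passing = [c for c in unique if c.predicted_co2_mmol_g > threshold]
    return sorted(passing, key=lambda c: c.predicted_co2_mmol_g, reverse=True)[:top_k]
```

In the real workflow the capacity estimates would come from crystal graph neural networks and GCMC simulations; here they are plain numbers.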

2.
J Chem Inf Model ; 64(4): 1277-1289, 2024 Feb 26.
Article in English | MEDLINE | ID: mdl-38359461

ABSTRACT

Predicting the synthesizability of a new molecule remains an unsolved challenge that chemists have long tackled with heuristic approaches. Here, we report a new method for predicting synthesizability using a simple yet accurate thermochemical descriptor. We introduce Emin, the energy difference between a molecule and its lowest energy constitutional isomer, as a synthesizability predictor that is accurate, physically meaningful, and first-principles based. We apply Emin to 134,000 molecules in the QM9 data set and find that Emin is accurate when used alone and reduces incorrect predictions of "synthesizable" by up to 52% when used to augment commonly used prediction methods. Our work illustrates how first-principles thermochemistry and heuristic approximations for molecular stability are complementary, opening a new direction for synthesizability prediction methods.
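The descriptor itself is just an energy difference, which a short sketch can make concrete. The function names and the idea of a fixed classification cutoff are illustrative assumptions; the paper's actual decision rule may differ:

```python
def emin(e_molecule, e_isomers):
    """Energy of a molecule relative to its lowest-energy constitutional isomer.

    A small Emin means the molecule is close to the most stable arrangement
    of its atoms; a large Emin suggests a higher-energy, harder-to-make species.
    """
    return e_molecule - min(e_isomers)

def is_synthesizable(e_molecule, e_isomers, cutoff):
    """Hypothetical classifier: flag molecules whose Emin exceeds the cutoff."""
    return emin(e_molecule, e_isomers) <= cutoff
```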


Subjects
Heuristics, Isomerism
4.
Nat Commun ; 14(1): 7059, 2023 Nov 03.
Article in English | MEDLINE | ID: mdl-37923741

ABSTRACT

Coherent imaging techniques provide an unparalleled multi-scale view of materials across scientific and technological fields, from structural materials to quantum devices, from integrated circuits to biological cells. Driven by the construction of brighter sources and high-rate detectors, coherent imaging methods like ptychography are poised to revolutionize nanoscale materials characterization. However, these advancements are accompanied by a significant increase in data and compute needs, which precludes real-time imaging, feedback, and decision-making capabilities with conventional approaches. Here, we demonstrate a workflow that leverages artificial intelligence at the edge and high-performance computing to enable real-time inversion of X-ray ptychography data streamed directly from a detector at up to 2 kHz. The proposed AI-enabled workflow eliminates the oversampling constraints, allowing low-dose imaging using orders of magnitude less data than required by traditional methods.

6.
Light Sci Appl ; 12(1): 196, 2023 Aug 18.
Article in English | MEDLINE | ID: mdl-37596264

ABSTRACT

The dynamics and structure of mixed phases in a complex fluid can significantly impact its material properties, such as viscoelasticity. Small-angle X-ray Photon Correlation Spectroscopy (SA-XPCS) can probe the spontaneous spatial fluctuations of the mixed phases under various in situ environments over wide spatiotemporal ranges (10⁻⁶-10³ s / 10⁻¹⁰-10⁻⁶ m). Tailored material design, however, requires searching through a massive number of sample compositions and experimental parameters, which is beyond the bandwidth of current coherent X-ray beamlines. Using 3.7-µs-resolved XPCS synchronized with the clock frequency at the Advanced Photon Source, we demonstrated the consistency between the Brownian dynamics of ~100 nm diameter colloidal silica nanoparticles measured from an enclosed pendant drop and a sealed capillary. An electronic pipette can also be mounted on a robotic arm to access different stock solutions and create complex fluids with highly repeatable and precisely controlled composition profiles. This closed-loop, AI-executable protocol is applicable to light scattering techniques regardless of the light wavelength and optical coherence, and is a first step towards high-throughput, autonomous material discovery.
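The core XPCS observable is the normalized intensity autocorrelation, g2(τ) = ⟨I(t)I(t+τ)⟩ / ⟨I⟩². A minimal, direct (non-multi-tau) sketch for a single-pixel intensity time series, as an illustration of the quantity rather than a beamline-grade implementation:

```python
def g2(intensity, max_lag):
    """Direct estimate of g2(tau) = <I(t) I(t+tau)> / <I>^2 for lags 1..max_lag.

    For uncorrelated or static signals g2 -> 1; correlated dynamics raise
    g2 above 1 at short lags, decaying on the fluctuation timescale.
    """
    n = len(intensity)
    mean = sum(intensity) / n
    out = []
    for lag in range(1, max_lag + 1):
        pairs = [intensity[t] * intensity[t + lag] for t in range(n - lag)]
        out.append((sum(pairs) / len(pairs)) / mean ** 2)
    return out
```

Production XPCS analysis uses multi-tau binning and per-q averaging over many detector pixels; this direct form is O(n · max_lag) and only meant to show the definition.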

7.
J Chem Phys ; 159(2), 2023 Jul 14.
Article in English | MEDLINE | ID: mdl-37428051

ABSTRACT

Machine learning interatomic potentials have emerged as a powerful tool for bypassing the spatiotemporal limitations of ab initio simulations, but major challenges remain in their efficient parameterization. We present AL4GAP, an ensemble active learning software workflow for generating multicomposition Gaussian approximation potentials (GAP) for arbitrary molten salt mixtures. The workflow capabilities include: (1) setting up user-defined combinatorial chemical spaces of charge-neutral mixtures of arbitrary molten mixtures spanning 11 cations (Li, Na, K, Rb, Cs, Mg, Ca, Sr, Ba, and two heavy species, Nd and Th) and 4 anions (F, Cl, Br, and I), (2) configurational sampling using low-cost empirical parameterizations, (3) active learning for down-selecting configurational samples for single-point density functional theory calculations at the level of the Strongly Constrained and Appropriately Normed (SCAN) exchange-correlation functional, and (4) Bayesian optimization for hyperparameter tuning of two-body and many-body GAP models. We apply the AL4GAP workflow to showcase high-throughput generation of five independent GAP models for multicomposition binary-mixture melts, each of increasing complexity with respect to charge valency and electronic structure, namely: LiCl-KCl, NaCl-CaCl2, KCl-NdCl3, CaCl2-NdCl3, and KCl-ThCl4. Our results indicate that GAP models can accurately predict structure for diverse molten salt mixtures with density functional theory (DFT)-SCAN accuracy, capturing the intermediate-range ordering characteristic of the multivalent cationic melts.
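Step (3), active-learning down-selection, is commonly done by querying an ensemble: configurations where the ensemble members disagree most are the most informative to label with DFT. The scoring below is a generic sketch of that idea, not AL4GAP's actual implementation:

```python
def select_for_dft(configs, ensemble_predictions, n_select):
    """Pick the configurations where an ensemble of models disagrees most.

    ensemble_predictions[i] is a list of per-model energy predictions for
    configs[i]; a high standard deviation across models signals high model
    uncertainty, so those configurations are sent for DFT labeling.
    """
    def spread(preds):
        m = sum(preds) / len(preds)
        return (sum((p - m) ** 2 for p in preds) / len(preds)) ** 0.5

    scored = sorted(zip(configs, ensemble_predictions),
                    key=lambda cp: spread(cp[1]), reverse=True)
    return [c for c, _ in scored[:n_select]]
```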

9.
Sci Data ; 10(1): 356, 2023 Jun 05.
Article in English | MEDLINE | ID: mdl-37277408

ABSTRACT

The availability of materials data for impact-mitigating materials has lagged behind applications-based data. For example, data describing on-field helmeted impacts are available, whereas material behaviors for the constituent impact-mitigating materials used in helmet designs lack open datasets. Here, we describe a new FAIR (findable, accessible, interoperable, reusable) data framework with structural and mechanical response data for one example elastic impact protection foam. The continuum-scale behavior of foams emerges from the interplay of polymer properties, internal gas, and geometric structure. This behavior is rate- and temperature-sensitive; therefore, describing structure-property characteristics requires data collected across several types of instruments. Data included are from structure imaging via micro-computed tomography, finite deformation mechanical measurements from universal test systems with full-field displacement and strain, and visco-thermo-elastic properties from dynamic mechanical analysis. These data facilitate modeling and design efforts in foam mechanics, e.g., homogenization, direct numerical simulation, or phenomenological fitting. The data framework is implemented using data services and software from the Materials Data Facility of the Center for Hierarchical Materials Design.

10.
Ultrasound ; 31(1): 23-32, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36794114

ABSTRACT

Objectives: Abdominal aortic aneurysm ultrasound surveillance varies between hospitals in the United Kingdom. University Hospitals Bristol and Weston adopt a 6-monthly surveillance interval for 4.5-4.9 cm abdominal aortic aneurysms, a deviation from the nationally recommended 3-monthly intervals. Assessment of abdominal aortic aneurysm growth rate, and of the concurrent impact of risk factors and medications prescribed for those risk factors, may inform whether this change in surveillance intervals is safe and appropriate. Methods: This analysis was conducted retrospectively. A total of 1312 abdominal aortic aneurysm ultrasound scans from 315 patients between January 2015 and March 2020 were split into 0.5 cm groups, ranging from 3.0 to 5.5 cm. Abdominal aortic aneurysm growth rate was assessed with one-way analysis of variance. The impact of risk factors and risk factor medication on abdominal aortic aneurysm growth rate was analysed using multivariate and univariate linear regression and Kruskal-Wallis tests. Cause of death among surveillance patients was recorded. Results: Abdominal aortic aneurysm growth rate was significantly associated with increased abdominal aortic aneurysm diameter (p < 0.001). There was a significant whole-group reduction in growth rate, from 0.29 to 0.19 cm/year, in diabetics compared to non-diabetics (p = 0.02), supported by univariate linear regression (p = 0.04). In addition, patients on gliclazide had a lower growth rate compared to patients not on the medication (p = 0.04). One abdominal aortic aneurysm rupture occurred at a diameter <5.5 cm, resulting in death. Conclusion: Abdominal aortic aneurysms measuring 4.5-4.9 cm had a mean growth rate of 0.3 cm/year (± 0.18 cm/year). This mean growth rate and its variability suggest patients are unlikely to surpass the surgical threshold of 5.5 cm between 6-monthly surveillance scans, supported by low rupture rates. This suggests the surveillance interval for 4.5-4.9 cm abdominal aortic aneurysms is a safe and appropriate deviation from national guidance. In addition, it may be pertinent to consider diabetic status when designing surveillance intervals.
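A per-patient growth rate from serial scans is commonly estimated as the least-squares slope of diameter against time; the sketch below shows that calculation, though it is not necessarily the exact method used in this study:

```python
def growth_rate_cm_per_year(times_years, diameters_cm):
    """Ordinary least-squares slope of aneurysm diameter vs. time (cm/year)."""
    n = len(times_years)
    tm = sum(times_years) / n
    dm = sum(diameters_cm) / n
    num = sum((t - tm) * (d - dm) for t, d in zip(times_years, diameters_cm))
    den = sum((t - tm) ** 2 for t in times_years)
    return num / den
```

With a 0.3 cm/year mean rate, a 4.5 cm aneurysm would on average grow by about 0.15 cm over a 6-month interval, which is the arithmetic behind the safety argument above.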

11.
bioRxiv ; 2022 Nov 23.
Article in English | MEDLINE | ID: mdl-36451881

ABSTRACT

We seek to transform how new and emergent variants of pandemic-causing viruses, specifically SARS-CoV-2, are identified and classified. By adapting large language models (LLMs) for genomic data, we build genome-scale language models (GenSLMs) which can learn the evolutionary landscape of SARS-CoV-2 genomes. By pre-training on over 110 million prokaryotic gene sequences and fine-tuning a SARS-CoV-2-specific model on 1.5 million genomes, we show that GenSLMs can accurately and rapidly identify variants of concern. Thus, to our knowledge, GenSLMs represent one of the first whole-genome-scale foundation models which can generalize to other prediction tasks. We demonstrate scaling of GenSLMs on GPU-based supercomputers and AI-hardware accelerators, utilizing 1.63 zettaflops in training runs with a sustained performance of 121 PFLOPS in mixed precision and a peak of 850 PFLOPS. We present initial scientific insights from examining GenSLMs in tracking the evolutionary dynamics of SARS-CoV-2, paving the path to realizing these capabilities on large biological data.
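Adapting an LLM to genomes requires turning nucleotide sequences into tokens; a natural unit for coding sequences is the codon (3-mer). The sketch below shows codon tokenization and a toy vocabulary builder as an illustration of the preprocessing idea, not the GenSLMs codebase:

```python
def codon_tokens(sequence):
    """Split a nucleotide sequence into codon (3-mer) tokens, dropping any tail."""
    sequence = sequence.upper().replace("\n", "")
    return [sequence[i:i + 3] for i in range(0, len(sequence) - 2, 3)]

def build_vocab(token_lists):
    """Map each distinct codon to an integer id for model input."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab
```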

13.
Gigascience ; 11, 2022 Nov 21.
Article in English | MEDLINE | ID: mdl-36409836

ABSTRACT

The Common Fund Data Ecosystem (CFDE) has created a flexible system of data federation that enables researchers to discover datasets from across the US National Institutes of Health Common Fund without requiring that data owners move, reformat, or rehost those data. This system is centered on a catalog that integrates detailed descriptions of biomedical datasets from individual Common Fund Programs' Data Coordination Centers (DCCs) into a uniform metadata model that can then be indexed and searched from a centralized portal. This Crosscut Metadata Model (C2M2) supports the wide variety of data types and metadata terms used by individual DCCs and can readily describe nearly all forms of biomedical research data. We detail its use to ingest and index data from 11 DCCs.


Subjects
Ecosystem, Financial Management, Metadata
14.
Sci Data ; 9(1): 657, 2022 Nov 10.
Article in English | MEDLINE | ID: mdl-36357431

ABSTRACT

A concise and measurable set of FAIR (Findable, Accessible, Interoperable and Reusable) principles for scientific data is transforming the state-of-practice for data management and stewardship, supporting and enabling discovery and innovation. Learning from this initiative, and acknowledging the impact of artificial intelligence (AI) in the practice of science and engineering, we introduce a set of practical, concise, and measurable FAIR principles for AI models. We showcase how to create and share FAIR data and AI models within a unified computational framework combining the following elements: the Advanced Photon Source at Argonne National Laboratory, the Materials Data Facility, the Data and Learning Hub for Science, funcX, and the Argonne Leadership Computing Facility (ALCF), in particular the ThetaGPU supercomputer and the SambaNova DataScale® system at the ALCF AI Testbed. We describe how this domain-agnostic computational framework may be harnessed to enable autonomous AI-driven discovery.

15.
Patterns (N Y) ; 3(10): 100606, 2022 Oct 14.
Article in English | MEDLINE | ID: mdl-36277824

ABSTRACT

Powerful detectors at modern experimental facilities routinely collect data at multiple GB/s. Online analysis methods are needed to enable the collection of only interesting subsets of such massive data streams, such as by explicitly discarding some data elements or by directing instruments to relevant areas of experimental space. Thus, methods are required for configuring and running distributed computing pipelines (what we call flows) that link instruments, computers (e.g., for analysis, simulation, or artificial intelligence [AI] model training), edge computing (e.g., for analysis), data stores, metadata catalogs, and high-speed networks. We review common patterns associated with such flows and describe methods for instantiating these patterns. We present experiences with the application of these methods to the processing of data from five different scientific instruments, each of which engages powerful computers for data inversion, model training, or other purposes. We also discuss implications of such methods for operators and users of scientific facilities.
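The simplest flow pattern is a linear chain in which each step transforms a payload and hands it to the next. The sketch below is a bare-bones illustration of that pattern with stand-in step names; real flow systems add data transfer, authentication, retries, and error handling:

```python
def run_flow(steps, payload):
    """Run a linear flow: each named step transforms the payload in order."""
    for name, fn in steps:
        payload = fn(payload)
    return payload

# Example flow: acquire -> analyze -> catalog (all steps are stand-ins).
flow = [
    ("acquire", lambda p: {**p, "frames": 10}),
    ("analyze", lambda p: {**p, "result": p["frames"] * 2}),
    ("catalog", lambda p: {**p, "registered": True}),
]
```

Branching, fan-out to multiple compute sites, and event-triggered steps extend this pattern but keep the same step-and-payload structure.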

16.
J Synchrotron Radiat ; 29(Pt 5): 1141-1151, 2022 Sep 01.
Article in English | MEDLINE | ID: mdl-36073872

ABSTRACT

Serial synchrotron crystallography enables the study of protein structures under physiological temperature and reduced radiation damage by collection of data from thousands of crystals. The Structural Biology Center at Sector 19 of the Advanced Photon Source has implemented a fixed-target approach with a new 3D-printed mesh-holder optimized for sample handling. The holder immobilizes a crystal suspension or droplet emulsion on a nylon mesh, trapping and sealing a near-monolayer of crystals in its mother liquor between two thin Mylar films. Data can be rapidly collected in scan mode and analyzed in near real-time using piezoelectric linear stages assembled in an XYZ arrangement, controlled with a graphical user interface and analyzed using a high-performance computing pipeline. Here, the system was applied to two β-lactamases: a class D serine β-lactamase from Chitinophaga pinensis DSM 2588 and L1 metallo-β-lactamase from Stenotrophomonas maltophilia K279a.


Subjects
Stenotrophomonas maltophilia, Biology, Crystallography, Proteins
17.
Article in English | MEDLINE | ID: mdl-36035065

ABSTRACT

The broad sharing of research data is widely viewed as critical for the speed, quality, accessibility, and integrity of science. Despite increasing efforts to encourage data sharing, both the quality of shared data and the frequency of data reuse remain stubbornly low. We argue here that a significant reason for this unfortunate state of affairs is that the organization of research results in the findable, accessible, interoperable, and reusable (FAIR) form required for reuse is too often deferred to the end of a research project, when preparing publications, by which time essential details are no longer accessible. Thus, we propose an approach to research informatics in which FAIR principles are applied continuously, from the inception of a research project, and ubiquitously, to every data asset produced by experiment or computation. We suggest that this seemingly challenging task can be made feasible by the adoption of simple tools, such as lightweight identifiers (to ensure that every data asset is findable), packaging methods (to facilitate understanding of data contents), data access methods, and metadata organization and structuring tools (to support schema development and evolution). We use an example from experimental neuroscience to illustrate how these methods can work in practice.
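The lightweight-identifier and packaging ideas can be sketched with a content-addressed identifier minted at the moment a data asset is created. The naming scheme and metadata fields below are illustrative assumptions, not a specification from the paper:

```python
import hashlib


def mint_identifier(data_bytes, prefix="asset"):
    """Content-addressed identifier: the same bytes always yield the same id,
    so an asset is findable and its integrity is verifiable."""
    digest = hashlib.sha256(data_bytes).hexdigest()[:16]
    return f"{prefix}:{digest}"


def package(data_bytes, metadata):
    """Bundle an asset with a minimal metadata record at creation time,
    rather than deferring description to publication time."""
    record = dict(metadata)
    record["identifier"] = mint_identifier(data_bytes)
    record["size_bytes"] = len(data_bytes)
    return record
```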

18.
Computer (Long Beach Calif) ; 55(8): 20-30, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35979414

ABSTRACT

Despite much creative work on methods and tools, reproducibility-the ability to repeat the computational steps used to obtain a research result-remains elusive. One reason for these difficulties is that extant tools for capturing research processes, while powerful, often fail to capture vital connections as research projects grow in extent and complexity. We explain here how these interstitial connections can be preserved via simple methods that integrate easily with current work practices to capture basic information about every data product consumed or produced in a project. By thus extending the scope of findable, accessible, interoperable, and reusable (FAIR) data in both time and space to enable the creation of a continuous chain of Continuous and Ubiquitous FAIRness linkages (CUF-links) from inputs to outputs, such mechanisms can facilitate capture of the provenance linkages that are essential to reproducible research. We give examples of mechanisms that can facilitate the use of these methods, and review how they have been applied in practice.

19.
Nanomaterials (Basel) ; 12(14), 2022 Jul 10.
Article in English | MEDLINE | ID: mdl-35889588

ABSTRACT

Mixed-valence cerium oxide nanoparticles (nanoceria) have been investigated with pronounced interest due to a wide range of biomedical and industrial applications that arise from their remarkable redox catalytic properties. However, there is no understanding of how to control the formation of the two cerium oxidation states in nanoceria to obtain the Ce3+/Ce4+ ratios required in various applications. In this work, using a soluble borate glass, nanoceria with specific Ce3+/Ce4+ ratios are created and extracted via controlled glass-melting parameters. Glass embedded with nanoceria, as well as nanoceria extracted from the glass, were studied via XANES and fitted with the Multivariate Curve Resolution (MCR) technique to calculate the Ce3+/Ce4+ ratio. Results show that mixed-valence nanoceria with specific ratios are hermetically sealed within the glass for long durations. When the glass dissolves, the mixed-valence nanoceria are released, and the extracted nanoceria have unchanged Ce3+/Ce4+ ratios. Furthermore, TEM investigation of the released nanoceria shows that they consist of several different structures. Although nanocrystal structures of Ce7O12, Ce11O20, and Ce2O3 contribute to the reduced state, a new quasi-stable phase of CeO1.66 has been observed as well.
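The core of extracting a Ce3+/Ce4+ ratio from a XANES spectrum is decomposing the measured spectrum into reference components. MCR handles many components and spectra simultaneously; the closed-form two-reference least-squares fit below only illustrates the idea for a single mixed spectrum:

```python
def fit_fraction(mixed, ref_a, ref_b):
    """Least-squares fraction f of ref_a (with 1-f of ref_b) in a mixed spectrum.

    Minimizes sum_i (mixed_i - f*ref_a_i - (1-f)*ref_b_i)^2, which has the
    closed-form solution f = sum((m-b)(a-b)) / sum((a-b)^2).
    """
    num = sum((m - b) * (a - b) for m, a, b in zip(mixed, ref_a, ref_b))
    den = sum((a - b) ** 2 for a, b in zip(ref_a, ref_b))
    return num / den
```

Here `ref_a` and `ref_b` stand in for measured Ce3+ and Ce4+ reference spectra; the returned fraction gives the Ce3+/(Ce3+ + Ce4+) share.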

20.
J Phys Chem A ; 126(27): 4528-4536, 2022 Jul 14.
Article in English | MEDLINE | ID: mdl-35786965

ABSTRACT

G4MP2 theory has proven to be a reliable and accurate quantum chemical composite method for the calculation of molecular energies, using an approximation based on second-order perturbation theory to lower computational costs compared to G4 theory. However, it has been found to have significantly increased errors when applied to larger organic molecules with 10 or more nonhydrogen atoms. We report here on an investigation of the cause of this failure of G4MP2 theory for larger molecules. One source of error is found to be the "higher-level correction" (HLC), which is meant to correct for deficiencies in correlation contributions to the calculated energies. This is because the HLC assumes that the contribution is independent of the element and the type of bonding involved, both of which become more important in larger molecules. We address this problem by adding an atom-specific correction, dependent on atom type but not bond type, to the higher-level correction. We find that a G4MP2 method incorporating this modification of the higher-level correction, referred to as G4MP2A, becomes as accurate as G4 theory for computing enthalpies of formation, both for a test set of molecules with fewer than 10 nonhydrogen atoms and for the set of molecules with 10-14 such atoms considered here, at a much lower computational cost. The G4MP2A method is also found to significantly improve ionization potentials and electron affinities. Finally, we implemented the G4MP2A energies in a machine learning method to predict molecular energies.
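The atom-specific correction amounts to adding a fitted per-element term, scaled by the count of each element, on top of the standard G4MP2 energy. The correction values below are placeholders, not the fitted G4MP2A parameters:

```python
# Hypothetical per-element corrections (hartree); the real G4MP2A values
# are fitted against reference thermochemistry data.
ATOM_CORRECTION = {"C": -0.0010, "N": -0.0015, "O": -0.0020, "H": 0.0}


def g4mp2a_energy(e_g4mp2, atom_counts):
    """Apply an atom-specific correction on top of a standard G4MP2 energy.

    atom_counts maps element symbol -> number of atoms of that element,
    so the correction depends on atom type but not on bond type.
    """
    delta = sum(ATOM_CORRECTION.get(el, 0.0) * n for el, n in atom_counts.items())
    return e_g4mp2 + delta
```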
