1.
J Chem Inf Model ; 64(5): 1568-1580, 2024 Mar 11.
Article in English | MEDLINE | ID: mdl-38382011

ABSTRACT

Atomic structure prediction and associated property calculations are the bedrock of chemical physics. Because high-fidelity ab initio techniques for computing structures and properties can be prohibitively expensive, there is strong motivation to develop machine-learning (ML) models that make these predictions more efficiently. Training graph neural networks over large atomistic databases introduces unique computational challenges, such as the need to process millions of small, variable-size graphs and to support communication patterns distinct from those in learning over single large graphs such as social networks. We demonstrate a novel hardware-software codesign approach to scale up the training of atomistic graph neural networks (GNNs) for structure and property prediction. First, to eliminate the redundant computation and memory of alternative padding techniques and to improve throughput by minimizing communication, we formulate the coalescing of batches of variable-size atomistic graphs as a bin packing problem and introduce a hardware-agnostic algorithm to pack these batches. In addition, we propose hardware-specific optimizations, including a planner and vectorization for the gather-scatter operations targeting Graphcore's Intelligence Processing Unit (IPU), as well as model-specific optimizations such as merged communication collectives and an optimized softplus. Putting these together, we demonstrate the effectiveness of the proposed codesign approach with an implementation of a well-established atomistic GNN on Graphcore IPUs. We evaluate training performance on multiple atomistic graph databases with varying graph counts, sizes, and sparsity. We show that this codesign approach reduces the training time of atomistic GNNs, improving their performance by up to 1.5× over the baseline implementation of the model on the IPUs. Finally, we compare our IPU implementation with an NVIDIA GPU-based implementation and show that our atomistic GNN runs on average 1.8× faster on the IPUs than on the GPUs.
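
A minimal sketch of the batch-coalescing idea described in this abstract, framed as first-fit-decreasing bin packing. The heuristic, capacity model, and names here are illustrative assumptions, not the authors' algorithm:

# Illustrative sketch only: packs variable-size graphs (by node count)
# into fixed-capacity batches using first-fit decreasing. The paper
# formulates batch coalescing as bin packing; the exact algorithm and
# capacity model used there are not reproduced here.

def pack_graphs(graph_sizes, capacity):
    """Greedy first-fit-decreasing packing.

    graph_sizes: list of per-graph node counts.
    capacity: maximum total node count per batch (hardware budget).
    Returns a list of batches, each a list of graph indices.
    """
    order = sorted(range(len(graph_sizes)), key=lambda i: graph_sizes[i], reverse=True)
    batches, loads = [], []
    for i in order:
        size = graph_sizes[i]
        for b, load in enumerate(loads):
            if load + size <= capacity:
                batches[b].append(i)
                loads[b] += size
                break
        else:
            batches.append([i])   # open a new batch (bin)
            loads.append(size)
    return batches

# Example: pack graphs of 3-60 atoms into batches of at most 64 nodes,
# wasting far less padding than naive fixed-count batching.
print(pack_graphs([60, 30, 30, 20, 10, 5, 3], capacity=64))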


Subject(s)
Acceleration, Neural Networks, Computer, Algorithms, Communication, Intelligence
2.
Sci Rep ; 12(1): 10748, 2022 Jun 24.
Article in English | MEDLINE | ID: mdl-35750878

ABSTRACT

Developing prediction models for emerging infectious diseases from relatively small numbers of cases is a critical need for improving pandemic preparedness. Using COVID-19 as an exemplar, we propose a transfer-learning methodology for developing predictive models from multi-modal electronic healthcare records by leveraging information from more prevalent diseases with shared clinical characteristics. Our novel hierarchical, multi-modal model integrates baseline risk factors extracted by natural language processing of clinical notes at admission, time-series measurements of biomarkers obtained from laboratory tests, and discrete diagnostic, procedure, and drug codes. We demonstrate the alignment of the model's predictions with well-established clinical knowledge about COVID-19 through univariate and multivariate risk-factor-driven sub-cohort analyses. The model's superior performance over state-of-the-art methods shows that leveraging patient data across modalities and transferring prior knowledge from similar disorders are critical for accurate prediction of patient outcomes, and this approach may serve as an important tool in the early response to future pandemics.
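
A hypothetical sketch of one way to fuse the three modalities named in this abstract (note-derived risk factors, biomarker time series, discrete codes) with a pretrain-then-fine-tune transfer step. All dimensions, layer choices, and names are assumptions for illustration, not the authors' architecture:

# Hypothetical multi-modal fusion sketch (PyTorch); not the paper's model.
import torch
import torch.nn as nn

class MultiModalRiskModel(nn.Module):
    def __init__(self, note_dim=64, biomarker_dim=12, n_codes=5000, emb_dim=32, hidden=64):
        super().__init__()
        self.ts_encoder = nn.GRU(biomarker_dim, hidden, batch_first=True)  # lab time series
        self.code_emb = nn.EmbeddingBag(n_codes, emb_dim, mode="mean")     # dx/procedure/drug codes
        self.head = nn.Sequential(
            nn.Linear(note_dim + hidden + emb_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),                                          # outcome logit
        )

    def forward(self, note_feats, biomarkers, code_ids):
        _, h = self.ts_encoder(biomarkers)            # h: (layers, batch, hidden)
        fused = torch.cat([note_feats, h[-1], self.code_emb(code_ids)], dim=-1)
        return self.head(fused)

# Transfer step: pretrain on a prevalent-disease cohort with shared clinical
# characteristics, then fine-tune on the small emerging-disease cohort,
# optionally freezing the shared encoder.
model = MultiModalRiskModel()
# ... pretrain on the prevalent-disease cohort ...
for p in model.ts_encoder.parameters():
    p.requires_grad = False                           # keep shared representation
# ... fine-tune model.head on the emerging-disease cases ...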


Subject(s)
COVID-19, Pandemics, COVID-19/epidemiology, Humans, Machine Learning, Natural Language Processing, Prognosis
3.
J Chem Phys ; 153(2): 024302, 2020 Jul 14.
Article in English | MEDLINE | ID: mdl-32668919

ABSTRACT

We describe a method for the post hoc interpretation of a neural network (NN) trained on the global and local minima of neutral water clusters. We use structures from a recently published database containing over 5 × 10⁶ unique water cluster networks (H₂O)N of size N = 3-30. The structural properties were first characterized using chemical descriptors derived from graph theory, identifying important trends in the topology, connectivity, and polygon structure of the networks associated with the various minima. The code to generate the molecular graphs and compute the descriptors is available at https://github.com/exalearn/molecular-graph-descriptors, and the graphs are available alongside the original database at https://sites.uw.edu/wdbase/. A Continuous-Filter Convolutional Neural Network (CF-CNN) was trained on a subset of 500 000 networks to predict the potential energy, yielding a mean absolute error of 0.002 ± 0.002 kcal/mol per water molecule. Cluster sizes not included in the training set exhibited errors of the same magnitude, indicating that the CF-CNN accurately predicts energies of networks both smaller and larger than those used during training. The graph-theoretical descriptors were further employed to interpret the predictive power of the CF-CNN. Topological measures, such as the Wiener index, the average shortest path length, and the similarity index, showed that all networks in the test set fell within the same range of values as those in the training set. The graph analysis suggests that larger errors appear when the mean degree and the number of polygons in a cluster lie further from the training-set mean. This indicates that the structural space, and not just the chemical space, is an important factor to consider when designing training sets, as predictive errors can arise when the structural composition differs sufficiently from the bulk of the training set. To this end, the developed descriptors are quite effective in explaining the results of the CF-CNN (a.k.a. the "black box") model.
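
A short sketch of computing the graph-theoretical descriptors named in this abstract for a hydrogen-bond network, using networkx. The toy cluster and its bonding here are assumptions for illustration; the paper's own descriptor code is at https://github.com/exalearn/molecular-graph-descriptors:

# Toy hydrogen-bond network: nodes are water molecules, edges are H-bonds.
import networkx as nx

G = nx.cycle_graph(6)            # e.g. a hexamer ring (assumed example)
G.add_edge(0, 3)                 # one cross-ring hydrogen bond

wiener = nx.wiener_index(G)                  # sum of all pairwise shortest-path distances
aspl = nx.average_shortest_path_length(G)    # = Wiener index / (n choose 2) for connected G
mean_degree = sum(d for _, d in G.degree()) / G.number_of_nodes()
n_polygons = len(nx.cycle_basis(G))          # independent rings ("polygons")

print(wiener, aspl, mean_degree, n_polygons)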
