Search | VHL Regional Portal

Fast and Sample-Efficient Interatomic Neural Network Potentials for Molecules and Materials Based on Gaussian Moments.

Zaverkin, Viktor; Holzmüller, David; Steinwart, Ingo; Kästner, Johannes.

J Chem Theory Comput ; 17(10): 6658-6670, 2021 Oct 12.

Article in English | MEDLINE | ID: mdl-34585927

ABSTRACT

Artificial neural networks (NNs) are one of the most frequently used machine learning approaches to construct interatomic potentials and enable efficient large-scale atomistic simulations with almost ab initio accuracy. However, the simultaneous training of NNs on energies and forces, which are a prerequisite for, e.g., molecular dynamics simulations, can be demanding. In this work, we present an improved NN architecture based on the previous GM-NN model [Zaverkin V.; Kästner, J. J. Chem. Theory Comput. 2020, 16, 5410-5421], which shows an improved prediction accuracy and considerably reduced training times. Moreover, we extend the applicability of Gaussian moment-based interatomic potentials to periodic systems and demonstrate the overall excellent transferability and robustness of the respective models. The fast training by the improved methodology is a prerequisite for training-heavy workflows such as active learning or learning-on-the-fly.

Learning Theory Estimates with Observations from General Stationary Stochastic Processes.

Hang, Hanyuan; Feng, Yunlong; Steinwart, Ingo; Suykens, Johan A K.

Neural Comput ; 28(12): 2853-2889, 2016 12.

Article in English | MEDLINE | ID: mdl-27391677

ABSTRACT

This letter investigates the supervised learning problem with observations drawn from certain general stationary stochastic processes. Here by general, we mean that many stationary stochastic processes can be included. We show that when the stochastic processes satisfy a generalized Bernstein-type inequality, a unified treatment on analyzing the learning schemes with various mixing processes can be conducted and a sharp oracle inequality for generic regularized empirical risk minimization schemes can be established. The obtained oracle inequality is then applied to derive convergence rates for several learning schemes such as empirical risk minimization (ERM), least squares support vector machines (LS-SVMs) using given generic kernels, and SVMs using gaussian kernels for both least squares and quantile regression. It turns out that for independent and identically distributed (i.i.d.) processes, our learning rates for ERM recover the optimal rates. For non-i.i.d. processes, including geometrically [Formula: see text]-mixing Markov processes, geometrically [Formula: see text]-mixing processes with restricted decay, [Formula: see text]-mixing processes, and (time-reversed) geometrically [Formula: see text]-mixing processes, our learning rates for SVMs with gaussian kernels match, up to some arbitrarily small extra term in the exponent, the optimal rates. For the remaining cases, our rates are at least close to the optimal rates. As a by-product, the assumed generalized Bernstein-type inequality also provides an interpretation of the so-called effective number of observations for various mixing processes.

ABSTRACT

ABSTRACT

SEND TO:

SELECTION OF CITATIONS

SEARCH DETAIL