1.
J Chem Theory Comput ; 20(6): 2505-2519, 2024 Mar 26.
Article in English | MEDLINE | ID: mdl-38456899

ABSTRACT

This article presents a novel algorithm for the calculation of analytic energy gradients from second-order Møller-Plesset perturbation theory within the Resolution-of-the-Identity approximation (RI-MP2), which is designed to achieve high performance on clusters with multiple graphical processing units (GPUs). The algorithm uses GPUs for all major steps of the calculation, including integral generation, formation of all required intermediate tensors, solution of the Z-vector equation and gradient accumulation. The implementation in the EXtreme Scale Electronic Structure System (EXESS) software package includes a tailored, highly efficient, multistream scheduling system to hide CPU-GPU data transfer latencies and allows nodes with 8 A100 GPUs to operate at over 80% of theoretical peak floating-point performance. Comparative performance analysis shows a significant reduction in computational time relative to traditional multicore CPU-based methods, with our approach achieving up to a 95-fold speedup over the single-node performance of established software such as Q-Chem and ORCA. Additionally, we demonstrate that pairing our implementation with the molecular fragmentation framework in EXESS can drastically lower the computational scaling of RI-MP2 gradient calculations from quintic to subquadratic, enabling further substantial savings in runtime while retaining high numerical accuracy in the resulting gradients.

2.
J Chem Theory Comput ; 19(20): 7031-7055, 2023 Oct 24.
Article in English | MEDLINE | ID: mdl-37793073

ABSTRACT

The primary focus of GAMESS over the last 5 years has been the development of new high-performance codes that are able to take effective and efficient advantage of the most advanced computer architectures, both CPU and accelerators. These efforts include employing density fitting and fragmentation methods to reduce the high scaling of well-correlated (e.g., coupled-cluster) methods as well as developing novel codes that can take optimal advantage of graphical processing units and other modern accelerators. Because accurate wave functions can be very complex, an important new functionality in GAMESS is the quasi-atomic orbital analysis, an unbiased approach to the understanding of covalent bonds embedded in the wave function. Best practices for the maintenance and distribution of GAMESS are also discussed.

3.
Bioinformatics ; 39(9), 2023 Sep 02.
Article in English | MEDLINE | ID: mdl-37656933

ABSTRACT

MOTIVATION: Sequence simulation plays a vital role in phylogenetics with many applications, such as evaluating phylogenetic methods, testing hypotheses, and generating training data for machine-learning applications. We recently introduced a new simulator for multiple sequence alignments called AliSim, which outperformed existing tools. However, with the increasing demands of simulating large data sets, AliSim is still slow due to its sequential implementation; for example, to simulate millions of sequence alignments, AliSim took several days or weeks. Parallelization has been used for many phylogenetic inference methods but not yet for sequence simulation. RESULTS: This paper introduces AliSim-HPC, which, for the first time, employs high-performance computing for phylogenetic simulations. AliSim-HPC parallelizes the simulation process at both multi-core and multi-CPU levels using the OpenMP and message passing interface (MPI) libraries, respectively. AliSim-HPC is highly efficient and scalable, which reduces the runtime to simulate 100 large gap-free alignments (30 000 sequences of one million sites) from over one day to 11 min using 256 CPU cores from a cluster with six computing nodes, a 153-fold speedup. While the OpenMP version can only simulate gap-free alignments, the MPI version supports insertion-deletion models like the sequential AliSim. AVAILABILITY AND IMPLEMENTATION: AliSim-HPC is open-source and available as part of the new IQ-TREE version v2.2.3 at https://github.com/iqtree/iqtree2/releases with a user manual at http://www.iqtree.org/doc/AliSim.


Subject(s)
Computing Methodologies , Software , Phylogeny , Computer Simulation , Sequence Alignment
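
The runtime figures quoted in this abstract can be cross-checked with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the efficiency figure assumes the "over one day" baseline refers to a single-core sequential run, which the abstract does not state explicitly.

```python
def implied_baseline_minutes(parallel_minutes: float, speedup: float) -> float:
    """Sequential runtime implied by a parallel runtime and a reported speedup."""
    return parallel_minutes * speedup


def parallel_efficiency(speedup: float, cores: int) -> float:
    """Speedup divided by core count (assumes a single-core baseline)."""
    return speedup / cores


# Figures from the abstract: 11 min on 256 cores, reported 153-fold speedup.
baseline_min = implied_baseline_minutes(11, 153)
baseline_hours = baseline_min / 60
efficiency = parallel_efficiency(153, 256)

print(f"implied sequential runtime: {baseline_hours:.2f} h")
print(f"parallel efficiency (single-core baseline assumed): {efficiency:.1%}")
```

The implied baseline of roughly 28 hours is consistent with the abstract's "over one day".
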
4.
J Chem Phys ; 159(4), 2023 Jul 28.
Article in English | MEDLINE | ID: mdl-37497819

ABSTRACT

Electronic structure calculations have the potential to predict key matter transformations for applications of strategic technological importance, from drug discovery to material science and catalysis. However, a predictive physicochemical characterization of these processes often requires accurate quantum chemical modeling of complex molecular systems with hundreds to thousands of atoms. Due to the computationally demanding nature of electronic structure calculations and the complexity of modern high-performance computing hardware, quantum chemistry software has historically failed to operate at such large molecular scales with accuracy and speed that are useful in practice. In this paper, novel algorithms and software are presented that enable extreme-scale quantum chemistry capabilities with particular emphasis on exascale calculations. This includes the development and application of the multi-Graphics Processing Unit (GPU) library LibCChem 2.0 as part of the General Atomic and Molecular Electronic Structure System package and of the standalone Extreme-scale Electronic Structure System (EXESS), designed from the ground up for scaling on thousands of GPUs to perform high-performance accurate quantum chemistry calculations at unprecedented speed and molecular scales. Among various results, we report that the EXESS implementation enables Hartree-Fock/cc-pVDZ plus RI-MP2/cc-pVDZ/cc-pVDZ-RIFIT calculations on an ionic liquid system with 623 016 electrons and 146 592 atoms in less than 45 min using 27 600 GPUs on the Summit supercomputer with a 94.6% parallel efficiency.

5.
J Chem Theory Comput ; 18(7): 4164-4176, 2022 Jul 12.
Article in English | MEDLINE | ID: mdl-35748512

ABSTRACT

As computer systems dedicated to scientific calculations become massively parallel, the poor parallel performance of the Fock matrix diagonalization becomes a major impediment to achieving larger molecular sizes in self-consistent field (SCF) calculations. In this Article, a novel, highly parallel, and diagonalization-free algorithm for the accelerated convergence of the SCF procedure is presented. The algorithm, called Q-Next, draws on the second-order SCF, quadratically convergent SCF, and direct inversion of the iterative subspace (DIIS) approaches to enable fast convergence while replacing the Fock matrix diagonalization SCF bottleneck with higher parallel efficiency matrix multiplications. Performance results on both parallel multicore CPU and GPU hardware for a variety of test molecules and basis sets are presented, showing that Q-Next achieves a convergence rate comparable to the DIIS method while being, on average, one order of magnitude faster.

6.
J Chem Theory Comput ; 17(12): 7486-7503, 2021 Dec 14.
Article in English | MEDLINE | ID: mdl-34780186

ABSTRACT

A novel implementation of the self-consistent field (SCF) procedure specifically designed for high-performance execution on multiple graphics processing units (GPUs) is presented. The algorithm offloads to GPUs the three major computational stages of the SCF, namely, the calculation of one-electron integrals, the calculation and digestion of electron repulsion integrals, and the diagonalization of the Fock matrix, including SCF acceleration via DIIS. Performance results for a variety of test molecules and basis sets show remarkable speedups with respect to the state-of-the-art parallel GAMESS CPU code and relative to other widely used GPU codes for both single and multi-GPU execution. The new code outperforms all existing multi-GPU implementations when using eight V100 GPUs, with speedups relative to TeraChem ranging from 1.2× to 3.3× and speedups of up to 28× over QUICK on one GPU and 15× using eight GPUs. Strong scaling calculations show nearly ideal scalability up to 8 GPUs while retaining high parallel efficiency for up to 18 GPUs.

7.
J Chem Theory Comput ; 16(12): 7232-7238, 2020 Dec 08.
Article in English | MEDLINE | ID: mdl-33206515

ABSTRACT

We present a high-performance, GPU (graphics processing unit)-accelerated algorithm for building the Fock matrix. The algorithm is designed for efficient calculations on large molecular systems and uses a novel dynamic load balancing scheme that maximizes the GPU throughput and avoids thread divergence that could occur due to integral screening. Additionally, the code adopts a novel ERI digestion algorithm that exploits all forms of permutational symmetry, combines efficiently the evaluation of both Coulomb and exchange terms together, and eliminates explicit thread synchronization requirements. Performance results obtained using a number of large molecules reveal remarkable speedups up to 24.4× with respect to the QUICK GPU code and up to 237× with respect to the GAMESS CPU parallel code.

8.
J Phys Chem A ; 124(23): 4557-4582, 2020 Jun 11.
Article in English | MEDLINE | ID: mdl-32379450

ABSTRACT

Electronic structure theory (especially quantum chemistry) has thrived and has become increasingly relevant to a broad spectrum of scientific endeavors as the sophistication of both computer architectures and software engineering has advanced. This article provides a brief history of advances in both hardware and software, from the early days of IBM mainframes to the current emphasis on accelerators and modern programming practices.

9.
J Chem Phys ; 152(15): 154102, 2020 Apr 21.
Article in English | MEDLINE | ID: mdl-32321259

ABSTRACT

A discussion of many of the recently implemented features of GAMESS (General Atomic and Molecular Electronic Structure System) and LibCChem (the C++ CPU/GPU library associated with GAMESS) is presented. These features include fragmentation methods such as the fragment molecular orbital, effective fragment potential and effective fragment molecular orbital methods, hybrid MPI/OpenMP approaches to Hartree-Fock, and resolution of the identity second order perturbation theory. Many new coupled cluster theory methods have been implemented in GAMESS, as have multiple levels of density functional/tight binding theory. The role of accelerators, especially graphical processing units, is discussed in the context of the new features of LibCChem, as is the associated problem of power consumption as the power of computers increases dramatically. The process by which a complex program suite such as GAMESS is maintained and developed is considered. Future developments are briefly summarized.

10.
J Chem Theory Comput ; 16(3): 1568-1577, 2020 Mar 10.
Article in English | MEDLINE | ID: mdl-31972086

ABSTRACT

We present a quadrature-based algorithm for computing the opposite-spin component of the MP2 correlation energy which scales quadratically with basis set size and is well-suited to large-scale parallelization. The key ideas, which are rooted in the earlier work of Hirata and co-workers, are to abandon all two-electron integrals, recast the energy as a seven-dimensional integral, approximate that integral by quadrature, and employ a cutoff strategy to minimize the number of intermediate quantities. We discuss our implementation in detail and show that it parallelizes almost perfectly on 840 cores for cyclosporine (a molecule with roughly 200 atoms), exhibits [Formula: see text] scaling for a sequence of polyglycines, and is principally limited by the accuracy of its quadrature.

11.
J Chem Theory Comput ; 14(3): 1501-1509, 2018 Mar 13.
Article in English | MEDLINE | ID: mdl-29444408

ABSTRACT

We present a single-determinant approach to three challenging topics in the chemistry of excited states: double excitations, charge-transfer states, and conical intersections. The results are obtained by using the Initial Maximum Overlap Method (IMOM) which is a modified version of the Maximum Overlap Method (MOM). The new algorithm converges better than the original, especially for these difficult problems. By considering several case studies, we show that a single-determinant framework provides a simple and accurate alternative for modeling excited states in cases where other low-cost methods, such as CIS and TD-DFT, either perform poorly or fail completely.
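
The orbital-selection step that distinguishes MOM-style methods from the usual aufbau filling can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `select_occupied` and its arguments are hypothetical, and the norm-based projection used here is one common choice that is assumed rather than taken from the abstract. In MOM the reference orbitals are those of the previous SCF iteration; in IMOM they stay fixed at the initial guess.

```python
import numpy as np


def select_occupied(c_ref_occ, c_new, s, n_occ):
    """Pick the n_occ new orbitals that overlap most with the reference occupied space.

    c_ref_occ : (n_basis, n_occ) reference occupied MO coefficients
                (previous iteration for MOM, initial guess for IMOM)
    c_new     : (n_basis, n_mo) MO coefficients from the current Fock matrix
    s         : (n_basis, n_basis) atomic-orbital overlap matrix
    """
    # Overlap of every new orbital with every reference occupied orbital: O = C_ref^T S C_new
    o = c_ref_occ.T @ s @ c_new
    # Projection of each new orbital onto the reference occupied space (assumed norm form)
    p = np.sqrt(np.sum(o**2, axis=0))
    # Occupy the n_occ orbitals with the largest projections, whatever their energies
    return np.sort(np.argsort(p)[-n_occ:])


# Toy example in an orthonormal basis: the reference state occupies the third orbital,
# so the selection follows it instead of the lowest-energy (first) orbital.
s = np.eye(3)
c_new = np.eye(3)
c_ref_occ = np.eye(3)[:, [2]]
occ = select_occupied(c_ref_occ, c_new, s, n_occ=1)
print(occ)
```

Keeping the reference fixed, as in IMOM, prevents the occupied set from drifting away from the targeted state over many iterations.
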

12.
J Phys Chem A ; 122(11): 3066-3075, 2018 Mar 22.
Article in English | MEDLINE | ID: mdl-29465999

ABSTRACT

Effective core potential (ECP) integrals are among the most difficult one-electron integrals to calculate due to the projection operators. The radial part of these operators may include $r^{0}$, $r^{-1}$, and $r^{-2}$ terms. For the $r^{0}$ terms, we exploit a simple analytic expression for the fundamental projected integral to derive new recurrence relations and upper bounds for ECP integrals. For the $r^{-1}$ and $r^{-2}$ terms, we present a reconstruction method that replaces these terms by a sum of $r^{0}$ terms and show that the resulting errors are chemically insignificant for a range of molecular properties. The new algorithm is available in Q-Chem 5.0 and is significantly faster than the ECP implementations in Q-Chem 4.4, GAMESS (US) and Dalton 2016.

13.
J Chem Theory Comput ; 14(1): 9-13, 2018 Jan 09.
Article in English | MEDLINE | ID: mdl-29272122

ABSTRACT

How many electrons are excited in an electronic transition? In this Letter, we introduce the excitation number η to answer this question when the initial and final states are each modeled by a single-determinant wave function. We show that calculated η values lie close to positive integers, leading to unambiguous assignments of the number of excited electrons. This contrasts with previous definitions of excitation quantities which can lead to mis-assignments. We consider several examples where η provides improved excited-state characterizations.

14.
J Chem Phys ; 147(2): 024103, 2017 Jul 14.
Article in English | MEDLINE | ID: mdl-28711054

ABSTRACT

We report the three main ingredients to calculate three- and four-electron integrals over Gaussian basis functions involving Gaussian geminal operators: fundamental integrals, upper bounds, and recurrence relations. In particular, we consider the three- and four-electron integrals that may arise in explicitly correlated F12 methods. A straightforward method to obtain the fundamental integrals is given. We derive vertical, transfer, and horizontal recurrence relations to build up angular momentum over the centers. Strong, simple, and scaling-consistent upper bounds are also reported. This latest ingredient allows us to compute only the $O(N^{2})$ significant three- and four-electron integrals, avoiding the computation of the very large number of negligible integrals.

15.
J Chem Theory Comput ; 12(10): 4915-4924, 2016 Oct 11.
Article in English | MEDLINE | ID: mdl-27598837

ABSTRACT

The evaluation of contracted two-electron integrals over a Gaussian geminal operator is pivotal to diverse quantum chemistry methods. In this article, using the unique factorization properties and the sparsity of these integrals, a novel, near-optimal computation algorithm is presented. Our method employs a combination of recently developed upper bounds, recurrence relations in the spirit of the Head-Gordon-Pople approach, and late- and early-contraction paths in the PRISM style. A detailed study of the FLOP (floating-point operations) cost reveals that the new algorithm is computationally much cheaper than any other previous scheme.

16.
J Chem Theory Comput ; 12(4): 1735-1740, 2016 Apr 12.
Article in English | MEDLINE | ID: mdl-26981747

ABSTRACT

Explicitly correlated F12 methods are becoming the first choice for high-accuracy molecular orbital calculations and can often achieve chemical accuracy with relatively small Gaussian basis sets. In most calculations, the many three- and four-electron integrals that formally appear in the theory are avoided through judicious use of resolutions of the identity (RI). However, for the intrinsic accuracy of the F12 wave function to not be jeopardized, the associated RI auxiliary basis set must be large. Here, inspired by the Head-Gordon-Pople and PRISM algorithms for two-electron integrals, we present an algorithm to directly compute three-electron integrals over Gaussian basis functions and a very general class of three-electron operators without invoking RI approximations. A general methodology to derive vertical, transfer, and horizontal recurrence relations is also presented.

17.
J Chem Phys ; 141(11): 111104, 2014 Sep 21.
Article in English | MEDLINE | ID: mdl-25240338

ABSTRACT

Hartree-Fock (HF) theory is most often applied to study the electronic ground states of molecular systems. However, with the advent of numerical techniques for locating higher solutions of the self-consistent field equations, it is now possible to examine the extent to which such mean-field solutions are useful approximations to electronic excited states. In this Communication, we use the maximum overlap method to locate 11 low-energy solutions of the HF equation for the H₂ molecule and we find that, with only one exception, these yield surprisingly accurate models for the low-lying excited states of this molecule. This finding suggests that the HF solutions could be useful first-order approximations for correlated excited state wavefunctions.
