J Chem Theory Comput; 20(3): 1130-1142, 2024 Feb 13.
Article in English | MEDLINE | ID: mdl-38306601

ABSTRACT

In this work, we benchmark several Python routines for time and memory requirements to identify the best of the available tensor contraction operations. We scrutinize how to accelerate the bottleneck tensor operations of Pythonic coupled-cluster implementations in the Cholesky linear algebra domain, utilizing an NVIDIA Tesla V100S PCIe 32GB (rev 1a) graphics processing unit (GPU). CuPy, an open-source Python library designed as a NumPy drop-in replacement for GPUs, interfaces with the NVIDIA Compute Unified Device Architecture (CUDA) API. Due to the limitations of video memory, the GPU calculations must be performed batch-wise. Timing results of some contractions containing large tensors are presented. The CuPy implementation speeds up the bottleneck tensor contractions by a factor of 10-16 compared to computations on 36 central processing unit (CPU) cores. Finally, we compare example CCSD and pCCD-LCCSD calculations performed solely on CPUs to their CPU-GPU hybrid implementation, which is faster than the CPU-only variant by a factor of 3-4.
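
The batch-wise GPU strategy mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the contraction chosen here (rebuilding a four-index tensor from Cholesky vectors, out[a,b,i,j] = sum_x L[x,a,i] L[x,b,j]), the function name `contract_in_batches`, and the `batch_size` parameter are all assumptions made for illustration. The code uses CuPy when it and a CUDA device are available and falls back to NumPy otherwise, relying on the drop-in compatibility of the two libraries.

```python
import numpy as np

# Use CuPy if it is installed and a CUDA device is visible; otherwise fall
# back to NumPy so the sketch still runs on a CPU-only machine.
try:
    import cupy as xp
    if xp.cuda.runtime.getDeviceCount() == 0:
        raise RuntimeError("no CUDA device found")
    to_host = xp.asnumpy
except Exception:
    xp = np
    to_host = np.asarray


def contract_in_batches(chol, batch_size):
    """Hypothetical helper: out[a,b,i,j] = sum_x chol[x,a,i] * chol[x,b,j].

    The summation index x is split into slices of at most `batch_size`, so
    only one slice of `chol` has to be resident in device memory at a time.
    """
    n_x, n_a, n_i = chol.shape
    # Accumulator lives on the device (or in host memory for the NumPy path).
    out = xp.zeros((n_a, n_i, n_a, n_i), dtype=chol.dtype)
    for start in range(0, n_x, batch_size):
        # Copy one slice of the host array to the device.
        block = xp.asarray(chol[start:start + batch_size])
        # Contract over the batched index x; result has index order a,i,b,j.
        out += xp.tensordot(block, block, axes=([0], [0]))
    # Reorder to a,b,i,j and copy the result back to host memory.
    return to_host(out.transpose(0, 2, 1, 3))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    chol = rng.standard_normal((7, 3, 4))  # toy dimensions, not from the paper
    result = contract_in_batches(chol, batch_size=3)
    print(result.shape)
```

In this sketch a smaller `batch_size` lowers the peak device memory needed for the input slices at the cost of more host-to-device transfers; the accumulator itself must still fit on the device.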
