gpu4pyscf
CUBLAS_STATUS_EXECUTION_FAILED
Hi, I am trying to replicate the example https://github.com/pyscf/gpu4pyscf/blob/master/examples/00-h2o.py using a benzene molecule instead of water, and I get the same error as when running the https://github.com/pyscf/gpu4pyscf/blob/master/examples/07-transition_state.py example with the molecule defined in that file.
The error that I obtain is:
Traceback (most recent call last):
  File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/site-packages/gpu4pyscf/df/df_jk.py", line 63, in init_workflow
    rks.initialize_grids(mf, mf.mol, dm0)
  File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/site-packages/gpu4pyscf/dft/rks.py", line 83, in initialize_grids
    ks.grids = prune_small_rho_grids_(ks, ks.mol, dm, ks.grids)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/site-packages/gpu4pyscf/dft/rks.py", line 39, in prune_small_rho_grids_
    rho = ks._numint.get_rho(mol, dm, grids, ks.max_memory)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/site-packages/gpu4pyscf/dft/numint.py", line 721, in get_rho
    rho[p0:p1] = eval_rho2(mol, ao_mask, mo_coeff_mask, mo_occ, None, 'LDA', with_lapl)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/site-packages/gpu4pyscf/dft/numint.py", line 200, in eval_rho2
    c0 = _dot_ao_dm(mol, ao, cpos, non0tab, shls_slice, ao_loc)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/site-packages/gpu4pyscf/dft/numint.py", line 1476, in _dot_ao_dm
    return cupy.dot(dm.T, ao)
           ^^^^^^^^^^^^^^^^^^
  File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/site-packages/cupy/linalg/_product.py", line 63, in dot
    return a.dot(b, out)
           ^^^^^^^^^^^^^
  File "cupy/_core/core.pyx", line 1757, in cupy._core.core._ndarray_base.dot
  File "cupy/_core/_routines_linalg.pyx", line 536, in cupy._core._routines_linalg.dot
  File "cupy/_core/_routines_linalg.pyx", line 626, in cupy._core._routines_linalg.tensordot_core
  File "cupy/_core/_routines_linalg.pyx", line 763, in cupy._core._routines_linalg.tensordot_core_v11
  File "cupy_backends/cuda/libs/cublas.pyx", line 1426, in cupy_backends.cuda.libs.cublas.gemmEx
  File "cupy_backends/cuda/libs/cublas.pyx", line 1454, in cupy_backends.cuda.libs.cublas.gemmEx
  File "cupy_backends/cuda/libs/cublas.pyx", line 438, in cupy_backends.cuda.libs.cublas.check_status
cupy_backends.cuda.libs.cublas.CUBLASError: CUBLAS_STATUS_NOT_INITIALIZED
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/site-packages/pyscf/lib/misc.py", line 1104, in __exit__
    handler.result()
  File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/concurrent/futures/_base.py", line 456, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/site-packages/gpu4pyscf/df/df_jk.py", line 43, in build_df
    mf.with_df.build()
  File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/site-packages/gpu4pyscf/df/df.py", line 90, in build
    self._cderi = cholesky_eri_gpu(intopt, mol, auxmol, self.cd_low, omega=omega)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/site-packages/gpu4pyscf/df/df.py", line 265, in cholesky_eri_gpu
    cderi_block = solve_triangular(cd_low, ints_slices, lower=True, overwrite_b=False)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/site-packages/cupyx/scipy/linalg/_solve_triangular.py", line 97, in solve_triangular
    trsm(
  File "cupy_backends/cuda/libs/cublas.pyx", line 1109, in cupy_backends.cuda.libs.cublas.dtrsm
  File "cupy_backends/cuda/libs/cublas.pyx", line 1119, in cupy_backends.cuda.libs.cublas.dtrsm
  File "cupy_backends/cuda/libs/cublas.pyx", line 438, in cupy_backends.cuda.libs.cublas.check_status
cupy_backends.cuda.libs.cublas.CUBLASError: CUBLAS_STATUS_EXECUTION_FAILED
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/soralakers96/CODE/gpu4pyscf/gpu4pyscf/examples/07-transition_state.py", line 68, in
I am using an NVIDIA L40 with the pre-compiled package, installed via pip3 install gpu4pyscf-cuda12x.
It seems that CuPy didn't find cuBLAS. Can you make sure the CUDA Toolkit is installed on your system? If it is installed, you can check whether cupy.dot works properly.
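For reference, a minimal sketch of such a cupy.dot check. The helper name check_cupy_dot and the matrix size are made up for illustration; the script only assumes that NumPy is installed and reports a failure message instead of crashing if CuPy or the GPU is unavailable.

```python
import numpy as np


def check_cupy_dot(n=256):
    """Return (ok, message) after comparing cupy.dot against numpy on n x n matrices."""
    try:
        import cupy  # imported lazily so the script still runs without CuPy
    except Exception as exc:
        return False, f"CuPy is not importable: {exc}"

    try:
        rng = np.random.default_rng(0)
        a = rng.random((n, n))
        b = rng.random((n, n))
        # float64 matrix product on the GPU, compared with the CPU result
        c_gpu = cupy.dot(cupy.asarray(a), cupy.asarray(b))
        ok = bool(np.allclose(cupy.asnumpy(c_gpu), a @ b))
    except Exception as exc:  # e.g. CUBLASError or CUDARuntimeError
        return False, f"{type(exc).__name__}: {exc}"

    if ok:
        return True, "cupy.dot matches numpy"
    return False, "cupy.dot result differs from numpy"


if __name__ == "__main__":
    ok, message = check_cupy_dot()
    print("OK" if ok else "FAILED", "-", message)
```

A passing check on small matrices only shows that cuBLAS can be loaded and called; it does not rule out failures that appear once most of the GPU memory is in use.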
CUDA Toolkit is installed. When I run nvcc --version I obtain:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Nov__3_17:16:49_PDT_2023
Cuda compilation tools, release 12.3, V12.3.103
Build cuda_12.3.r12.3/compiler.33492891_0
I have also tried cupy.dot with a toy example and it works.
@GiacomoDG96 OK, great. Possibly the GPU doesn't have enough free memory left for the cuBLAS handle. Can you try limiting the CuPy memory pool? https://docs.cupy.dev/en/stable/user_guide/memory.html#limiting-gpu-memory-usage
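A minimal sketch of what limiting the pool could look like, following the CuPy page linked above. The helper name limit_cupy_pool and the 0.8 fraction are arbitrary choices for illustration, not values recommended by gpu4pyscf; the right limit for an L40 would need to be found by experiment.

```python
def limit_cupy_pool(fraction=0.8):
    """Cap CuPy's default memory pool at a fraction of the device memory.

    Returns the resulting limit in bytes, or None if CuPy/GPU is unavailable.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    try:
        import cupy

        pool = cupy.get_default_memory_pool()
        # Leave the remaining memory free for allocations made outside the pool
        pool.set_limit(fraction=fraction)
        return pool.get_limit()
    except Exception as exc:
        print(f"Could not set CuPy memory limit: {type(exc).__name__}: {exc}")
        return None


if __name__ == "__main__":
    # Call this before building the molecule and running the gpu4pyscf calculation
    limit = limit_cupy_pool(0.8)
    if limit is not None:
        print(f"CuPy memory pool limited to {limit / 1024**3:.1f} GiB")
```

The same page also describes the CUPY_GPU_MEMORY_LIMIT environment variable (for example "50%"), which sets the limit without changing the script.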