
cuBLAS API failed with status 15 - Error

Open rmivdc opened this issue 1 year ago • 27 comments

Hi, during the `finetune.py` launch I'm encountering the error in the title. I'm using Fedora 36 with CUDA 12 and Python 3.10.10. Initialization seems to begin like this:

```
CUDA SETUP: CUDA runtime path found: /usr/local/cuda-12.0/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.6
CUDA SETUP: Detected CUDA version 120
```

and then later, after loading some files:

```
Loading cached split indices for dataset at /home/rmivdc/.cache/huggingface/datasets/json/default-fac87d4e05e14783/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51/cache-e521db28b6879419.arrow and /home/rmivdc/.cache/huggingface/datasets/json/default-fac87d4e05e14783/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51/cache-eb712e2459ca28b6.arrow
/home/rmivdc/.local/lib/python3.10/site-packages/transformers/optimization.py:391: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
  warnings.warn(
  0%|          | 0/1170 [00:00<?, ?it/s]
cuBLAS API failed with status 15
A: torch.Size([2048, 4096]), B: torch.Size([4096, 4096]), C: (2048, 4096); (lda, ldb, ldc): (c_int(65536), c_int(131072), c_int(65536)); (m, n, k): (c_int(2048), c_int(4096), c_int(4096))
```

Am I using the wrong library versions? Thanks for your help.
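For reference, status 15 corresponds to `CUBLAS_STATUS_NOT_SUPPORTED` in the cuBLAS headers. The logged leading dimensions also appear internally consistent with the tiled COL32 int8 layout that bitsandbytes uses, where the leading dimension is 32 times the row count. A minimal sketch of that arithmetic (the `col32_ld` helper is hypothetical, just illustrating the relation; it suggests the shapes themselves are fine and the failure is more likely a CUDA 12 / bitsandbytes compatibility issue):

```python
def col32_ld(rows: int) -> int:
    # In the COL32 ordering, each group of 32 columns is stored
    # contiguously, so the leading dimension is rows * 32.
    return rows * 32

# Values taken from the error message above:
m, n, k = 2048, 4096, 4096
lda, ldb, ldc = 65536, 131072, 65536

assert col32_ld(m) == lda  # 2048 * 32 == 65536
assert col32_ld(n) == ldb  # 4096 * 32 == 131072
assert col32_ld(m) == ldc  # C has the same row count as A
print("leading dimensions consistent with a COL32 layout")
```

Since the dimensions check out, the matmul call itself looks well-formed; that points toward the runtime environment (e.g. a bitsandbytes build that doesn't support the detected CUDA version) rather than the model code.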

rmivdc avatar Mar 26 '23 20:03 rmivdc