FastChat
CUDA error with parameter --num-gpus 2
GPUs: 2 x RTX 4090 (24 GB each).
I ran the command below:
python3 -m fastchat.serve.cli --model-path ~/vicuna-13b-1_1-hf/ --num-gpus 2
Everything works until I enter a prompt. When I typed "hello", I got the error below:
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
How to fix it?
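The error message itself suggests passing CUDA_LAUNCH_BLOCKING=1 so that the stack trace points at the call that actually failed. A minimal sketch of launching the same command with that variable set is below. The helper names (build_debug_env, build_command, run_cli) are hypothetical and only for illustration; the module name and flags are the ones from the command above.

```python
import os
import subprocess
import sys


def build_debug_env(base_env=None):
    """Return a copy of the environment with synchronous CUDA launches enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # Makes kernel launches synchronous, so errors surface at the failing call.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def build_command(model_path, num_gpus):
    """Build the fastchat.serve.cli invocation quoted in this thread."""
    # sys.executable is used instead of "python3" so the same interpreter
    # (and therefore the same virtual environment) runs the CLI.
    return [
        sys.executable, "-m", "fastchat.serve.cli",
        "--model-path", model_path,
        "--num-gpus", str(num_gpus),
    ]


def run_cli(model_path, num_gpus):
    """Run the CLI with CUDA_LAUNCH_BLOCKING=1 and return its exit code."""
    result = subprocess.run(build_command(model_path, num_gpus),
                            env=build_debug_env(), check=False)
    return result.returncode
```

For example, run_cli(os.path.expanduser("~/vicuna-13b-1_1-hf/"), 2) would start the same session as above. Generation will be slower with blocking launches, so this is only useful for debugging.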
I have the same setup and am seeing the same error
I was able to see more detailed errors when using the CLI:
root@epyc:~/FastChat# python3 -m fastchat.serve.cli --model-path /root/FastChat/vicuna-13b/ --num-gpus 2
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████| 3/3 [00:11<00:00, 3.71s/it]
USER: What if the Internet had been invented during the Renaissance period
ASSISTANT: iques местventory►iour goals норedesuce DouglasSito "% ChristophHD
../aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [96,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
../aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [97,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
../aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [98,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
../aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [99,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
...repeats many times...
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasGemmEx( handle, opa, opb, m, n, k, &falpha,
a, CUDA_R_16F, lda, b, CUDA_R_16F, ldb, &fbeta, c, CUDA_R_16F, ldc, CUDA_R_32F, CUBLAS_GEMM_DFALT_TENSOR_OP)`
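The assertion text in the log spells out the condition being checked: each index must satisfy 0 <= idx < index_size. The sketch below restates that check in plain Python to show what "index out of bounds" means here. It is an illustration only, not PyTorch's kernel code, and gather_checked is a hypothetical name.

```python
def gather_checked(source, indices):
    """Pick source[idx] for each idx, rejecting any index outside [0, len(source))."""
    size = len(source)
    out = []
    for pos, idx in enumerate(indices):
        # Same condition as the assertion in the log: idx >= 0 and idx < size.
        if not (0 <= idx < size):
            raise IndexError(
                f"index {idx} at position {pos} is out of bounds for size {size}"
            )
        out.append(source[idx])
    return out
```

On the CPU a violation like this raises a readable exception. On the GPU the same condition trips a device-side assert, which is why the log shows one assertion line per thread followed by a failure in a later, unrelated cuBLAS call.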
Same problem. Can anybody help?
same problem here
same
This looks like a failed installation of PyTorch/CUDA/cuBLAS. Please create a new virtual environment and reinstall the packages there.
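After reinstalling, a quick sanity check along these lines may help confirm that PyTorch sees both GPUs and can run a half-precision matrix multiply on each one, which is the kind of operation that failed in the cublasGemmEx call above. This is a rough sketch, not an official diagnostic: check_install is a hypothetical helper, and a passing result does not prove the installation is fully correct.

```python
def check_install():
    """Collect basic facts about the PyTorch/CUDA install and test each GPU."""
    report = {"torch_installed": False, "cuda_available": False, "devices": []}
    try:
        import torch
    except ImportError:
        return report  # PyTorch is not importable in this environment

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    report["cuda_build"] = torch.version.cuda  # None for CPU-only builds
    report["cuda_available"] = torch.cuda.is_available()
    if not report["cuda_available"]:
        return report

    for i in range(torch.cuda.device_count()):
        entry = {"index": i, "name": torch.cuda.get_device_name(i)}
        try:
            a = torch.randn(64, 64, device=f"cuda:{i}", dtype=torch.float16)
            b = torch.randn(64, 64, device=f"cuda:{i}", dtype=torch.float16)
            # .item() copies the result to the host, forcing the kernel to finish.
            (a @ b).sum().item()
            entry["fp16_matmul_ok"] = True
        except RuntimeError as exc:
            entry["fp16_matmul_ok"] = False
            entry["error"] = str(exc)
        report["devices"].append(entry)
    return report


if __name__ == "__main__":
    for key, value in check_install().items():
        print(f"{key}: {value}")
```

If cuda_build is None, or fp16_matmul_ok is False for either device, the environment itself is the likely problem rather than FastChat.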