FastChat CUDA error with parameter --num-gpus 2

GPUs: 2 x RTX 4090 24G

I ran the command below:

python3 -m fastchat.serve.cli --model-path ~/vicuna-13b-1_1-hf/ --num-gpus 2

Everything was fine until I entered a prompt. When I typed "hello", I got the error below:

RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

How can I fix it?

May 20 '23 17:05 Fjallraven-hc
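
The error text above suggests passing CUDA_LAUNCH_BLOCKING=1 so that the stack trace points at the kernel that actually failed. A minimal sketch of relaunching the same command with that variable set might look like the following. The helper name build_debug_launch is hypothetical and not part of FastChat; the module name and flags are the ones from the post.

```python
import os
import shlex
import sys


def build_debug_launch(model_path, num_gpus=2):
    """Return (cmd, env) for rerunning the FastChat CLI with synchronous CUDA launches.

    build_debug_launch is a hypothetical helper for illustration only.
    """
    env = dict(os.environ)
    # Named in the error message: makes kernel launches synchronous so the
    # reported stack trace matches the failing call.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    cmd = [
        sys.executable, "-m", "fastchat.serve.cli",
        # "~" is not expanded without a shell, so expand it here.
        "--model-path", os.path.expanduser(model_path),
        "--num-gpus", str(num_gpus),
    ]
    return cmd, env


if __name__ == "__main__":
    cmd, env = build_debug_launch("~/vicuna-13b-1_1-hf/", num_gpus=2)
    # Print the equivalent shell line; pass cmd and env to subprocess.run to execute it.
    print("CUDA_LAUNCH_BLOCKING=1 " + shlex.join(cmd))
```

This does not fix the error; it only makes the traceback more likely to show where the assert fires.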

I have the same setup and am seeing the same error.

Jun 03 '23 21:06 mingfang

I was able to see more detailed errors when using the CLI:

root@epyc:~/FastChat# python3 -m fastchat.serve.cli --model-path /root/FastChat/vicuna-13b/ --num-gpus 2
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████| 3/3 [00:11<00:00,  3.71s/it]
USER: What if the Internet had been invented during the Renaissance period
ASSISTANT: iques местventory►iour goals норedesuce DouglasSito "% ChristophHD ../aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [96,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
../aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [97,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
../aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [98,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
../aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [99,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.

...repeats many times...

RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasGemmEx( handle, opa, opb, m, n, k, &falpha,
a, CUDA_R_16F, lda, b, CUDA_R_16F, ldb, &fbeta, c, CUDA_R_16F, ldc, CUDA_R_32F, CUBLAS_GEMM_DFALT_TENSOR_OP)`

Jun 03 '23 22:06 mingfang

Same problem. Can anybody help?

Jun 13 '23 10:06 RogerYu123

same problem here

Jun 19 '23 07:06 ecfm

same

Sep 11 '23 03:09 LeoCeasar

This looks like a failed installation of PyTorch/CUDA/cuBLAS. Please create a new virtual environment and reinstall the packages there.

Oct 23 '23 09:10 surak
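
Before or after reinstalling, one way to test the installation independently of FastChat is a small half-precision matrix multiply on each GPU, since the log above fails inside a cuBLAS fp16 GEMM. The sketch below is an illustration, not a FastChat tool; check_install and the report keys are made-up names, and a passing result does not rule out other causes.

```python
def check_install():
    """Report whether PyTorch imports, sees CUDA, and can run an fp16 matmul per GPU."""
    report = {"torch_importable": False, "cuda_available": False, "devices": {}}
    try:
        import torch
    except ImportError as exc:
        report["error"] = str(exc)
        return report

    report["torch_importable"] = True
    report["torch_version"] = torch.__version__
    report["built_with_cuda"] = torch.version.cuda  # None for CPU-only builds
    report["cuda_available"] = torch.cuda.is_available()
    if not report["cuda_available"]:
        return report

    for i in range(torch.cuda.device_count()):
        entry = {"name": torch.cuda.get_device_name(i)}
        try:
            a = torch.randn(256, 256, dtype=torch.float16, device=f"cuda:{i}")
            b = a @ a  # fp16 GEMM, dispatched to cuBLAS
            torch.cuda.synchronize(i)  # surface asynchronous errors here
            entry["fp16_matmul_ok"] = bool(torch.isfinite(b).all().item())
        except RuntimeError as exc:
            entry["fp16_matmul_ok"] = False
            entry["error"] = str(exc)
        report["devices"][i] = entry
    return report


if __name__ == "__main__":
    for key, value in check_install().items():
        print(f"{key}: {value}")
```

If the matmul fails on one or both devices, that would be consistent with the broken-installation explanation; if it passes on both, the problem may lie elsewhere.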
