gpt-fast
RuntimeError: CUDA error: named symbol not found
I am trying to quantize llama-2-7b-chat-hf with gpt-fast on Kaggle (2x T4 GPUs) using: python quantize.py --mode int4 --groupsize 32. I installed the PyTorch nightly build with: pip install torch==2.3.0.dev20240117+cu121 --index-url https://download.pytorch.org/whl/nightly/cu121
I also tried changing the dtype from torch.bfloat16 to torch.float32, but got the same error.
This is the output I get:
Loading model ...
Quantizing model weights for int4 weight-only affine per-channel groupwise quantization
linear: layers.0.attention.wqkv, in=4096, out=12288
linear: layers.0.attention.wo, in=4096, out=4096
Traceback (most recent call last):
File "/kaggle/working/quantize.py", line 605, in TORCH_USE_CUDA_DSA to enable device-side assertions.
Did you figure this out? I have the same problem.
Try another GPU.
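Before switching hardware, it may help to check the compute capability of the GPUs you have. The T4 is compute capability 7.5. The sketch below assumes the int4 kernels need 8.0 (Ampere) or newer; that threshold is my assumption and is not confirmed anywhere in this thread. The helper names supports_int4_kernels and describe_devices are made up for illustration, while the torch.cuda calls are the standard PyTorch ones.

```python
# Assumed minimum compute capability for the int4 kernels (not confirmed in this thread).
ASSUMED_MIN_CAPABILITY = (8, 0)


def supports_int4_kernels(capability, minimum=ASSUMED_MIN_CAPABILITY):
    """Return True if a (major, minor) compute capability meets the assumed minimum."""
    return tuple(capability) >= tuple(minimum)


def describe_devices():
    """List (name, capability, meets_assumed_minimum) for each visible CUDA device.

    Returns an empty list if PyTorch is not installed or no CUDA device is visible.
    """
    try:
        import torch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    report = []
    for index in range(torch.cuda.device_count()):
        capability = torch.cuda.get_device_capability(index)
        name = torch.cuda.get_device_name(index)
        report.append((name, capability, supports_int4_kernels(capability)))
    return report


if __name__ == "__main__":
    for name, capability, ok in describe_devices():
        status = "meets" if ok else "is below"
        print(f"{name}: sm_{capability[0]}{capability[1]} {status} the assumed minimum")
```

If every device is reported as below the assumed minimum, moving to a newer GPU is probably the quickest test of whether the hardware is the cause.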