GPTQ-for-LLaMa
CUDA error: unknown error (error when quantizing a LLaMA model)
My config:
WSL2 on Windows 10, GPU: NVIDIA GTX 1660 Super, torch 2.0 installed.
MODEL_DIR points to a 13B LLaMA model folder in HF format (it's Vicuna).
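As a sanity check that CUDA itself works under this WSL2 setup, something like the following minimal sketch can be run first (the tensor size is arbitrary):

```python
import torch

# Confirm the WSL2 CUDA stack is visible to PyTorch.
print(torch.__version__, "CUDA runtime:", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())
print("Device 0:", torch.cuda.get_device_name(0))

# Do one real host-to-device copy and kernel launch, the same
# kind of operation that later fails during quantization.
x = torch.randn(1024, 1024, device="cuda")
print("Matmul OK:", (x @ x).sum().item())
```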
When I run CUDA_VISIBLE_DEVICES=0 CUDA_LAUNCH_BLOCKING=1 python llama.py ${MODEL_DIR} c4 --wbits 4 --true-sequential --act-order --groupsize 128 --save llama7b-4bit-128g.pt (CUDA_LAUNCH_BLOCKING=1 is set only for debugging), I get:
Starting ...
Ready.
Traceback (most recent call last):
  File "/mnt/d/DataScience/GPTQ-for-LLaMa/llama.py", line 452, in <module>
    quantizers = llama_sequential(model, dataloader, DEV)
  File "/home/ostix/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/d/DataScience/GPTQ-for-LLaMa/llama.py", line 72, in llama_sequential
    layer = layers[i].to(dev)
  File "/home/ostix/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1145, in to
    return self._apply(convert)
  File "/home/ostix/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 797, in _apply
    module._apply(fn)
  File "/home/ostix/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 797, in _apply
    module._apply(fn)
  File "/home/ostix/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 820, in _apply
    param_applied = fn(param)
  File "/home/ostix/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1143, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
RuntimeError: CUDA error: unknown error
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
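The traceback shows the failure happens at `layer = layers[i].to(dev)` in `llama_sequential`, i.e. when moving a single decoder layer onto the GPU. That step can be reproduced in isolation with a sketch like this (`nn.Linear` is only a hypothetical stand-in for a real decoder layer; 5120 is the 13B hidden size):

```python
import torch
import torch.nn as nn

# Stand-in for one decoder layer: a float16 module moved to the
# GPU with the same .to(device) call that raises in llama.py.
layer = nn.Linear(5120, 5120).half()
try:
    layer = layer.to("cuda")
    free, total = torch.cuda.mem_get_info()
    print(f"Moved OK; VRAM free/total: {free}/{total} bytes")
except RuntimeError as e:
    print("Same failure outside the quantizer:", e)
```

If this minimal move also fails, the problem would be with the CUDA/WSL2 setup rather than with this repo or the model.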
I tried both the cuda and triton branches and got the same error. Maybe my problem is with Vicuna and not with this repo. Thanks for your help.
Even with the 7B LLaMA model I get the same error.