GPTQ-triton

Needs more VRAM than normal GPTQ CUDA version?

Open DanielWe2 opened this issue 1 year ago • 3 comments

Thanks, I wanted to try your Triton version, but I only have 8 GB of VRAM.

The GPTQ CUDA version works (7B model). Your version (the ppl script) crashes with a CUDA OOM error.

Is that to be expected or can that be solved?

DanielWe2, Mar 28 '23 19:03