GPTQ-for-LLaMa
Porting GPTQ to CPU?
Is it possible to run GPTQ on a machine that has only CPUs? If not, is there a plan for it?
I believe you can use a GPTQ-quantized model with llama.cpp via this conversion script.
Can you just quantize the model on the CPU?
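On the question of quantizing on CPU: the quantization step itself is ordinary tensor math, so nothing about it inherently requires a GPU (it is just much slower). As a hedged illustration, here is a minimal per-row 4-bit round-to-nearest sketch in NumPy; note this is NOT the GPTQ algorithm from this repo, which additionally corrects rounding error column by column using second-order (Hessian) information, but it shows the kind of computation involved runs fine on CPU.

```python
import numpy as np

def quantize_rtn_4bit(W):
    """Per-row 4-bit round-to-nearest quantization (CPU-only sketch).

    GPTQ proper is more sophisticated (Hessian-based error correction);
    this simpler RTN scheme just shows quantization needs no GPU.
    """
    # One scale per output row, mapping weights into the int4 range [-8, 7].
    scale = np.abs(W).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(W / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an approximation of the original float weights.
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 16)).astype(np.float32)  # stand-in weight matrix
q, scale = quantize_rtn_4bit(W)
W_hat = dequantize(q, scale)
# Rounding error per element is bounded by half a quantization step.
err = np.abs(W - W_hat).max()
print(q.min(), q.max())
print(err <= 0.5 * scale.max() + 1e-6)
```

Running the full GPTQ procedure on CPU is the same story in principle, just with the Hessian accumulation and error-propagation steps on top, so expect it to be considerably slower than on a GPU.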