
Support for GPTQ-LLAMA

Open MatthewCYM opened this issue 2 years ago • 1 comment

Hi,

Can I use a GPTQ-quantized model for inference?

https://github.com/qwopqwop200/GPTQ-for-LLaMa

Thank you!

MatthewCYM avatar Apr 16 '23 17:04 MatthewCYM

For now, I think you can try running a GPTQ-quantized Vicuna in other ecosystems like GPT4All.

zhisbug avatar May 08 '23 08:05 zhisbug
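
For context on what the question is asking about: GPTQ stores weights as low-bit integers with per-group scales and dequantizes them at inference time. Below is a toy round-to-nearest sketch of that weight-quantization idea — it is *not* the actual GPTQ algorithm (which additionally uses second-order information to correct rounding error), and the function names here are illustrative, not from GPTQ-for-LLaMa's API.

```python
import numpy as np

def quantize_rtn(w, bits=4, group_size=8):
    """Toy round-to-nearest weight quantization.

    Weights are split into groups; each group gets its own float scale,
    and the weights themselves are stored as small integers.
    """
    qmax = 2 ** (bits - 1) - 1                      # 7 for symmetric int4
    w = w.reshape(-1, group_size)
    scale = np.abs(w).max(axis=1, keepdims=True) / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights for inference."""
    return (q * scale).reshape(-1)

rng = np.random.default_rng(0)
w = rng.normal(size=64).astype(np.float32)
q, scale = quantize_rtn(w)
w_hat = dequantize(q, scale)
max_err = np.abs(w - w_hat).max()
```

The payoff is memory: each weight occupies 4 bits plus a shared per-group scale instead of 16 or 32 bits, at the cost of a small reconstruction error (`max_err` above, bounded by half a quantization step per group).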