FastChat
Support for GPTQ-for-LLaMa
Hi,
Can I use a GPTQ-quantized model for inference?
https://github.com/qwopqwop200/GPTQ-for-LLaMa
Thank you!
For now, I think you can try a GPTQ-quantized Vicuna in other ecosystems like GPT4All.