
FastChat - error on 4bit GPTQ

Open steppige opened this issue 1 year ago • 1 comments

Hi, since updating FastChat to version 0.2.2 I can no longer get 4-bit GPTQ to work. I get this error:

python3 -m fastchat.serve.cli --model-path models/TheBloke_vicuna-7B-1.1-GPTQ-4bit-128g --wbits 4 --groupsize 128
usage: cli.py [-h] [--model-path MODEL_PATH] [--device {cpu,cuda,mps}] [--num-gpus NUM_GPUS] [--load-8bit] [--conv-template CONV_TEMPLATE] [--temperature TEMPERATURE] [--max-new-tokens MAX_NEW_TOKENS] [--style {simple,rich}] [--debug]
cli.py: error: unrecognized arguments: --wbits 4 --groupsize 128

How can I fix this? Thank you bye!

steppige, Apr 16 '23 23:04

I don't think we have any official support for GPTQ-4bit. But I'll take a look at GPTQ this week and update on this issue.
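For anyone hitting the same message: the usage text in the error looks like standard Python `argparse` output, and it lists no `--wbits` or `--groupsize` option. The sketch below is a minimal reproduction under that assumption, not FastChat's actual code. `build_parser` and `unrecognized_flags` are hypothetical names, and the argument types and defaults are guesses; only the flag names and choices come from the usage line in the report.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the CLI parser: it defines only the flags
    # that appear in the usage line of the error report. Types and defaults
    # are assumptions made for illustration.
    parser = argparse.ArgumentParser(prog="cli.py")
    parser.add_argument("--model-path", type=str)
    parser.add_argument("--device", choices=["cpu", "cuda", "mps"], default="cuda")
    parser.add_argument("--num-gpus", type=str)
    parser.add_argument("--load-8bit", action="store_true")
    parser.add_argument("--conv-template", type=str)
    parser.add_argument("--temperature", type=float)
    parser.add_argument("--max-new-tokens", type=int)
    parser.add_argument("--style", choices=["simple", "rich"], default="simple")
    parser.add_argument("--debug", action="store_true")
    return parser


def unrecognized_flags(argv):
    # parse_known_args() returns (namespace, leftovers) instead of exiting,
    # so the arguments the parser does not define can be inspected directly.
    _, unknown = build_parser().parse_known_args(argv)
    return unknown


if __name__ == "__main__":
    argv = [
        "--model-path", "models/TheBloke_vicuna-7B-1.1-GPTQ-4bit-128g",
        "--wbits", "4",
        "--groupsize", "128",
    ]
    print(unrecognized_flags(argv))  # ['--wbits', '4', '--groupsize', '128']
```

With `parse_args()` instead of `parse_known_args()`, the same leftovers make the parser exit with "unrecognized arguments", which matches the report. In other words, the installed version does not define these flags, so no change to the command line can make them work until the CLI itself adds them.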

zhisbug, May 08 '23 08:05

@zhisbug Hi, I've opened a new PR to add GPTQ 4-bit support. Could you take a look and give some advice? Thanks! #1209

alanxmay, May 15 '23 11:05

@steppige is this still an issue?

surak, Oct 21 '23 15:10