FastChat
I'm trying to load a safetensors file and getting this error:
Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory models/.
Command:
python3 -m fastchat.serve.cli --model-path models/ --num-gpus 4
You can use a command like the one below:
python3 -m fastchat.serve.model_worker \
    --model-path models/vicuna-7B-1.1-GPTQ-4bit-128g \
    --gptq-ckpt models/vicuna-7B-1.1-GPTQ-4bit-128g/vicuna-7B-1.1-GPTQ-4bit-128g.safetensors \
    --gptq-wbits 4 \
    --gptq-groupsize 128 \
    --gptq-act-order
To use the command above, you need to install the fastest-inference-4bit branch of GPTQ-for-LLaMa: https://github.com/qwopqwop200/GPTQ-for-LLaMa/tree/fastest-inference-4bit
The full set of changes is in this branch, which has not been merged yet:
https://github.com/alanxmay/FastChat/tree/fastest-gptq-4bit-support
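To confirm why the loader complains, it can help to check what is actually in the model directory. Below is a minimal sketch using only the Python standard library. The list of file names is copied from the error message in this thread; the function name `inspect_model_dir` is hypothetical, and the check only looks at file names, so it is not a reproduction of the loader's real lookup logic.

```python
from pathlib import Path

# File names copied from the error message quoted in this thread.
DEFAULT_WEIGHT_NAMES = (
    "pytorch_model.bin",
    "tf_model.h5",
    "model.ckpt.index",
    "flax_model.msgpack",
)


def inspect_model_dir(model_dir):
    """Hypothetical helper: report which weight files a directory holds.

    Returns a tuple (defaults_found, safetensors_found), both sorted lists
    of file names located directly inside model_dir.
    """
    root = Path(model_dir)
    # Default checkpoint names that the error message says were not found.
    defaults_found = sorted(
        name for name in DEFAULT_WEIGHT_NAMES if (root / name).is_file()
    )
    # Any .safetensors checkpoints sitting in the same directory.
    safetensors_found = sorted(p.name for p in root.glob("*.safetensors"))
    return defaults_found, safetensors_found
```

Calling `inspect_model_dir("models/")` and getting an empty first list with a non-empty second list would match the situation described here: the directory only holds a safetensors checkpoint, so the checkpoint path has to be passed explicitly, as in the `--gptq-ckpt` command above.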