
impossible to load vicuna-13B-1.1-GPTQ-4bit-128g

Open gandolfi974 opened this issue 1 year ago • 2 comments

hello,

I can't load the model vicuna-13B-1.1-GPTQ-4bit-128g. I have installed GPTQ-for-LLaMa (`git clone https://github.com/oobabooga/GPTQ-for-LLaMa.git -b cuda`) and Python 3.9.

thanks for your help.

python -m fastchat.serve.cli --model-name TheBloke/vicuna-13B-1.1-GPTQ-4bit-128g --wbits 4 --groupsize 128

Loading GPTQ quantized model...
Loading model ...
Traceback (most recent call last):
  File "/home/ryzen/AI/miniconda3/envs/vicuna/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/ryzen/AI/miniconda3/envs/vicuna/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/ryzen/AI/FastChat/fastchat/serve/cli.py", line 162, in <module>
    main(args)
  File "/home/ryzen/AI/FastChat/fastchat/serve/cli.py", line 102, in main
    model = load_quantized(model_name)
  File "/home/ryzen/AI/FastChat/fastchat/serve/load_gptq_model.py", line 62, in load_quantized
    model = load_quant(str(path_to_model), str(pt_path), wbits, groupsize, kernel_switch_threshold=threshold)
  File "/home/ryzen/AI/FastChat/fastchat/serve/load_gptq_model.py", line 37, in load_quant
    model.load_state_dict(safe_load(checkpoint))
  File "/home/ryzen/AI/miniconda3/envs/vicuna/lib/python3.9/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for LlamaForCausalLM:
	Missing key(s) in state_dict: "model.layers.0.self_attn.k_proj.bias", "model.layers.0.self_attn.o_proj.bias", .......
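The "Missing key(s)" error means the checkpoint on disk does not contain parameters that the instantiated `LlamaForCausalLM` expects (here, attention-projection biases). One way to narrow down such a mismatch is to diff the two key sets before calling `load_state_dict`. A minimal sketch in plain Python over key names (the key lists below are illustrative, mimicking the error above, not the real checkpoint):

```python
# Sketch: diff the parameter names a model expects against what a
# checkpoint actually contains, to see which side is missing keys.
def diff_state_dict_keys(model_keys, ckpt_keys):
    model_keys, ckpt_keys = set(model_keys), set(ckpt_keys)
    return {
        "missing": sorted(model_keys - ckpt_keys),     # expected but absent
        "unexpected": sorted(ckpt_keys - model_keys),  # present but unused
    }

# Illustrative key names only (not taken from the real checkpoint).
model_keys = [
    "model.layers.0.self_attn.k_proj.weight",
    "model.layers.0.self_attn.k_proj.bias",
    "model.layers.0.self_attn.o_proj.bias",
]
ckpt_keys = ["model.layers.0.self_attn.k_proj.weight"]

report = diff_state_dict_keys(model_keys, ckpt_keys)
print(report["missing"])
# → ['model.layers.0.self_attn.k_proj.bias', 'model.layers.0.self_attn.o_proj.bias']
```

With a real model you would feed in `model.state_dict().keys()` and the checkpoint's keys; if only bias keys are missing, the checkpoint was likely quantized from a model variant without those biases, and the loader's model definition does not match it.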

gandolfi974 avatar May 24 '23 08:05 gandolfi974

same here

chengyanwu avatar May 27 '23 14:05 chengyanwu

similar problem

Trangle avatar Jun 15 '23 15:06 Trangle

As a temporary workaround, you can convert the GPTQ 4-bit model locally. I will test compatibility with other models released by TheBloke.

alanxmay avatar Jun 17 '23 05:06 alanxmay

Well @alanxmay, look who showed up to this party! LOL. Now that #2050 is resolved, here I am too. I found the following, which I believe is a clue to this issue, but it is way beyond my current ability to understand:

https://discuss.pytorch.org/t/solved-keyerror-unexpected-key-module-encoder-embedding-weight-in-state-dict/1686/2
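For anyone following that link: the fix discussed there is for checkpoints saved from an `nn.DataParallel`-wrapped model, which prefixes every key with `module.`; stripping the prefix lets a plain model load them. A minimal sketch of that idea (note it addresses *renamed* keys, whereas the error in this issue is about keys that are *missing* entirely, so it may not apply directly here):

```python
# Sketch of the fix from the linked PyTorch thread: checkpoints saved via
# nn.DataParallel prefix every key with "module."; strip it so the keys
# match a plain (unwrapped) model's state_dict.
def strip_module_prefix(state_dict):
    prefix = "module."
    return {
        (k[len(prefix):] if k.startswith(prefix) else k): v
        for k, v in state_dict.items()
    }

# Illustrative state_dict with dummy values in place of tensors.
sd = {"module.encoder.embedding.weight": 0, "lm_head.weight": 1}
print(sorted(strip_module_prefix(sd)))
# → ['encoder.embedding.weight', 'lm_head.weight']
```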

karnival800 avatar Jul 24 '23 18:07 karnival800

@gandolfi974 did you ever fix this?

karnival800 avatar Jul 25 '23 14:07 karnival800

> @gandolfi974 did you ever fix this?

No, I use oobabooga/text-generation-webui instead.

gandolfi974 avatar Jul 25 '23 16:07 gandolfi974

As the OP has moved on, I will close this one. If anyone feels this is not a good solution, please reopen.

surak avatar Oct 23 '23 08:10 surak