
transformers error loading Llama

Open Fenfel opened this issue 1 year ago • 11 comments

Describe the bug

After installing the new transformers, the webui does not load models. Changing the tokenizer did not help.

Is there an existing issue for this?

  • [X] I have searched the existing issues

Reproduction

python server.py --auto-devices --model LLaMA-13B --gptq-bits 4 --notebook

Screenshot

No response

Logs

(textgen) C:\Users\anton\Desktop\SD\text-generation-webui>python server.py --auto-devices --model LLaMA-13B --gptq-bits 4 --notebook
Loading LLaMA-13B...
CUDA extension not installed.
Traceback (most recent call last):
  File "C:\Users\anton\Desktop\SD\text-generation-webui\server.py", line 215, in <module>
    shared.model, shared.tokenizer = load_model(shared.model_name)
  File "C:\Users\anton\Desktop\SD\text-generation-webui\modules\models.py", line 95, in load_model
    model = load_quantized(model_name)
  File "C:\Users\anton\Desktop\SD\text-generation-webui\modules\GPTQ_loader.py", line 55, in load_quantized
    model = load_quant(str(path_to_model), str(pt_path), shared.args.gptq_bits)
  File "C:\Users\anton\Desktop\SD\text-generation-webui\repositories\GPTQ-for-LLaMa\llama.py", line 220, in load_quant
    from transformers import LLaMAConfig, LLaMAForCausalLM
ImportError: cannot import name 'LLaMAConfig' from 'transformers' (C:\Users\anton\anaconda3\envs\textgen\lib\site-packages\transformers\__init__.py)

System Info

RTX 3070ti
Windows 10 home (not activated)
32GB Ram

Fenfel avatar Mar 17 '23 21:03 Fenfel
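[Editor's note: for context, shortly after LLaMA support landed in transformers, the class names were renamed from the LLaMA* casing (LLaMAConfig, LLaMAForCausalLM, LLaMATokenizer) to Llama* (LlamaConfig, LlamaForCausalLM, LlamaTokenizer), so older GPTQ-for-LLaMa code importing the old names fails against a newer transformers install. A quick diagnostic sketch, not part of the webui, to see which names your installed version actually exposes:

# Print the transformers version and check which LLaMA class-name casings
# it exposes; newer builds have the Llama* names, older ones LLaMA*.
import transformers

print(transformers.__version__)
for name in ("LLaMAConfig", "LlamaConfig", "LLaMAForCausalLM", "LlamaForCausalLM"):
    print(f"{name}: {hasattr(transformers, name)}")
]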

https://github.com/oobabooga/text-generation-webui/issues/322#issuecomment-1472624995

I had the same problem; you need to change the capitalization of the phrase "LLaMAConfig".

RandomInternetPreson avatar Mar 17 '23 21:03 RandomInternetPreson

#322 (comment)

I had the same problem; you need to change the capitalization of the phrase "LLaMAConfig".

You mean replace "LLaMAConfig" in C:\Users\anton\anaconda3\envs\textgen\lib\site-packages\transformers\__init__.py with "llamaConfig"?

Fenfel avatar Mar 17 '23 21:03 Fenfel

Nope, it's easier than that, go to your model folder where you have your llama model. Find tokenizer_config.json and change LLaMATokenizer to LlamaTokenizer
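[Editor's note: a minimal sketch of that edit as a script, assuming the model folder is models/LLaMA-13B and that tokenizer_config.json uses the standard "tokenizer_class" key (adjust the path to your setup):

# Rewrite the old "LLaMATokenizer" class name in tokenizer_config.json
# to the new "LlamaTokenizer" casing expected by recent transformers.
import json
from pathlib import Path

cfg_path = Path("models/LLaMA-13B/tokenizer_config.json")  # hypothetical path; adjust
cfg = json.loads(cfg_path.read_text(encoding="utf-8"))

if cfg.get("tokenizer_class") == "LLaMATokenizer":
    cfg["tokenizer_class"] = "LlamaTokenizer"
    cfg_path.write_text(json.dumps(cfg, indent=2), encoding="utf-8")
    print("Patched", cfg_path)
else:
    print("Nothing to change; tokenizer_class is:", cfg.get("tokenizer_class"))
]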

RandomInternetPreson avatar Mar 17 '23 21:03 RandomInternetPreson

Had the same problem. I solved it by pulling down the latest version of the GPTQ-for-LLaMa repo

cd repositories/GPTQ-for-LLaMa/
git pull

kinfolk0117 avatar Mar 17 '23 21:03 kinfolk0117

Had the same problem. I solved it by pulling down the latest version of the GPTQ-for-LLaMa repo

cd repositories/GPTQ-for-LLaMa/
git pull

Thank you, it's working, but only partially. As you can easily guess, I think it is related to the message "CUDA extension not installed."

(textgen) C:\Users\anton\Desktop\SD\text-generation-webui>python server.py --auto-devices --model LLaMA-13B --gptq-bits 4 --cai-chat
Loading LLaMA-13B...
CUDA extension not installed.
Loading model ...
Traceback (most recent call last):
  File "C:\Users\anton\Desktop\SD\text-generation-webui\server.py", line 215, in <module>
    shared.model, shared.tokenizer = load_model(shared.model_name)
  File "C:\Users\anton\Desktop\SD\text-generation-webui\modules\models.py", line 95, in load_model
    model = load_quantized(model_name)
  File "C:\Users\anton\Desktop\SD\text-generation-webui\modules\GPTQ_loader.py", line 55, in load_quantized
    model = load_quant(str(path_to_model), str(pt_path), shared.args.gptq_bits)
  File "C:\Users\anton\Desktop\SD\text-generation-webui\repositories\GPTQ-for-LLaMa\llama.py", line 245, in load_quant
    model.load_state_dict(torch.load(checkpoint))
  File "C:\Users\anton\anaconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 789, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "C:\Users\anton\anaconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 1131, in _load
    result = unpickler.load()
  File "C:\Users\anton\anaconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 1101, in persistent_load
    load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
  File "C:\Users\anton\anaconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 1083, in load_tensor
    wrap_storage=restore_location(storage, location),
  File "C:\Users\anton\anaconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 215, in default_restore_location
    result = fn(storage, location)
  File "C:\Users\anton\anaconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 182, in _cuda_deserialize
    device = validate_cuda_device(location)
  File "C:\Users\anton\anaconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 166, in validate_cuda_device
    raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.

I don't know what the problem is because conda tells me that everything is installed.

(textgen) C:\Users\anton\Desktop\SD\text-generation-webui>conda install cuda -c nvidia/label/cuda-11.3.0 -c nvidia/label/cuda-11.3.1
Collecting package metadata (current_repodata.json): done
Solving environment: done

All requested packages already installed.
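[Editor's note: the RuntimeError above means torch.cuda.is_available() returned False. Having the CUDA toolkit in the conda environment does not guarantee that PyTorch itself was installed as a CUDA build. A quick diagnostic sketch:

# If PyTorch was installed as a CPU-only build, torch.cuda.is_available()
# is False no matter what CUDA packages conda reports as installed.
import torch

print("torch version:", torch.__version__)        # CPU-only builds often carry a "+cpu" suffix
print("built against CUDA:", torch.version.cuda)  # None for CPU-only builds
print("CUDA available:", torch.cuda.is_available())
]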

Fenfel avatar Mar 17 '23 22:03 Fenfel

Nope, it's easier than that, go to your model folder where you have your llama model. Find tokenizer_config.json and change LLaMATokenizer to LlamaTokenizer

As I wrote at the beginning, I had already done that, and it was not the problem.

Fenfel avatar Mar 17 '23 22:03 Fenfel

Do you have multiple models? I'd double check that you did it in the right place, etc. It's unlikely there's another root cause for the issue. Worst case scenario, search the entire repo (using VS Code, Notepad++, w/e) for the string with the wrong capitalization from

ImportError: cannot import name 'LLaMAConfig' from 'transformers' (C:\Users\anton\anaconda3\envs\textgen\lib\site-packages\transformers\__init__.py)

to see where it's coming from.

VldmrB avatar Mar 17 '23 22:03 VldmrB

Do you have multiple models? I'd double check that you did it in the right place, etc. It's unlikely there's another root cause for the issue. Worst case scenario, search the entire repo (using VS Code, Notepad++, w/e) for the string with the wrong capitalization from

ImportError: cannot import name 'LLaMAConfig' from 'transformers' (C:\Users\anton\anaconda3\envs\textgen\lib\site-packages\transformers\__init__.py)

to see where it's coming from.

Fixed it by using

cd repositories/GPTQ-for-LLaMa/
git pull

but now the problem is different

conda install cuda -c nvidia/label/cuda-11.3.0 -c nvidia/label/cuda-11.3.1 tells me that everything is in place (look at the screenshot)

[screenshot of conda output]

Fenfel avatar Mar 17 '23 22:03 Fenfel

Not sure then. I suspect it might be the CUDA extension that you install here: https://github.com/oobabooga/text-generation-webui/pull/206

mkdir repositories
git clone https://github.com/qwopqwop200/GPTQ-for-LLaMa
cd GPTQ-for-LLaMa
python setup_cuda.py install

But I think I was able to do git pull on that yesterday, and not have it break anything... ¯\_(ツ)_/¯
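[Editor's note: for what it's worth, the "CUDA extension not installed." message is printed when GPTQ-for-LLaMa fails to import its compiled kernel. Assuming this version builds it as the quant_cuda module (what setup_cuda.py installs), this sketch reproduces the check and surfaces the underlying import error:

# Try importing the compiled GPTQ kernel directly to see the real
# import error hidden behind "CUDA extension not installed."
try:
    import quant_cuda  # the extension module built by setup_cuda.py (assumption)
    print("quant_cuda imported OK")
except ImportError as exc:
    print("CUDA extension missing:", exc)
]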

VldmrB avatar Mar 17 '23 23:03 VldmrB

idk how to fix that

Fenfel avatar Mar 17 '23 23:03 Fenfel

@Fenfel I'd delete your environment, files, and start over. I've been able to get everything working correctly on Windows. I put some instructions here which may help.

xNul avatar Mar 19 '23 06:03 xNul

@Fenfel I'd delete your environment, files, and start over. I've been able to get everything working correctly on Windows. I put some instructions here which may help.

Thanks, everything works now. An error appears at startup, but it does not affect the model, so it's okay.

Fenfel avatar Mar 19 '23 19:03 Fenfel

@Fenfel I'd delete your environment, files, and start over. I've been able to get everything working correctly on Windows. I put some instructions here which may help.

Thanks, everything works now. An error appears at startup, but it does not affect the model, so it's okay.

Oh, this error?

UserWarning: You have modified the pretrained model configuration to control generation. This is a deprecated strategy to control generation and will be removed soon, in a future version. Please use a generation configuration file (see https://huggingface.co/docs/transformers/main_classes/text_generation)

I get it too. I guess it will be fixed in a future update of the transformers library.
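[Editor's note: the warning points at the newer generation-config API in transformers. A minimal sketch of that approach; the parameter values here are placeholders, not the webui's actual defaults:

# Build a GenerationConfig and pass it to generate() instead of mutating
# the pretrained model config, which is what the warning complains about.
from transformers import GenerationConfig

gen_config = GenerationConfig(
    max_new_tokens=200,  # placeholder values for illustration
    temperature=0.7,
    do_sample=True,
)
# model.generate(input_ids, generation_config=gen_config)
]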

xNul avatar Mar 21 '23 13:03 xNul