
LLaMATokenizer does not exist or is not currently imported - LLaMA 4-bit

Open Programer2947693 opened this issue 2 years ago • 3 comments

Hello, I followed all the instructions in the wiki to install LLaMA 4-bit mode, but I'm getting this error:

Loading llama-7b-hf...
Loading model ...
Done.
Traceback (most recent call last):
  File "/home/user/Desktop/oobabooga/text-generation-webui/server.py", line 194, in <module>
    shared.model, shared.tokenizer = load_model(shared.model_name)
  File "/home/user/Desktop/oobabooga/text-generation-webui/modules/models.py", line 177, in load_model
    tokenizer = AutoTokenizer.from_pretrained(Path(f"models/{shared.model_name}/"))
  File "/home/user/Desktop/oobabooga/installer_files/env/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 676, in from_pretrained
    raise ValueError(
ValueError: Tokenizer class LLaMATokenizer does not exist or is not currently imported.

I put the llama-7b-4bit.pt in the models folder next to the llama-7b-hf folder.

Please let me know any other info you need.

I am on Linux and I have a GTX 1660 Ti.

Programer2947693 avatar Mar 11 '23 07:03 Programer2947693

Install the requirements from the Facebook Package:

# If you do not do this, you will get runtime errors about the LLaMATokenizer not being registered.
git clone https://github.com/facebookresearch/llama
pip install -r ./llama/requirements.txt

jimtendo avatar Mar 11 '23 11:03 jimtendo

Try uninstalling your existing transformers and installing again and see if that solves it:

pip uninstall transformers
pip install git+https://github.com/zphang/transformers@llama_push

@jimtendo we use the Hugging Face adaptation of the LLaMA model, which is independent of the original Facebook implementation.

oobabooga avatar Mar 11 '23 13:03 oobabooga

@oobabooga unfortunately it still doesn't work, I'm getting the same traceback.

Try uninstalling your existing transformers and installing again and see if that solves it:

pip uninstall transformers
pip install git+https://github.com/zphang/transformers@llama_push

Programer2947693 avatar Mar 11 '23 17:03 Programer2947693

It seems to be fixed after rerunning pip install -r requirements.txt following the latest commit.

I just put it in my start-webui.sh file temporarily to handle all the conda pathing. This is what mine looked like:

INSTALL_ENV_DIR="$(pwd)/installer_files/env"
export PATH="$INSTALL_ENV_DIR/bin:$PATH"
CONDA_BASEPATH=$(conda info --base)
source "$CONDA_BASEPATH/etc/profile.d/conda.sh" # otherwise conda complains about 'shell not initialized' (needed when running in a script)

conda activate 
cd text-generation-webui
pip install -r "requirements.txt"
#python server.py --load-in-4bit --model llama-7b-hf

And then once it was done, I deleted the pip install -r "requirements.txt" line, uncommented the python server.py --load-in-4bit --model llama-7b-hf line, and now it's working perfectly!

Programer2947693 avatar Mar 11 '23 19:03 Programer2947693

I followed this guide: https://github.com/underlines/awesome-marketing-datascience/blob/master/llama.md#windows-11-native

Unfortunately, I still get this very error, even after re-doing this thrice now.

TL;DR:

  • Install Conda, create environment textgen
  • Install all the basic packages into the env
  • Install facebookresearch/llama/requirements.txt
  • Install GPTQ-for-LLaMa via the provided .whl
  • Uninstall and reinstall transformers via zphang/transformers@llama_push
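
Roughly, what I ran looks like this (a sketch only; the Python version is an assumption and the GPTQ wheel path is a placeholder, not the exact commands from the guide):

# Create and activate the conda environment (Python version is an assumption)
conda create -n textgen python=3.10
conda activate textgen

# Basic packages for the webui
pip install -r requirements.txt

# Facebook's LLaMA requirements
git clone https://github.com/facebookresearch/llama
pip install -r ./llama/requirements.txt

# GPTQ-for-LLaMa from a prebuilt wheel (placeholder path; use the .whl the guide provides)
pip install path/to/gptq-for-llama.whl

# Swap the stock transformers for the LLaMA-enabled fork
pip uninstall transformers
pip install git+https://github.com/zphang/transformers@llama_push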

And this is what I get:

(textgen) PS C:\tools\text-generation-webui> python server.py --model llama-13b-hf --load-in-4bit
Warning: --load-in-4bit is deprecated and will be removed. Use --gptq-bits 4 instead.
Loading llama-13b-hf...
CUDA extension not installed.
Loading model ...
Done.
Traceback (most recent call last):
File "C:\tools\text-generation-webui\server.py", line 236, in <module>                                                                                                                                                                                         shared.model, shared.tokenizer = load_model(shared.model_name)                                                                                                                                                                                             File "C:\tools\text-generation-webui\modules\models.py", line 163, in load_model                                                                                                                                                                               tokenizer = AutoTokenizer.from_pretrained(Path(f"models/{shared.model_name}/"))                                                                                                                                                                            File "C:\Users\Ingwie Phoenix\miniconda3\envs\textgen\lib\site-packages\transformers\models\auto\tokenization_auto.py", line 676, in from_pretrained
ValueError: Tokenizer class LLaMATokenizer does not exist or is not currently imported.   

Any other ideas on how I can fix this? Thanks!

(EDIT: Sorry for the morbid formatting, miniconda's chosen TTY emulator sucks x.x)

IngwiePhoenix avatar Mar 18 '23 22:03 IngwiePhoenix

@IngwiePhoenix you seem to have two problems there: the missing CUDA extension and the LLaMATokenizer error.

I found this guide to be really helpful : https://www.reddit.com/r/LocalLLaMA/comments/11o6o3f/how_to_install_llama_8bit_and_4bit/?utm_source=share&utm_medium=web2x&context=3

There is a comment straight after it concerning LLaMATokenizer which I also needed to apply. This helped me get things working with a 4-bit model.

codermrrob avatar Mar 19 '23 10:03 codermrrob

@IngwiePhoenix @codermrrob @Programer2947693 try modifying models/llama-7b-hf/tokenizer_config.json:
LLaMATokenizer -> LlamaTokenizer
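
For example (a minimal sketch; the path assumes the llama-7b-hf folder, so adjust it to your model directory, and on Windows just edit the file by hand):

# Swap the old tokenizer class name for the one current transformers expects
sed -i 's/LLaMATokenizer/LlamaTokenizer/' models/llama-7b-hf/tokenizer_config.json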

devinzhang91 avatar Mar 20 '23 06:03 devinzhang91