text-generation-webui
LLaMATokenizer does not exist or is not currently imported - LLaMA 4-bit
Hello, I followed all the instructions in the wiki for installing LLaMA 4-bit mode, but I'm getting this error:
Loading llama-7b-hf...
Loading model ...
Done.
Traceback (most recent call last):
  File "/home/user/Desktop/oobabooga/text-generation-webui/server.py", line 194, in <module>
I put the llama-7b-4bit.pt in the models folder next to the llama-7b-hf folder.
Please let me know any other info you need.
I am on Linux and I have a GTX 1660 Ti.
Install the requirements from the Facebook package:
# If you do not do this, you will get runtime errors about the LLaMATokenizer not being registered.
git clone https://github.com/facebookresearch/llama
pip install -r ./llama/requirements.txt
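As a quick sanity check (assuming you run it in the same environment), you can confirm that sentencepiece, which those requirements pull in and the HF tokenizer depends on, imports cleanly:
# sentencepiece comes from llama/requirements.txt and is needed by the tokenizer
python -c "import sentencepiece; print(sentencepiece.__version__)"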
Try uninstalling your existing transformers and installing again and see if that solves it:
pip uninstall transformers
pip install git+https://github.com/zphang/transformers@llama_push
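After the reinstall, a quick way to see which tokenizer class your transformers build actually exposes (the fork and upstream spell it differently), assuming the textgen environment is active:
# prints the version and whether each spelling of the class is present
python -c "import transformers; print(transformers.__version__, hasattr(transformers, 'LLaMATokenizer'), hasattr(transformers, 'LlamaTokenizer'))"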
@jimtendo we use the Hugging Face adaptation of the LLaMA model, which is independent of the original Facebook implementation.
@oobabooga unfortunately it still doesn't work, I'm getting the same traceback.
Try uninstalling your existing transformers and installing again and see if that solves it:
pip uninstall transformers
pip install git+https://github.com/zphang/transformers@llama_push
It seems to be fixed after rerunning pip install -r requirements.txt after the latest commit.
I just put it in my start-webui.sh file temporarily to handle all the conda pathing. This is what mine looked like:
# point PATH at the one-click installer's bundled environment
INSTALL_ENV_DIR="$(pwd)/installer_files/env"
export PATH="$INSTALL_ENV_DIR/bin:$PATH"
CONDA_BASEPATH=$(conda info --base)
source "$CONDA_BASEPATH/etc/profile.d/conda.sh" # otherwise conda complains about 'shell not initialized' (needed when running in a script)
conda activate
cd text-generation-webui
pip install -r "requirements.txt"
#python server.py --load-in-4bit --model llama-7b-hf
And then once it was done, I deleted the pip install -r "requirements.txt" line, uncommented the python server.py --load-in-4bit --model llama-7b-hf line, and now it's working perfectly!
I followed this guide: https://github.com/underlines/awesome-marketing-datascience/blob/master/llama.md#windows-11-native
Unfortunately, I still get this very error, even after redoing this three times now.
TL;DR:
- Install Conda, create environment textgen
- Install all the basic packages into the env
- Install facebookresearch/llama/requirements.txt
- Install GPTQ-for-LLaMa via the provided .whl
- Un- and reinstall transformers via zphang/transformers@llama_push
And this is what I get:
(textgen) PS C:\tools\text-generation-webui> python server.py --model llama-13b-hf --load-in-4bit
Warning: --load-in-4bit is deprecated and will be removed. Use --gptq-bits 4 instead.
Loading llama-13b-hf...
CUDA extension not installed.
Loading model ...
Done.
Traceback (most recent call last):
File "C:\tools\text-generation-webui\server.py", line 236, in <module> shared.model, shared.tokenizer = load_model(shared.model_name) File "C:\tools\text-generation-webui\modules\models.py", line 163, in load_model tokenizer = AutoTokenizer.from_pretrained(Path(f"models/{shared.model_name}/")) File "C:\Users\Ingwie Phoenix\miniconda3\envs\textgen\lib\site-packages\transformers\models\auto\tokenization_auto.py", line 676, in from_pretrained
ValueError: Tokenizer class LLaMATokenizer does not exist or is not currently imported.
Any other idea how I can fix this? Thanks!
(EDIT: Sorry for the morbid formatting, miniconda's chosen TTY emulator sucks x.x)
@IngwiePhoenix you seem to have two problems there: the missing CUDA extension and the LLaMATokenizer error.
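For the CUDA part, this is the usual fix, sketched under the assumption that GPTQ-for-LLaMa is checked out under repositories/ as the wiki describes:
# rebuild the GPTQ CUDA kernel so "CUDA extension not installed" goes away
cd repositories/GPTQ-for-LLaMa
python setup_cuda.py install
# then relaunch with the non-deprecated flag
cd ../..
python server.py --model llama-13b-hf --gptq-bits 4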
I found this guide to be really helpful: https://www.reddit.com/r/LocalLLaMA/comments/11o6o3f/how_to_install_llama_8bit_and_4bit/?utm_source=share&utm_medium=web2x&context=3
Including the comment straight after it concerning the LLaMATokenizer, which I needed to apply. This helped me get things working with a 4-bit model.
@IngwiePhoenix @codermrrob @Programer2947693
try modifying models/llama-7b-hf/tokenizer_config.json:
LLaMATokenizer -> LlamaTokenizer
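For reference, a one-liner that makes the same edit, assuming the model folder is models/llama-7b-hf (adjust the path for 13B etc.); the -i.bak flag keeps a backup of the original file:
# swap the tokenizer class name in place, keeping a .bak copy
sed -i.bak 's/LLaMATokenizer/LlamaTokenizer/' models/llama-7b-hf/tokenizer_config.json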