ValueError: Tokenizer class LLaMATokenizer does not exist or is not currently imported.
I followed the instructions. Here is the output:
CUDA SETUP: CUDA runtime path found: /usr/local/cuda-11.8/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 5.2
CUDA SETUP: Detected CUDA version 118
/home/developer/mambaforge/envs/Guanaco/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: WARNING: Compute capability < 7.5 detected! Only slow 8-bit matmul is supported for your GPU!
warn(msg)
CUDA SETUP: Loading binary /home/developer/mambaforge/envs/Guanaco/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda118_nocublaslt.so...
loading base model decapoda-research/llama-7b-hf...
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████| 33/33 [00:51<00:00, 1.55s/it]
/home/developer/mambaforge/envs/Guanaco/lib/python3.10/site-packages/peft/utils/other.py:76: FutureWarning: prepare_model_for_int8_training is deprecated and will be removed in a future version. Use prepare_model_for_kbit_training instead.
warnings.warn(
adding LoRA modules...
trainable params: 159907840 || all params: 6898323456 || trainable%: 2.3180681656919973
loaded model
Traceback (most recent call last):
  File "/home/developer/qlora/qlora.py", line 763, in <module>
ValueError: Tokenizer class LLaMATokenizer does not exist or is not currently imported.
There seems to be a discrepancy in the spelling of LLaMATokenizer. The class that actually exists in transformers is LlamaTokenizer: the second letter is a lowercase l, not an uppercase L, and it ends in ma rather than MA. The decapoda-research checkpoint's tokenizer_config.json spells it LLaMATokenizer, which transformers cannot resolve.
So I had to go into ~/.cache/huggingface/hub/.... and fix the tokenizer_class name in tokenizer_config.json.
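For anyone who wants to script the same fix, here is a minimal sketch that patches the cached config in place. The cache path and glob pattern are assumptions based on the default Hugging Face hub layout; adjust them if your cache lives elsewhere.

```python
# Minimal sketch (assumed default HF cache layout): rewrite the
# tokenizer_class entry in every cached tokenizer_config.json for
# decapoda-research/llama-7b-hf so AutoTokenizer can resolve it.
import json
from pathlib import Path

cache = Path.home() / ".cache" / "huggingface" / "hub"
pattern = "models--decapoda-research--llama-7b-hf/**/tokenizer_config.json"
for config in cache.glob(pattern):
    data = json.loads(config.read_text())
    if data.get("tokenizer_class") == "LLaMATokenizer":
        data["tokenizer_class"] = "LlamaTokenizer"  # the class transformers ships
        config.write_text(json.dumps(data, indent=2))
        print(f"patched {config}")
```

If you are loading the tokenizer in your own code, you can also sidestep the AutoTokenizer lookup entirely by instantiating the class directly (this assumes a transformers version that ships LlamaTokenizer, i.e. 4.28 or later):

```python
# Workaround sketch: bypass AutoTokenizer's tokenizer_class lookup.
from transformers import LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("decapoda-research/llama-7b-hf")
```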