qlora
RecursionError: maximum recursion depth exceeded
I am getting a maximum recursion depth error after running the following command: python qlora.py --model_name_or_path decapoda-research/llama-7b-hf
And this is the error I got:
File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_fast.py", line 257, in _convert_token_to_id_with_added_voc
    return self.unk_token_id
File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1142, in unk_token_id
    return self.convert_tokens_to_ids(self.unk_token)
File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_fast.py", line 250, in convert_tokens_to_ids
    return self._convert_token_to_id_with_added_voc(tokens)
File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_fast.py", line 257, in _convert_token_to_id_with_added_voc
    return self.unk_token_id
File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1142, in unk_token_id
    return self.convert_tokens_to_ids(self.unk_token)
File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_fast.py", line 250, in convert_tokens_to_ids
    return self._convert_token_to_id_with_added_voc(tokens)
File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_fast.py", line 257, in _convert_token_to_id_with_added_voc
    return self.unk_token_id
File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1142, in unk_token_id
    return self.convert_tokens_to_ids(self.unk_token)
RecursionError: maximum recursion depth exceeded
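The traceback is a two-method cycle: unk_token_id resolves the unk token through convert_tokens_to_ids, which, for a token missing from the vocab, falls back to unk_token_id again. A minimal sketch of that cycle (a hypothetical stand-in class, not the real transformers implementation), including how registering "<unk>" in the vocab breaks it:

```python
# Hypothetical sketch of the cycle seen in the traceback above; the class
# and vocab are stand-ins, not the real transformers code.
class SketchTokenizer:
    def __init__(self, vocab, unk_token="<unk>"):
        self.vocab = vocab          # token -> id mapping
        self.unk_token = unk_token  # fallback token for unknown inputs

    @property
    def unk_token_id(self):
        # Mirrors unk_token_id: resolve the unk token through the vocab ...
        return self.convert_tokens_to_ids(self.unk_token)

    def convert_tokens_to_ids(self, token):
        if token in self.vocab:
            return self.vocab[token]
        # ... but an unknown token falls back to unk_token_id. If "<unk>"
        # itself is missing from the vocab (the old tokenizer's bug), the
        # two calls recurse into each other until the limit is hit.
        return self.unk_token_id


broken = SketchTokenizer({"hello": 0})             # no "<unk>" in the vocab
try:
    broken.convert_tokens_to_ids("missing")
except RecursionError as e:
    print(type(e).__name__)                        # RecursionError

fixed = SketchTokenizer({"hello": 0, "<unk>": 1})  # "<unk>" registered
print(fixed.convert_tokens_to_ids("missing"))      # 1
```

This is also why the workarounds below help: they ensure "<unk>" actually resolves to an id instead of looping back on itself.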
I was originally getting an OverflowError; following this PR resolved that, but now I get this error instead.
I see the same thing. I changed LlamaTokenizerFast back to LlamaTokenizer, but now I have another issue: it dumps core while cleaning something up.
Seems to be caused by the old tokenizer bundled with decapoda-research/llama-7b-hf; see details here: https://github.com/huggingface/transformers/issues/22762. I was able to resolve the issue by switching to huggyllama/llama-7b, which ships the newer, correct tokenizer.
Forcing the unk_token fixed this for me (transformers v4.30.1):
tokenizer = tokenizer_class.from_pretrained(model_name_or_path, unk_token="<unk>")
Passing use_fast=False can also solve this issue:
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)
Best regards,
Shuyue July 6th, 2024