simple-llm-finetuner `LLaMATokenizer` vs `LlamaTokenizer` class names


Open vadi2 opened this issue 1 year ago • 5 comments

Running inference gives the following warning:

Loading tokenizer...
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'LLaMATokenizer'. 
The class this function is called from is 'LlamaTokenizer'.

Is it a problem?

Mar 25 '23 11:03 vadi2

No, this happens because the decapoda-research version of the LLaMA model on Hugging Face used a pre-merge version of transformers, where the class name was capitalized differently. That's why you can't use AutoModel with llama.

Mar 31 '23 03:03 lxe
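To illustrate the explanation above, here is a minimal sketch of the kind of name comparison that produces the warning. It is not the transformers implementation: the helper `class_name_mismatch` is hypothetical, and it assumes the checkpoint records its tokenizer class under the `tokenizer_class` key of `tokenizer_config.json`.

```python
import json
import tempfile
from pathlib import Path


def class_name_mismatch(config_path, calling_class):
    """Return the class name stored in the config if it differs from
    calling_class, otherwise None. Hypothetical helper for illustration."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    stored = config.get("tokenizer_class")
    if stored is not None and stored != calling_class:
        return stored
    return None


if __name__ == "__main__":
    # Build a throwaway config that mimics a checkpoint saved with the old name.
    with tempfile.TemporaryDirectory() as tmp:
        cfg = Path(tmp) / "tokenizer_config.json"
        cfg.write_text(json.dumps({"tokenizer_class": "LLaMATokenizer"}), encoding="utf-8")
        stored = class_name_mismatch(cfg, "LlamaTokenizer")
        if stored is not None:
            print(f"checkpoint says {stored!r}, caller is 'LlamaTokenizer'")
```

The comparison is a plain string check, so names that differ only in capitalization still count as a mismatch.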

Still happening, and it seems to lead to a crash :(

Apr 07 '23 10:04 Gitterman69

I wasn't getting a crash due to this; it was just a warning.

Apr 07 '23 10:04 vadi2

> I wasn't getting a crash due to this; it was just a warning.

I tried another llama7b_hf model, but the same thing happens. Unfortunately, it's impossible to find out why or how it crashed. Verbose output isn't possible, is it?

Apr 07 '23 10:04 Gitterman69

The solution is to replace `LLaMATokenizer` with `LlamaTokenizer` in the files under the directory ./cache/huggingface/hub and its subdirectories (only two files, if I remember correctly).

May 20 '23 18:05 mums77
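The manual edit described above could be scripted roughly as follows. This is a sketch under assumptions, not a confirmed fix: the comment does not name the two files, so the code assumes the class name lives in the `tokenizer_class` field of each `tokenizer_config.json`. The function `patch_tokenizer_configs` is a hypothetical helper, and the cache directory is left as a parameter for you to supply.

```python
import json
import tempfile
from pathlib import Path

OLD_NAME = "LLaMATokenizer"
NEW_NAME = "LlamaTokenizer"


def patch_tokenizer_configs(root):
    """Rewrite tokenizer_class in every tokenizer_config.json under root.

    Returns the list of files that were changed. Assumes the class name is
    stored under the "tokenizer_class" key (an assumption, see above)."""
    patched = []
    for path in Path(root).rglob("tokenizer_config.json"):
        config = json.loads(path.read_text(encoding="utf-8"))
        if config.get("tokenizer_class") == OLD_NAME:
            config["tokenizer_class"] = NEW_NAME
            path.write_text(json.dumps(config, indent=2), encoding="utf-8")
            patched.append(path)
    return patched


if __name__ == "__main__":
    # Demo on a temporary directory rather than a real cache.
    with tempfile.TemporaryDirectory() as tmp:
        cfg = Path(tmp) / "model" / "tokenizer_config.json"
        cfg.parent.mkdir(parents=True)
        cfg.write_text(json.dumps({"tokenizer_class": OLD_NAME}), encoding="utf-8")
        for changed in patch_tokenizer_configs(tmp):
            print("patched", changed)
```

Back up the cache before running something like this on real files, since the script overwrites them in place.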
