
[FIXED] Exception: data did not match any variant of untagged enum ModelWrapper at line 1251003 column 3

Open djannot opened this issue 1 year ago • 29 comments

I get this error:

Traceback (most recent call last):
  File "/home/denis/Documents/ai/unsloth/llama3-chat-template.py", line 20, in <module>
    model, tokenizer = FastLanguageModel.from_pretrained(
  File "/home/denis/miniconda3/envs/pytorch/lib/python3.10/site-packages/unsloth/models/loader.py", line 323, in from_pretrained
    model, tokenizer = dispatch_model.from_pretrained(
  File "/home/denis/miniconda3/envs/pytorch/lib/python3.10/site-packages/unsloth/models/llama.py", line 1610, in from_pretrained
    tokenizer = load_correct_tokenizer(
  File "/home/denis/miniconda3/envs/pytorch/lib/python3.10/site-packages/unsloth/tokenizer_utils.py", line 538, in load_correct_tokenizer
    tokenizer = _load_correct_tokenizer(
  File "/home/denis/miniconda3/envs/pytorch/lib/python3.10/site-packages/unsloth/tokenizer_utils.py", line 496, in _load_correct_tokenizer
    fast_tokenizer = AutoTokenizer.from_pretrained(
  File "/home/denis/miniconda3/envs/pytorch/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 897, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/home/denis/miniconda3/envs/pytorch/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2271, in from_pretrained
    return cls._from_pretrained(
  File "/home/denis/miniconda3/envs/pytorch/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2505, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/home/denis/miniconda3/envs/pytorch/lib/python3.10/site-packages/transformers/tokenization_utils_fast.py", line 115, in __init__
    fast_tokenizer = TokenizerFast.from_file(fast_tokenizer_file)
Exception: data did not match any variant of untagged enum ModelWrapper at line 1251003 column 3

It works with unsloth/Llama-3.2-1B-Instruct-bnb-4bit.
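For reference, the failing call at line 20 of llama3-chat-template.py is essentially the standard loader call - a minimal sketch below, where the model name is a placeholder since the failing repo isn't named here:

    from unsloth import FastLanguageModel

    # Minimal repro sketch. The model name is an assumed placeholder,
    # not necessarily the exact repo that triggers the error.
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name = "unsloth/Llama-3.2-3B-Instruct-bnb-4bit",
        max_seq_length = 2048,
        load_in_4bit = True,
    )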

djannot avatar Sep 25 '24 20:09 djannot

Same issue as mine.

KaiDF avatar Sep 26 '24 02:09 KaiDF

Oh, it's best to update transformers via pip install --upgrade "transformers>=4.45"

danielhanchen avatar Sep 26 '24 06:09 danielhanchen

Thanks @danielhanchen for the fast response (as usual).

I did try this, but I now get another error:

Traceback (most recent call last):
  File "/home/denis/Documents/ai/unsloth/llama3-chat-template.py", line 113, in <module>
    trainer_stats = trainer.train()
  File "<string>", line 145, in train
  File "<string>", line 358, in _fast_inner_training_loop
  File "/home/denis/miniconda3/envs/pytorch/lib/python3.10/site-packages/transformers/trainer.py", line 3477, in training_step
    self.optimizer.train()
  File "/home/denis/miniconda3/envs/pytorch/lib/python3.10/site-packages/accelerate/optimizer.py", line 128, in train
    return self.optimizer.train()
AttributeError: 'AdamW' object has no attribute 'train'

djannot avatar Sep 26 '24 06:09 djannot

Ok, that's a weird error - are you using the notebooks we provided without any changes? It's possible HuggingFace's new update might have broken some parts.

danielhanchen avatar Sep 26 '24 06:09 danielhanchen

Yes, but I've just tried creating a new conda env and in that case it works.

So there was probably something weird going on with the upgrades of the different packages. Though I still don't understand why it was working with the 1B model.
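For anyone else hitting this, the fresh-env steps were roughly the following (the env name is arbitrary, and Python 3.10 just matches the paths in my traceback):

    conda create -n unsloth-fresh python=3.10 -y
    conda activate unsloth-fresh
    pip install unsloth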

Anyway, you can close the issue. And thanks again for the replies.

djannot avatar Sep 26 '24 07:09 djannot

Yes, but I've just tried creating a new conda env and in that case it works.

This worked for me too 😸

sais-github avatar Sep 26 '24 08:09 sais-github

but when inference, it occurs that 'ValueError: Invalid cache_implementation (dynamic). Choose one of: ['static', 'offloaded_static', 'sliding_window', 'hybrid', 'mamba', 'quantized', 'static']'

KaiDF avatar Sep 26 '24 09:09 KaiDF

but when inference, it occurs that 'ValueError: Invalid cache_implementation (dynamic). Choose one of: ['static', 'offloaded_static', 'sliding_window', 'hybrid', 'mamba', 'quantized', 'static']'

This error was fixed by upgrading unsloth to version 2024.9.post3 and transformers to version 4.45.0.
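Roughly the commands, with the version pins from above:

    pip install --upgrade "unsloth==2024.9.post3" "transformers==4.45.0"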

KaiDF avatar Sep 26 '24 10:09 KaiDF

Thanks @danielhanchen for the fast response (as usual).

I did try this, but I now get another error:

Traceback (most recent call last):
  File "/home/denis/Documents/ai/unsloth/llama3-chat-template.py", line 113, in <module>
    trainer_stats = trainer.train()
  File "<string>", line 145, in train
  File "<string>", line 358, in _fast_inner_training_loop
  File "/home/denis/miniconda3/envs/pytorch/lib/python3.10/site-packages/transformers/trainer.py", line 3477, in training_step
    self.optimizer.train()
  File "/home/denis/miniconda3/envs/pytorch/lib/python3.10/site-packages/accelerate/optimizer.py", line 128, in train
    return self.optimizer.train()
AttributeError: 'AdamW' object has no attribute 'train'

Upgrading accelerate to version 0.34.0 resolves this issue.
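For example:

    pip install --upgrade "accelerate>=0.34.0"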

mf-skjung avatar Sep 26 '24 17:09 mf-skjung

I'm running into this same error when trying to quantize the trained models into GGUF format: Exception: data did not match any variant of untagged enum ModelWrapper at line 1251003 column 3

Edit: the tokenizer unsloth exports is broken.

sais-github avatar Sep 27 '24 09:09 sais-github

I am running into this same error as well when merging and exporting the 16-bit model and using it on vLLM. I have tried multiple models and the error is consistent - most definitely the tokenizer exporter is broken.

Edit: by using the latest version of the Docker image from vLLM (v0.6.2) it now works.

selectorseb avatar Sep 27 '24 17:09 selectorseb

I encountered the same issue as @selectorseb while deploying a finetuned Llama-3.2 model using vLLM with Docker. Initially, I faced the same problem mentioned in the original post by @djannot, but after updating the vLLM Docker image, the issue was resolved.

lianghsun avatar Sep 30 '24 00:09 lianghsun

@KaiDF Apologies, I forgot to mention that you need to update Unsloth!! Glad it works now! Sorry about the issue!

@mf-skjung I'll actually edit pyproject.toml to log this - thanks!

On the rest of the issues - so the solution seems to be updating to vllm>=0.6.2? I.e. pip install --upgrade "vllm>=0.6.2"

danielhanchen avatar Sep 30 '24 09:09 danielhanchen

I am running a notebook on Google Colab and still have this issue. I am trying to read a checkpoint from a Llama model fine-tuned with LoRA. Yesterday it worked fine, but today that changed.

[screenshot]

If I update to transformers 4.45 I receive another error (invalid repository id).

riddle-today avatar Oct 01 '24 16:10 riddle-today

@riddle-today Apologies - can you screenshot the error? The picture you provided is just a warning - you can ignore that!

danielhanchen avatar Oct 02 '24 01:10 danielhanchen

[screenshot of the error] It is the same error as @djannot's. The picture before was to show the versions of transformers, unsloth, and xformers I am using. Thank you so much for the prompt answer, @danielhanchen.

riddle-today avatar Oct 02 '24 06:10 riddle-today

If I go and download the tokenizer files from the HuggingFace repository and replace them, it works.

riddle-today avatar Oct 02 '24 13:10 riddle-today

Updating tokenizers to the latest 0.20.0 might help.

teamclouday avatar Oct 03 '24 00:10 teamclouday

@teamclouday Oh wait, try not to update it to 0.20!! Transformers will error out!!

@riddle-today Oh yep, apologies - I forgot to mention you have to override the tokenizer with the latest one I uploaded!

danielhanchen avatar Oct 03 '24 08:10 danielhanchen

If I go and download the tokenizer files from the HuggingFace repository and replace them, it works.

This resolves Exception: data did not match any variant of untagged enum ModelWrapper ... for me, too! It seems like some saving error?

tongyx361 avatar Oct 10 '24 14:10 tongyx361

@tongyx361 Apologies on the delay - yes, the new transformers update broke saving, so you need to overwrite the old tokenizer files by redownloading them.

danielhanchen avatar Oct 23 '24 19:10 danielhanchen

Can somebody list the steps to override the tokenizer file? I am new to this. Need help!

srsugandh avatar Oct 24 '24 16:10 srsugandh

Can somebody list down the steps to override the tokenizer file. I am new to this. Need Help!

From my understanding, it is:

  1. download the tokenizer files from the original repo
  2. replace/upload them to yours (see the sketch below)

But I'm still stuck on another issue after that, so I can't confirm.
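A minimal Python sketch of those two steps using huggingface_hub - both the repo id and the export directory below are placeholder assumptions, so point them at your model's original base repo and your local export:

    from huggingface_hub import hf_hub_download
    import shutil

    BASE_REPO  = "unsloth/Llama-3.2-1B-Instruct"  # assumption: your model's original base repo
    EXPORT_DIR = "./my_finetuned_model"           # assumption: directory of your saved/merged model

    # Step 1: download the known-good tokenizer files from the original repo.
    # Step 2: overwrite the exported (broken) copies with them.
    for fname in ("tokenizer.json", "tokenizer_config.json", "special_tokens_map.json"):
        path = hf_hub_download(repo_id=BASE_REPO, filename=fname)
        shutil.copy(path, f"{EXPORT_DIR}/{fname}")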

katopz avatar Oct 25 '24 02:10 katopz

I am still facing this issue. I have the latest "2024.10.7" version, but unsloth requires transformers < 4.45, and it is not working when I use transformers < 4.45 - I get the same error.

srsugandh avatar Oct 25 '24 11:10 srsugandh

@katopz @srsugandh Can you guys ask this on our Discord - probably a better place to get this resolved

danielhanchen avatar Oct 27 '24 09:10 danielhanchen

I am still facing this issue. I have the latest "2024.10.7" version, but unsloth requires transformers < 4.45, and it is not working when I use transformers < 4.45 - I get the same error.

@katopz @srsugandh Can you guys ask this on our Discord - probably a better place to get this resolved

@katopz @danielhanchen @srsugandh - same problem here. Unsloth requires transformers < 4.45, but that doesn't work. So should we manually install a higher version of transformers to fix this issue?

ai-nikolai avatar Nov 05 '24 01:11 ai-nikolai

Notebook with a working version:

@danielhanchen @katopz - here is a notebook for "offline" installation on Kaggle: (https://www.kaggle.com/code/kolyan1/offline-unsloth-package-installation-pt-2-working)

Generally, one workaround is as follows:

  1. Install unsloth==2024.10.4 and torch==2.4.1:

     pip3 install unsloth==2024.10.4 torch==2.4.1

  2. Install transformers==4.45.2 (this will throw dependency errors, as transformers has to be <4.45 for unsloth, but it still installs successfully and works):

     pip3 install transformers==4.45.2

ai-nikolai avatar Nov 05 '24 02:11 ai-nikolai

I am still facing this issue. I have the latest "2024.10.7" version, but unsloth requires transformers < 4.45, and it is not working when I use transformers < 4.45 - I get the same error.

@katopz @srsugandh Can you guys ask this on our Discord - probably a better place to get this resolved

@katopz @danielhanchen @srsugandh - same problem here. Unsloth requires transformers < 4.45, but that doesn't work. So should we manually install a higher version of transformers to fix this issue?

I found a workaround. I did a pip install to get the latest version of unsloth, then uninstalled it, and then used the GitHub commit to install unsloth (pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"). This is because just using the commit does not install the related libraries. Then I installed transformers version 4.45.1 and it worked.
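Spelled out, that is approximately:

    pip install unsloth                       # first install pulls in the related libraries
    pip uninstall -y unsloth                  # remove just the package, keep its dependencies
    pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
    pip install "transformers==4.45.1"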

srsugandh avatar Nov 06 '24 05:11 srsugandh

This is broken for me. You can try it with:

    python3 unsloth-cli-v2.py \
      --model_name "unsloth/Llama-3.3-70B-Instruct-bnb-4bit" \
      --load_in_4bit \
      --dataset "./qa_Wodehouse_unsloth_conversion.jsonl" \
      --output_dir "./wodehouse_finetune_output" \
      --per_device_train_batch_size 2 \
      --gradient_accumulation_steps 4 \
      --learning_rate 2e-4 \
      --max_steps 400 \
      --save_model \
      --save_gguf \
      --quantization "q4_k_m"

It will fail with:

Traceback (most recent call last):
  File "/workspace/unsloth-cli-v2.py", line 235, in <module>
    run(args)
  File "/workspace/unsloth-cli-v2.py", line 48, in run
    model, tokenizer = FastLanguageModel.from_pretrained(
  File "/usr/local/lib/python3.11/dist-packages/unsloth/models/loader.py", line 301, in from_pretrained
    model, tokenizer = dispatch_model.from_pretrained(
  File "/usr/local/lib/python3.11/dist-packages/unsloth/models/llama.py", line 1598, in from_pretrained
    tokenizer = load_correct_tokenizer(
  File "/usr/local/lib/python3.11/dist-packages/unsloth/tokenizer_utils.py", line 538, in load_correct_tokenizer
    tokenizer = _load_correct_tokenizer(
  File "/usr/local/lib/python3.11/dist-packages/unsloth/tokenizer_utils.py", line 496, in _load_correct_tokenizer
    fast_tokenizer = AutoTokenizer.from_pretrained(
  File "/usr/local/lib/python3.11/dist-packages/transformers/models/auto/tokenization_auto.py", line 897, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/transformers/tokenization_utils_base.py", line 2271, in from_pretrained
    return cls._from_pretrained(
  File "/usr/local/lib/python3.11/dist-packages/transformers/tokenization_utils_base.py", line 2505, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/usr/local/lib/python3.11/dist-packages/transformers/tokenization_utils_fast.py", line 115, in __init__
    fast_tokenizer = TokenizerFast.from_file(fast_tokenizer_file)

IridiumMaster avatar Feb 15 '25 20:02 IridiumMaster

@danielhanchen Hey guys, was this resolved in the end?

I am curious what the currently recommended versions are for installation, specifically for:

pip3 freeze | grep unsloth
pip3 freeze | grep torch
pip3 freeze | grep transformers
pip3 freeze | grep bitsandbytes
pip3 freeze | grep accelerate
pip3 freeze | grep trl
pip3 freeze | grep tokenizers
pip3 freeze | grep datasets
pip3 freeze | grep pandas
pip3 freeze | grep numpy
pip3 freeze | grep vllm

Can somebody post the above for a working version?
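(Or equivalently, as a single command:)

    pip3 freeze | grep -E "unsloth|torch|transformers|bitsandbytes|accelerate|trl|tokenizers|datasets|pandas|numpy|vllm"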

ai-nikolai avatar Mar 11 '25 05:03 ai-nikolai