LayoutLM.from_pretrained doesn't load embeddings' weights when using safetensors

Open mszulc913 opened this issue 10 months ago • 5 comments

System Info

  • transformers version: 4.38.1
  • Platform: Linux-5.15.146.1-microsoft-standard-WSL2-x86_64-with-glibc2.31
  • Python version: 3.10.13
  • Huggingface_hub version: 0.20.3
  • Safetensors version: 0.4.2
  • Accelerate version: not installed
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.2.1+cu121 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed

Who can help?

No response

Information

  • [ ] The official example scripts
  • [ ] My own modified scripts

Tasks

  • [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • [ ] My own task or dataset (give details below)

Reproduction

Running the following:

from transformers import LayoutLMModel

model = LayoutLMModel.from_pretrained("microsoft/layoutlm-base-uncased", use_safetensors=True)

results in:

Some weights of LayoutLMModel were not initialized from the model checkpoint at microsoft/layoutlm-base-uncased and are newly initialized: ['layoutlm.embeddings.word_embeddings.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

Note that this is also the default behavior if a user has safetensors installed and doesn't provide use_safetensors.

The following works as expected (without safetensors):

from transformers import LayoutLMModel

model = LayoutLMModel.from_pretrained("microsoft/layoutlm-base-uncased", use_safetensors=False)

Expected behavior

Embeddings' weights should be correctly loaded.

mszulc913, Apr 08 '24 13:04

Hi @mszulc913, thanks for opening this issue!

I'm able to replicate the issue.

The model checkpoint didn't have safetensors weights associated with it; these were merged in with this commit.

However, the issue still persists :(

It seems like this is an issue when loading as safetensors on the fly.

If I instead load the model from the PyTorch .bin checkpoint and save it locally as safetensors, I'm able to load it without any issue:

from transformers import LayoutLMModel

model = LayoutLMModel.from_pretrained("microsoft/layoutlm-base-uncased", use_safetensors=False)

model.save_pretrained("test-layoutlm-base-uncased") # Saves out the model as safetensors

# Loads from safetensors automatically
model = LayoutLMModel.from_pretrained("test-layoutlm-base-uncased")

cc @LysandreJik @Narsil As you both probably have the best knowledge of this code
cc @Rocketknight1 as you've been looking into the safetensors conversion recently

amyeroberts, Apr 10 '24 14:04

In SFconvertbot's convert.py file, the weights are loaded with:

loaded = torch.load(pt_filename, map_location="cpu", weights_only=True)

This does not map the layers correctly (the keys in the weights dictionary are different), which is what causes the issue.
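One way to make a key mismatch like this visible is to compare the key names in the loaded dictionary against the names the model expects. The sketch below is illustrative only: `diff_keys` is a hypothetical helper, not part of transformers or of convert.py, and the sample key lists are placeholders (only the word-embeddings name is taken from the warning above).

```python
def diff_keys(checkpoint_keys, model_keys):
    """Return (missing, unexpected) as sorted lists.

    missing:    names the model expects but the checkpoint lacks
    unexpected: names in the checkpoint that the model does not use
    """
    checkpoint_keys = set(checkpoint_keys)
    model_keys = set(model_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Placeholder key lists; a real check would pass loaded.keys()
# and model.state_dict().keys() instead.
expected = [
    "layoutlm.embeddings.word_embeddings.weight",
    "layoutlm.embeddings.position_embeddings.weight",
]
found = [
    "layoutlm.embeddings.position_embeddings.weight",
]

missing, unexpected = diff_keys(found, expected)
print("missing from checkpoint:", missing)
print("unexpected in checkpoint:", unexpected)
```

An empty result for both lists would suggest the names line up; any entry under "missing" corresponds to a weight that from_pretrained would report as newly initialized.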

If we instead build the weights dictionary from the loaded model:

from transformers import LayoutLMModel

model = LayoutLMModel.from_pretrained("microsoft/layoutlm-base-uncased", use_safetensors=False)
# Prefix every parameter name with "layoutlm."
loaded = {f"layoutlm.{k}": v.data for k, v in model.named_parameters()}

then the weights are loaded with the correct mappings.
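The re-keying in the snippet above amounts to prepending a fixed prefix to every parameter name. Here is a minimal pure-Python sketch of that step, with strings standing in for tensors. `add_prefix` is a hypothetical helper, and skipping keys that already carry the prefix is an added assumption, not something the snippet above does.

```python
def add_prefix(state_dict, prefix):
    """Return a new dict whose keys all start with `prefix`.

    Keys that already start with `prefix` are left unchanged (an
    assumption of this sketch), so applying it twice is harmless.
    """
    return {
        (key if key.startswith(prefix) else prefix + key): value
        for key, value in state_dict.items()
    }


# Strings stand in for tensors; the names are illustrative.
params = {
    "embeddings.word_embeddings.weight": "tensor-a",
    "layoutlm.embeddings.position_embeddings.weight": "tensor-b",
}
remapped = add_prefix(params, "layoutlm.")
for name in sorted(remapped):
    print(name)
```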

This is either an issue with PyTorch's load() function or an implementation issue in SFconvertbot. If it needs to be fixed somewhere, I can take it up.

Also, I created a PR in microsoft/layoutlm-base-uncased with updated safetensors weights.

RVV-karma, Apr 13 '24 08:04

@RVV-karma Thanks for looking into this and for fixing the weights upstream ❤️

@Rocketknight1 has been working with safetensors weight loading and the bot recently, so should be able to advise on the best way to address this for future models.

amyeroberts, Apr 16 '24 12:04

I actually haven't touched the bot, so I'm not sure how to push a fix to it! @Narsil do you know where it runs?

Rocketknight1, Apr 16 '24 17:04

The bot runs here @Rocketknight1 if you want to open a PR: https://huggingface.co/spaces/safetensors/convert

The code is here: https://huggingface.co/spaces/safetensors/convert/blob/main/convert.py

LysandreJik, Jun 17 '24 08:06

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot], Oct 11 '24 08:10