save_state removes shared weights but load_state cannot load properly
System Info
accelerate version: 0.27.2
python: 3.11
Information
- [ ] The official example scripts
- [X] My own modified scripts
Tasks
- [ ] One of the scripts in the examples/ folder of Accelerate or an officially supported `no_trainer` script in the `examples` folder of the `transformers` repo (such as `run_no_trainer_glue.py`)
- [X] My own task or dataset (give details below)
Reproduction
I am saving the state_dict of a 'facebook/opt-125m' model. In this model, the weights are shared between the embedding tokens and the language modelling head. When I save the state dictionary of the model, I see this warning:
WARNING: Removed shared tensor {'pretrained_model.lm_head.weight'} while saving. This should be OK, but check by verifying that you don't receive any warning while reloading
The problem is that when I then call load_state on the same object, I get this error:
Missing key(s) in state_dict: "pretrained_model.lm_head.weight".
I do understand that the weights are removed because they are shared, but then how can I work with models that have shared weights?
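A minimal sketch of the flow described above; note the `Wrapper` module exposing the model as `pretrained_model` is an assumption inferred from the key names in the warning and error, and your actual setup may differ:

```python
import torch
from accelerate import Accelerator
from transformers import AutoModelForCausalLM


class Wrapper(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # facebook/opt-125m ties lm_head.weight to the input embedding matrix
        self.pretrained_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")


accelerator = Accelerator()
model = accelerator.prepare(Wrapper())

# save_state warns that the shared tensor
# 'pretrained_model.lm_head.weight' was removed while saving ...
accelerator.save_state("checkpoint")

# ... and load_state then raises
# "Missing key(s) in state_dict: 'pretrained_model.lm_head.weight'."
accelerator.load_state("checkpoint")
```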
Interestingly, the code was working with previous versions of the libraries. Unfortunately, I no longer have the old environment, so I can't tell you exactly where things break.
Thanks in advance.
Expected behavior
I expected save_state and load_state to be able to restore the original model, shared weights included. This does not work.
cc @SunMarc
Hi @MiladInk, thanks for the report. Could you share a minimal reproducer? When we load a model with shared weights, we make sure to tie the shared weights together.
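For context, a small sketch (independent of Accelerate) of what weight tying looks like on the transformers side: after tying, the LM head and the input embeddings are the same tensor, which is why only one copy ends up in the checkpoint and the other name has to be re-tied rather than loaded.

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
model.tie_weights()  # re-ties lm_head.weight to the input embedding matrix

# Both parameter names refer to the same underlying tensor, so serialization
# keeps only one of them; restoring the model needs the tie re-established.
assert model.lm_head.weight is model.get_input_embeddings().weight
```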
I am facing a similar issue when trying to save and load "google/gemma-2b".
Hi @raghavgarg97, could you share a minimal reproducer?
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.