StableDiffusionPipeline.from_single_file() with CKPT fails: "UnpicklingError: Weights only load failed"

Open whydna opened this issue 1 year ago • 12 comments

Describe the bug

Trying to load local CKPT file using the "from_single_file()" method fails. Works fine with .safetensors file from same repo (Runway ML SD).

Reproduction

from diffusers import StableDiffusionPipeline
device = 'cpu'
pipe = StableDiffusionPipeline.from_single_file("models/checkpoints/v1-5-pruned-emaonly.ckpt")
pipe = pipe.to(device)

prompt = "a photo of an astronaut riding a horse on mars"
image = pipe(prompt).images[0] 

Logs

UnpicklingError                           Traceback (most recent call last)
File ~/Development/ai-flow-backend/venv/lib/python3.12/site-packages/diffusers/models/model_loading_utils.py:108, in load_state_dict(checkpoint_file, variant)
    107         weights_only_kwarg = {"weights_only": True} if is_torch_version(">=", "1.13") else {}
--> 108         return torch.load(
    109             checkpoint_file,
    110             map_location="cpu",
    111             **weights_only_kwarg,
    112         )
    113 except Exception as e:

File ~/Development/ai-flow-backend/venv/lib/python3.12/site-packages/torch/serialization.py:1024, in load(f, map_location, pickle_module, weights_only, mmap, **pickle_load_args)
   1023     except RuntimeError as e:
-> 1024         raise pickle.UnpicklingError(UNSAFE_MESSAGE + str(e)) from None
   1025 return _load(opened_zipfile,
   1026              map_location,
   1027              pickle_module,
   1028              overall_storage=overall_storage,
   1029              **pickle_load_args)

UnpicklingError: Weights only load failed. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution.Do it only if you get the file from a trusted source. WeightsUnpickler error: Unsupported class pytorch_lightning.callbacks.model_checkpoint.ModelCheckpoint

During handling of the above exception, another exception occurred:

UnicodeDecodeError                        Traceback (most recent call last)
...
--> 128     raise OSError(
    129         f"Unable to load weights from checkpoint file for '{checkpoint_file}' " f"at '{checkpoint_file}'. "
    130     )

OSError: Unable to load weights from checkpoint file for 'models/checkpoints/v1-5-pruned-emaonly.ckpt' at 'models/checkpoints/v1-5-pruned-emaonly.ckpt'.

System Info

  • 🤗 Diffusers version: 0.29.2
  • Platform: macOS-14.4.1-arm64-arm-64bit
  • Running on a notebook?: No
  • Running on Google Colab?: No
  • Python version: 3.12.4
  • PyTorch version (GPU?): 2.3.1 (False)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Huggingface_hub version: 0.23.4
  • Transformers version: 4.42.4
  • Accelerate version: 0.32.1
  • PEFT version: not installed
  • Bitsandbytes version: not installed
  • Safetensors version: 0.4.3
  • xFormers version: not installed
  • Accelerator: Apple M1 Pro
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Who can help?

@yiyixuxu @asomoza @DN

whydna avatar Jul 16 '24 16:07 whydna

The error message says you can change weights_only to False at diffusers/models/model_loading_utils.py:107, but be careful: because this is not a .safetensors file, loading it might run malicious code. If the .safetensors equivalent works fine, I would prefer it. See the safetensors documentation for more info.

tolgacangoz avatar Jul 16 '24 16:07 tolgacangoz

It doesn't look like there is a way to set that from the pipeline class.

whydna avatar Jul 16 '24 17:07 whydna

It's so rare nowadays for someone to load a ckpt on purpose that probably no one noticed this.

This is a safety feature in the code because a generative model shouldn't need to execute arbitrary code with diffusers:

https://github.com/huggingface/diffusers/blob/3b37fefee99425286984a9d5fa4f1850064d01eb/src/diffusers/models/model_loading_utils.py#L107

If you still want to load it, you can change it to False in the code but we probably won't do it as a library because of the risk. You can also downgrade your torch version.

I did it as a test and now it's asking me to install PyTorch Lightning, which is weird. I've never loaded a ckpt with diffusers before, so I don't know whether that was a requirement or not.

cc: @DN6 for awareness
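To illustrate why a restricted unpickler rejects such a checkpoint, here is a minimal, stdlib-only sketch of the general allowlist idea. It is not PyTorch's actual weights_only implementation. `AllowlistUnpickler` and `restricted_loads` are hypothetical names, and `types.SimpleNamespace` merely stands in for a non-weight object such as a training callback.

```python
import io
import pickle
import types
from collections import OrderedDict


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that only resolves classes from an explicit allowlist."""

    # Hypothetical allowlist for this sketch; a real loader would allow tensor types.
    ALLOWED = {("collections", "OrderedDict")}

    def find_class(self, module, name):
        # Called whenever the pickle stream references a class or function by name.
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"Unsupported class {module}.{name}")


def restricted_loads(data):
    """Unpickle `data`, refusing anything outside the allowlist."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


# A payload that contains only allowlisted containers and plain values.
plain = pickle.dumps(OrderedDict(weight=[0.1, 0.2]))

# A payload that also carries an arbitrary object next to the weights.
callback = types.SimpleNamespace(monitor="val_loss")
with_callback = pickle.dumps(OrderedDict(weight=[0.1, 0.2], callbacks=callback))

print(restricted_loads(plain))
try:
    restricted_loads(with_callback)
except pickle.UnpicklingError as err:
    print(f"rejected: {err}")
```

The second payload is refused for the same kind of reason as the checkpoint in this issue: the stream references a class that the restricted loader does not allow.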

asomoza avatar Jul 16 '24 18:07 asomoza

I think the bug is the fact that it's not able to load the weights using weights_only=True from the .ckpt checkpoint (https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/main/v1-5-pruned.ckpt).

If weights_only is set to False then it will try to load the model modules as well, which depending on the implementation may require pytorch lightning.

The expected behavior for the ckpt should just be the same as loading the .safetensors - which seems to be working fine.

whydna avatar Jul 16 '24 18:07 whydna

If you consider that a bug, then you should probably report it in the model repo. It was probably saved with some unusual call to PyTorch Lightning during training.

Regarding "which depending on the implementation may require pytorch lightning":

The safetensors alternative works because it doesn't store any library requirements or implementation details, and you don't need to install PyTorch Lightning to load any Stable Diffusion checkpoint.

But yeah, you're right. You can easily reproduce this error with this code:

import torch

state_dict = torch.load("models/checkpoints/v1-5-pruned-emaonly.ckpt", weights_only=True)

but this is not a diffusers issue then.
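To find out which serialized object blocks a weights-only load, a small probe along these lines may help. This is a sketch under assumptions: `blocked_class` and `probe_checkpoint` are hypothetical helper names, and the "Unsupported class ..." message format is taken from the log in this issue (PyTorch 2.3.1), so other versions may word it differently.

```python
import pickle
import re


def blocked_class(message):
    """Extract the class name from an 'Unsupported class ...' error message."""
    match = re.search(r"Unsupported class (\S+)", message)
    return match.group(1) if match else None


def probe_checkpoint(path):
    """Try a weights-only load and return the blocking class name, if any."""
    import torch  # imported lazily; only needed when actually probing a file

    try:
        torch.load(path, map_location="cpu", weights_only=True)
    except pickle.UnpicklingError as err:
        # Message format assumed from the traceback in this thread.
        return blocked_class(str(err))
    return None
```

For example, `probe_checkpoint("models/checkpoints/v1-5-pruned-emaonly.ckpt")` should return the name of the offending class when the load is rejected, and None when the file loads cleanly.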

asomoza avatar Jul 16 '24 18:07 asomoza

It's a change they made. In 0.27.2 this worked:

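from os import path
import torch
from diffusers import AutoencoderKL

# `pipe` is an already-constructed pipeline
pipe.vae = AutoencoderKL.from_single_file(
    path.join("./assets/vae/orangemix.vae.pt"),
    local_files_only=True,
    torch_dtype=torch.bfloat16
)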

But now in 0.28.0+ it is broken. Can we stop making changes that completely break functionality without noting it? @sayakpaul many of us still use pickle files.

Also, how can I make sure all weights are loaded? I'm now receiving the message "Some weights of the model checkpoint were not used when initializing CLIPTextModel". It's annoying.

JemiloII avatar Jul 18 '24 20:07 JemiloII

We started adding weights_only=True out of safety concerns: https://github.com/huggingface/diffusers/pull/7393

I wonder whether we should allow the user to explicitly set weights_only=False when they need it. cc @DN6

yiyixuxu avatar Jul 19 '24 00:07 yiyixuxu

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Sep 14 '24 15:09 github-actions[bot]

So will there be any action on this?

JemiloII avatar Sep 23 '24 18:09 JemiloII

Yes, there will be further consideration once Dhruv is back from leave.

sayakpaul avatar Sep 24 '24 04:09 sayakpaul

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Oct 18 '24 15:10 github-actions[bot]

Cc: @DN6

sayakpaul avatar Oct 18 '24 15:10 sayakpaul

@DN6 Have you had a chance to take a look at this?

JemiloII avatar Oct 29 '24 08:10 JemiloII

@JemiloII the core issue is similar to what I was talking about here: https://github.com/huggingface/diffusers/issues/9154.

There are non-weight serialised objects in the checkpoint file that we don't allow loading via torch.load. It's a security hole to have this in the library. The recommended approach is to convert the ckpt file to safetensors, or if you must use a ckpt file format, to remove the objects that have been serialized in the file along with the weights.
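A rough sketch of the second option (keeping the ckpt format but dropping everything that is not a weight) could look like this. `select_weights` and `strip_checkpoint` are hypothetical names, and the assumption that the weights sit under a top-level "state_dict" key is not confirmed in this thread. Note that the unrestricted load can execute arbitrary code and, as reported above, may need PyTorch Lightning installed, so it should only be run on a trusted file.

```python
def select_weights(checkpoint, tensor_type):
    """Return only the entries whose values are instances of `tensor_type`."""
    # Assumption: training checkpoints often nest the weights under "state_dict".
    state_dict = checkpoint.get("state_dict", checkpoint)
    return {key: value for key, value in state_dict.items()
            if isinstance(value, tensor_type)}


def strip_checkpoint(src_path, dst_path):
    """Re-save a trusted checkpoint with the non-weight objects removed."""
    import torch  # imported lazily so select_weights stays dependency-free

    # weights_only=False runs the unrestricted unpickler: trusted files only.
    checkpoint = torch.load(src_path, map_location="cpu", weights_only=False)
    weights = select_weights(checkpoint, torch.Tensor)
    torch.save(weights, dst_path)
    return len(weights)
```

After this one-time step, the stripped file contains only tensors, so it should pass a weights_only=True load.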

DN6 avatar Oct 31 '24 08:10 DN6

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Nov 24 '24 15:11 github-actions[bot]

Marking as closed due to:

  • this being a security hole to have in the library, like Dhruv said
  • weights_only=True is the official pytorch recommendation

To convert the weights, one can simply load the state dict with pytorch unsafe code (weights_only=False), and convert to safetensors following https://huggingface.co/docs/safetensors/en/index#save-tensors - this is highly recommended for safety reasons. If there's anything else we can help with, please let us know 🤗
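The conversion described above might be sketched as follows. `safetensors_path` and `convert_ckpt_to_safetensors` are hypothetical helper names, the top-level "state_dict" key is an assumption about the checkpoint layout, and the unrestricted load should only ever be run on a file from a trusted source.

```python
from pathlib import Path


def safetensors_path(ckpt_path):
    """Derive the output path by swapping the extension for .safetensors."""
    return Path(ckpt_path).with_suffix(".safetensors")


def convert_ckpt_to_safetensors(ckpt_path):
    """Load a trusted .ckpt with the unsafe loader and save its tensors as safetensors."""
    import torch  # imported lazily; both packages are needed only for the conversion
    from safetensors.torch import save_file

    # weights_only=False can execute arbitrary code: trusted files only.
    checkpoint = torch.load(ckpt_path, map_location="cpu", weights_only=False)
    # Assumption: the weights may be nested under a "state_dict" key.
    state_dict = checkpoint.get("state_dict", checkpoint)
    # Clone so no two entries share storage, which save_file rejects.
    tensors = {key: value.clone().contiguous()
               for key, value in state_dict.items()
               if isinstance(value, torch.Tensor)}
    out_path = safetensors_path(ckpt_path)
    save_file(tensors, str(out_path))
    return out_path
```

The resulting file can then be passed to StableDiffusionPipeline.from_single_file(), which the original report says works for .safetensors.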

a-r-r-o-w avatar Jan 27 '25 01:01 a-r-r-o-w