NotImplementedError: Cannot copy out of meta tensor; no data! because some weights of the model checkpoint were not used when initializing CLIPTextModelWithProjection
Describe the bug
I tried to load a .safetensors file and save it as a diffusers-format model, and I got the following warning: Some weights of the model checkpoint were not used when initializing CLIPTextModelWithProjection: ['text_model.embeddings.position_ids']
import torch
from diffusers import StableDiffusionXLPipeline
from diffusers import DPMSolverMultistepScheduler

model_params = {
    "pretrained_model_link_or_path": "/workspace/playground-v2.5-1024px-aesthetic.fp16.safetensors",
    "torch_dtype": torch.float16,
}
pipe = StableDiffusionXLPipeline.from_single_file(**model_params)
pipe.save_pretrained(
    save_directory="/workspace/playground-v2.5.fp16",
    safe_serialization=True,
    variant="fp16",
    push_to_hub=False,
)
When I try to load it back, I get the NotImplementedError: Cannot copy out of meta tensor; no data! error:
pipe = StableDiffusionXLPipeline.from_pretrained("/workspace/playground-v2.5.fp16", torch_dtype=torch.float16, variant="fp16")
pipe = pipe.to("cuda")
Reproduction
import torch
from diffusers import StableDiffusionXLPipeline
from diffusers import DPMSolverMultistepScheduler

model_params = {
    "pretrained_model_link_or_path": "/workspace/playground-v2.5-1024px-aesthetic.fp16.safetensors",
    "torch_dtype": torch.float16,
}
pipe = StableDiffusionXLPipeline.from_single_file(**model_params)
pipe.save_pretrained(
    save_directory="/workspace/playground-v2.5.fp16",
    safe_serialization=True,
    variant="fp16",
    push_to_hub=False,
)
pipe = StableDiffusionXLPipeline.from_pretrained("/workspace/playground-v2.5.fp16", torch_dtype=torch.float16, variant="fp16")
pipe = pipe.to("cuda")
Logs
---------------------------------------------------------------------------
NotImplementedError Traceback (most recent call last)
Cell In[11], line 2
1 pipe = StableDiffusionXLPipeline.from_pretrained("/workspace/playground-v2.5.fp16", torch_dtype=torch.float16, variant="fp16")
----> 2 pipe = pipe.to("cuda")
3 pipe.safety_checker = None
4 # pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
5
6 #3
File /usr/local/lib/python3.10/dist-packages/diffusers/pipelines/pipeline_utils.py:418, in DiffusionPipeline.to(self, *args, **kwargs)
414 logger.warning(
415 f"The module '{module.__class__.__name__}' has been loaded in 8bit and moving it to {dtype} via `.to()` is not yet supported. Module is still on {module.device}."
416 )
417 else:
--> 418 module.to(device, dtype)
420 if (
421 module.dtype == torch.float16
422 and str(device) in ["cpu"]
423 and not silence_dtype_warnings
424 and not is_offloaded
425 ):
426 logger.warning(
427 "Pipelines loaded with `dtype=torch.float16` cannot run with `cpu` device. It"
428 " is not recommended to move them to `cpu` as running them will fail. Please make"
(...)
431 " `torch_dtype=torch.float16` argument, or use another device for inference."
432 )
File /usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py:2576, in PreTrainedModel.to(self, *args, **kwargs)
2571 if dtype_present_in_args:
2572 raise ValueError(
2573 "You cannot cast a GPTQ model in a new `dtype`. Make sure to load the model using `from_pretrained` using the desired"
2574 " `dtype` by passing the correct `torch_dtype` argument."
2575 )
-> 2576 return super().to(*args, **kwargs)
File /usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py:1145, in Module.to(self, *args, **kwargs)
1141 return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None,
1142 non_blocking, memory_format=convert_to_format)
1143 return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
-> 1145 return self._apply(convert)
File /usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py:797, in Module._apply(self, fn)
795 def _apply(self, fn):
796 for module in self.children():
--> 797 module._apply(fn)
799 def compute_should_use_set_data(tensor, tensor_applied):
800 if torch._has_compatible_shallow_copy_type(tensor, tensor_applied):
801 # If the new tensor has compatible tensor type as the existing tensor,
802 # the current behavior is to change the tensor in-place using `.data =`,
(...)
807 # global flag to let the user control whether they want the future
808 # behavior of overwriting the existing tensor or not.
File /usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py:797, in Module._apply(self, fn)
795 def _apply(self, fn):
796 for module in self.children():
--> 797 module._apply(fn)
799 def compute_should_use_set_data(tensor, tensor_applied):
800 if torch._has_compatible_shallow_copy_type(tensor, tensor_applied):
801 # If the new tensor has compatible tensor type as the existing tensor,
802 # the current behavior is to change the tensor in-place using `.data =`,
(...)
807 # global flag to let the user control whether they want the future
808 # behavior of overwriting the existing tensor or not.
[... skipping similar frames: Module._apply at line 797 (3 times)]
File /usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py:797, in Module._apply(self, fn)
795 def _apply(self, fn):
796 for module in self.children():
--> 797 module._apply(fn)
799 def compute_should_use_set_data(tensor, tensor_applied):
800 if torch._has_compatible_shallow_copy_type(tensor, tensor_applied):
801 # If the new tensor has compatible tensor type as the existing tensor,
802 # the current behavior is to change the tensor in-place using `.data =`,
(...)
807 # global flag to let the user control whether they want the future
808 # behavior of overwriting the existing tensor or not.
File /usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py:820, in Module._apply(self, fn)
816 # Tensors stored in modules are graph leaves, and we don't want to
817 # track autograd history of `param_applied`, so we have to use
818 # `with torch.no_grad():`
819 with torch.no_grad():
--> 820 param_applied = fn(param)
821 should_use_set_data = compute_should_use_set_data(param, param_applied)
822 if should_use_set_data:
File /usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py:1143, in Module.to.<locals>.convert(t)
1140 if convert_to_format is not None and t.dim() in (4, 5):
1141 return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None,
1142 non_blocking, memory_format=convert_to_format)
-> 1143 return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
NotImplementedError: Cannot copy out of meta tensor; no data!
System Info
- diffusers version: 0.27.0
- Platform: Linux-5.15.0-79-generic-x86_64-with-glibc2.35
- Python version: 3.10.12
- PyTorch version (GPU?): 2.0.1+cu118 (True)
- Huggingface_hub version: 0.22.1
- Transformers version: 4.39.1
- Accelerate version: 0.28.0
- xFormers version: not installed
- Using GPU in script?:
- Using distributed or parallel set-up in script?:
Who can help?
@yiyixuxu @sayakpaul @DN6
How was pipe initialized?
pipe = StableDiffusionXLPipeline.from_single_file(**model_params)

@sayakpaul
Where "playground-v2.5-1024px-aesthetic.fp16.safetensors" is coming from?
Where "playground-v2.5-1024px-aesthetic.fp16.safetensors" is coming from?
I downloaded it from this link: https://huggingface.co/playgroundai/playground-v2.5-1024px-aesthetic/resolve/main/playground-v2.5-1024px-aesthetic.fp16.safetensors
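For reference, the same file can be fetched programmatically; a small sketch using huggingface_hub (which ships as a diffusers dependency):

from huggingface_hub import hf_hub_download

# Downloads into the local HF cache and returns the resolved file path
path = hf_hub_download(
    repo_id="playgroundai/playground-v2.5-1024px-aesthetic",
    filename="playground-v2.5-1024px-aesthetic.fp16.safetensors",
)
print(path)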
Cc: @DN6 could you take a look?
I still don't know why this happens, but at least I know that text_encoder_2 fails to load. So if someone is experiencing a similar issue, you might find this workaround helpful (although I must admit it's not the most efficient approach).
import torch
from diffusers import StableDiffusionXLPipeline
from diffusers import DPMSolverMultistepScheduler
from transformers import CLIPTextModelWithProjection
from safetensors.torch import load_file as safe_load
from diffusers.pipelines.stable_diffusion.convert_from_ckpt import convert_open_clip_checkpoint

checkpoint = safe_load("/workspace/playground-v2.5-1024px-aesthetic.fp16.safetensors", device="cpu")

# NOTE: this while loop isn't great but this controlnet checkpoint has one additional
# "state_dict" key https://huggingface.co/thibaud/controlnet-canny-sd21
while "state_dict" in checkpoint:
    checkpoint = checkpoint["state_dict"]

# Convert the OpenCLIP weights in the checkpoint into a CLIPTextModelWithProjection
# to use as the pipeline's text_encoder_2
config_name = "laion/CLIP-ViT-bigG-14-laion2B-39B-b160k"
config_kwargs = {"projection_dim": 1280}
text_encoder_2 = convert_open_clip_checkpoint(
    checkpoint,
    config_name,
    prefix="conditioner.embedders.1.model.",
    has_projection=True,
    local_files_only=False,
    **config_kwargs,
)

model_params = {
    "pretrained_model_link_or_path": "/workspace/playground-v2.5-1024px-aesthetic.fp16.safetensors",
    "torch_dtype": torch.float16,
    "use_safetensors": True,
    "add_watermarker": False,
}
pipe = StableDiffusionXLPipeline.from_single_file(**model_params)
# Replace the broken text_encoder_2 with the manually converted one
pipe.text_encoder_2 = text_encoder_2
pipe.save_pretrained(
    save_directory="/workspace/playground-v2.5.fp16",
    safe_serialization=True,
    variant="fp16",
    push_to_hub=False,
)
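To verify that the workaround actually fixed the saved checkpoint, one can reload the converted pipeline and move it to the GPU; a quick sanity-check sketch (assuming a CUDA device is available):

import torch
from diffusers import StableDiffusionXLPipeline

# Reload the freshly converted pipeline; this is the step that previously raised
# "NotImplementedError: Cannot copy out of meta tensor; no data!"
pipe = StableDiffusionXLPipeline.from_pretrained(
    "/workspace/playground-v2.5.fp16", torch_dtype=torch.float16, variant="fp16"
)
pipe = pipe.to("cuda")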
Hmm strange. Some tensors are not being saved in the OpenCLIP model when calling save_pretrained. Taking a look.
Hi @shabri-arrahim, I tracked the issue down to these lines: https://github.com/huggingface/diffusers/blob/2b04ec2ff7270d2044410378b04d85a194fa3d4a/src/diffusers/loaders/single_file_utils.py#L1238-L1240
When accelerate is installed and we save to safetensors, we attempt to save those weights as shared tensors (which the safetensors format currently doesn't support), so they are omitted and end up as meta tensors, which leads to the error when you try to load the model.
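To illustrate the shared-tensor limitation mentioned above, here is a minimal sketch (independent of diffusers) showing that safetensors refuses to serialize two entries backed by the same storage:

import torch
from safetensors.torch import save_file

weight = torch.zeros(4, 4)
# Both keys point at the same underlying storage, i.e. a shared tensor
state_dict = {"encoder.weight": weight, "decoder.weight": weight}

# Raises an error (a RuntimeError in recent safetensors versions) complaining
# that some tensors share memory and cannot be saved as-is
save_file(state_dict, "shared.safetensors")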
I'll include a fix for this in #7496.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Hello, I used EveryDream2 to train a model, but I can't load the trained model.
Before converting, I get:

Please load the component before passing it in as an argument to from_single_file.

text_encoder = CLIPTextModel.from_pretrained('...')
pipe = StableDiffusionControlNetPipeline.from_single_file(

After converting, it still doesn't work:
NotImplementedError: Cannot copy out of meta tensor; no data!
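The error message above suggests instantiating the component yourself and passing it to from_single_file. A sketch of what that might look like (the checkpoint path and repo ids here are placeholders, not taken from the original report):

from transformers import CLIPTextModel
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# Hypothetical ids/paths; substitute your own checkpoint and components
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")
controlnet = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny")
pipe = StableDiffusionControlNetPipeline.from_single_file(
    "/path/to/trained-model.safetensors",
    text_encoder=text_encoder,
    controlnet=controlnet,
)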
@crapthings could you create a separate issue please with a reproducible code example (no screenshots). Not sure if your problem is related.
Reverting to 0.27.2 works.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.