
Adding support for `safetensors` and LoRA.

Narsil opened this pull request 2 years ago • 2 comments

Enabling safetensors support for the LoRA files:

Asked here: https://github.com/huggingface/safetensors/issues/180

Same behavior as for the regular model weights.

If:

  • `safetensors` is installed, and
  • the repo has a safetensors LoRA file,

then it is the default.

Adding `safe_serialization` on LoRA creation so users can default to saving in the safetensors format.

What's technically missing is the option to choose the format on load, along with the `weights_name`. I didn't want to add it here for simplicity (since most users should be using the default anyway), but we could add that.
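
For illustration, a minimal sketch of the intended round trip (assuming a `unet` that already has LoRA attention processors set; the directory name is arbitrary):

# Sketch, not part of this PR's diff: assumes `unet` already carries
# LoRAAttnProcessor instances (a full setup script appears later in this thread).

# safe_serialization=True writes pytorch_lora_weights.safetensors
# instead of pytorch_lora_weights.bin.
unet.save_attn_procs("./lora_out", safe_serialization=True)

# On load, if safetensors is installed and the safetensors file is present,
# it is picked by default.
unet.load_attn_procs("./lora_out")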

Narsil avatar Feb 21 '23 10:02 Narsil

The documentation is not available anymore as the PR was closed or merged.

I don't think the failing test is linked to this PR, is it?

Narsil avatar Feb 21 '23 11:02 Narsil

Hi @Narsil! The code looks good to me, and the failing tests have nothing to do with it.

For additional context on top of the safetensors issue you mentioned, people are interested in converting LoRA weights generated with other tools. See for example: https://github.com/huggingface/diffusers/issues/2363, https://github.com/huggingface/diffusers/pull/2403. And also the other way around (being able to use diffusers LoRA weights in other tools): https://github.com/huggingface/diffusers/issues/2326.

Focusing on the first task (converting from other tools to diffusers), I'm not sure how hard the problem is. In addition to changes in key names, some tools seem to do pivotal tuning or textual inversion on top of the cross-attention layers that our implementation trains. That PR seems to save the full pipeline instead of attempting to convert the incremental weights.
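
To make the key-name part concrete, the conversion is essentially a renaming pass over the state dict. A sketch, assuming kohya-style input keys; the mapping is illustrative only and real converters (see the linked issues) handle many more cases:

def rename_sd_lora_key(key: str) -> str:
    # Illustrative only: real converters also handle block indices,
    # text-encoder keys, and per-layer alpha scaling.
    key = key.replace("lora_unet_", "")     # drop the kohya prefix
    key = key.replace("lora_down", "down")  # lora_down.weight -> down.weight
    key = key.replace("lora_up", "up")      # lora_up.weight   -> up.weight
    return key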

TL;DR: I think this PR is quite useful and necessary, but I'm not sure it will help towards the issue you mentioned :) (But I may be wrong; I still have to find the time to test this sort of interoperability).

pcuenca avatar Feb 21 '23 19:02 pcuenca

Shall I merge?

Narsil avatar Feb 24 '23 09:02 Narsil

Please merge.

jochemstoel avatar Feb 24 '23 10:02 jochemstoel

I would like to have @patil-suraj also review it. But he is currently on leave and should be back early next week. If it can wait, I would like to wait until then.

sayakpaul avatar Feb 24 '23 10:02 sayakpaul

Done.

Narsil avatar Mar 03 '23 12:03 Narsil

Hey all, thanks for all the amazing work here.

I need to spend a bit more time on this, but I think this introduces a regression. I have an integration test that calls:

# https://huggingface.co/patrickvonplaten/lora_dreambooth_dog_example/resolve/main/pytorch_lora_weights.bin
pipe.unet.load_attn_procs("pytorch_lora_weights.bin")

and is now failing with:

  File "/api/diffusers/src/diffusers/loaders.py", line 170, in load_attn_procs
    state_dict = safetensors.torch.load_file(model_file, device="cpu")
  File "/opt/conda/envs/xformers/lib/python3.9/site-packages/safetensors/torch.py", line 98, in load_file
    with safe_open(filename, framework="pt", device=device) as f:
Exception: Error while deserializing header: HeaderTooLarge

I guess because it's trying to load a non-safetensors file with safetensors.torch.load_file? Here's the relevant code (the last line is the failing one):

https://github.com/huggingface/diffusers/blob/1f4deb697f6204ae5da788b1e10c3074208ee57c/src/diffusers/loaders.py#L153-L170

No exception is raised earlier because the file does indeed exist; it's just not in safetensors format. So I guess we need a safe_serialization or from_safetensors kwarg, or some such?

Looking at the surrounding code, the easy way around this is to call torch.load() myself and pass the state_dict (see the sketch below), but I don't think that's a good approach. Let me know if I'm doing anything wrong, but I think I should be able to specify an exact file, right?
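
For reference, the workaround would look something like this (a sketch; it assumes load_attn_procs also accepts an in-memory state dict in place of a path, which the surrounding code suggests):

import torch

# Bypass the format detection by deserializing the .bin file ourselves
# and handing the state dict straight to load_attn_procs.
state_dict = torch.load("pytorch_lora_weights.bin", map_location="cpu")
pipe.unet.load_attn_procs(state_dict)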

Looking back at the original PR intro I guess this is exactly what @Narsil says:

What's technically missing is the option to choose the format on load, along with the `weights_name`. I didn't want to add it here for simplicity (since most users should be using the default anyway), but we could add that.

So should I open a new issue for this?

Thanks!

gadicc avatar Mar 03 '23 18:03 gadicc

Oops! Thanks for notifying me. I created a fix here: https://github.com/huggingface/diffusers/pull/2551

Narsil avatar Mar 04 '23 10:03 Narsil

So fast! Thanks, @Narsil! :pray:

gadicc avatar Mar 04 '23 10:03 gadicc

`Exception: Error while deserializing header: HeaderTooLarge` still appears in the LoRA workflow (train + test)

Ir1d avatar Mar 20 '23 02:03 Ir1d

@Ir1d can you provide a reproducible workflow (ideally fast to execute)?

Narsil avatar Mar 20 '23 09:03 Narsil

@Narsil would something like this work now?

import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler

model = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
model.scheduler = DPMSolverMultistepScheduler.from_config(model.scheduler.config)
model.unet.load_attn_procs(model_path, use_safetensors=True)  # model_path = 'xxx.safetensors'

kilimchoi avatar Mar 21 '23 14:03 kilimchoi

@Narsil would something like this work now?

import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler

model = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
model.scheduler = DPMSolverMultistepScheduler.from_config(model.scheduler.config)
model.unet.load_attn_procs(model_path, use_safetensors=True)  # model_path = 'xxx.safetensors'

I tried this in my env but it's not working. I am also wondering a way to load .safetensors instead of the pytorch_lora_weights.bin, any ideas?

teaguexiao avatar Mar 22 '23 08:03 teaguexiao

Do you have links to the model_path you're referring to?

Here is a modified version of your script that creates a proper LoRA safetensors file:

import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler
from diffusers.models.attention_processor import LoRAAttnProcessor

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)

model = pipe.unet
lora_attn_procs = {}
for name in model.attn_processors.keys():
    cross_attention_dim = None if name.endswith("attn1.processor") else model.config.cross_attention_dim
    if name.startswith("mid_block"):
        hidden_size = model.config.block_out_channels[-1]
    elif name.startswith("up_blocks"):
        block_id = int(name[len("up_blocks.")])
        hidden_size = list(reversed(model.config.block_out_channels))[block_id]
    elif name.startswith("down_blocks"):
        block_id = int(name[len("down_blocks.")])
        hidden_size = model.config.block_out_channels[block_id]

    lora_attn_procs[name] = LoRAAttnProcessor(
        hidden_size=hidden_size, cross_attention_dim=cross_attention_dim
    )
    lora_attn_procs[name] = lora_attn_procs[name].to(model.device)

    # add 1 to weights to mock trained weights
    with torch.no_grad():
        lora_attn_procs[name].to_q_lora.up.weight += 1
        lora_attn_procs[name].to_k_lora.up.weight += 1
        lora_attn_procs[name].to_v_lora.up.weight += 1
        lora_attn_procs[name].to_out_lora.up.weight += 1

model.set_attn_processor(lora_attn_procs)
model.save_attn_procs("./out", safe_serialization=True)
model.load_attn_procs("./out", use_safetensors=True)  # loads ./out/pytorch_lora_weights.safetensors

This should work? The HeaderTooLarge error seems to indicate that the file you have is corrupted in some way, or isn't a safetensors file to begin with.
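
If you want to sanity-check a file, here is a small sketch (based on the safetensors layout: an 8-byte little-endian header length followed by that many bytes of JSON; the size cutoff below is an arbitrary assumption):

import json
import struct

def looks_like_safetensors(path: str) -> bool:
    # A safetensors file starts with an 8-byte little-endian header length,
    # followed by that many bytes of JSON metadata.
    with open(path, "rb") as f:
        prefix = f.read(8)
        if len(prefix) < 8:
            return False
        header_len = struct.unpack("<Q", prefix)[0]
        if header_len > 100_000_000:  # implausible length -> "HeaderTooLarge"
            return False
        try:
            json.loads(f.read(header_len))
            return True
        except ValueError:  # not valid JSON -> not safetensors
            return False

print(looks_like_safetensors("pytorch_lora_weights.bin"))  # False for a torch pickle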

Narsil avatar Mar 22 '23 10:03 Narsil

So our current workflow is to use convert_lora_safetensor_to_diffusers.py to merge a LoRA into its base model, and then, if we want to separate it out and use it like a native LoRA in diffusers, we use the script above? @Narsil

fecet avatar Mar 23 '23 09:03 fecet

So our current workflow is to use convert_lora_safetensor_to_diffusers.py to merge a LoRA into its base model, and then, if we want to separate it out and use it like a native LoRA in diffusers, we use the script above? @Narsil

Sorry, I'm not familiar with this workflow nor this particular script. Do you have a script + workflow I could run to try and reproduce the faulty file? My guess is that something went wrong during conversion, leading to a bad file, since everything looks relatively straightforward in that script. That, or safe_serialization=True wasn't working properly somehow.

Narsil avatar Mar 23 '23 09:03 Narsil

Sorry, I'm not familiar with this workflow nor this particular script. Do you have a script + workflow I could run to try and reproduce the faulty file? My guess is that something went wrong during conversion, leading to a bad file, since everything looks relatively straightforward in that script. That, or safe_serialization=True wasn't working properly somehow.

Sorry for the misunderstanding. I'm trying to use a LoRA/ckpt from civitai with diffusers, and I wonder what the correct way is. My attempt was to download a civitai weight and convert it with this script. That works perfectly, and I can load the result with StableDiffusionPipeline.from_pretrained.

Then I would like to use it with a LoRA; the suggested way seems to be

pipe.unet.load_attn_procs("lora_path", use_safetensors=True)

but that raised a KeyError: 'to_k_lora.down.weight'.

https://github.com/huggingface/diffusers/blob/main/scripts/convert_lora_safetensor_to_diffusers.py can merge it with its base model. This works, but gives me a huge model that cannot easily be used with other base models. I wonder if we need to fuse it with the base model first and save it in the diffusers format using the script above in order to obtain a lightweight LoRA replica (like the original standalone file).

fecet avatar Mar 23 '23 09:03 fecet

but that raised a KeyError: 'to_k_lora.down.weight'.

This means the LoRA is still in SD format, and you need to convert it to the diffusers format, I guess (a quick check is sketched below). @pcuenca might know more?
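
A quick way to tell which format a file is in is to print a few key names (a sketch; the example key strings are illustrative):

from safetensors.torch import load_file

state_dict = load_file("lora.safetensors")
print(list(state_dict.keys())[:3])

# diffusers-native LoRA keys look roughly like:
#   down_blocks.0.attentions.0.transformer_blocks.0.attn2.processor.to_k_lora.down.weight
# kohya/SD-webui keys look roughly like:
#   lora_unet_down_blocks_0_attentions_0_transformer_blocks_0_attn2_to_k.lora_down.weight
# The second style needs converting before load_attn_procs can consume it.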

Narsil avatar Mar 23 '23 11:03 Narsil

I want to know how to do this too.

Youngboy12 avatar Mar 28 '23 10:03 Youngboy12

Duplicate of https://github.com/huggingface/diffusers/pull/2551#issuecomment-1487134319 => let's make this a feature request

patrickvonplaten avatar Mar 28 '23 16:03 patrickvonplaten