diffusers
Adding support for `safetensors` and LoRA.
Enabling safetensors support for the LoRA files:
Asked here: https://github.com/huggingface/safetensors/issues/180
Same approach as for the regular model weights.
If:
- safetensors is installed
- the repo has the safetensors LoRA file

then it is the default.
Adding safe_serialization on LoRA creation so users can default to saving in the safetensors format.
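For illustration, a minimal sketch of the intended behaviour (the "my-lora" directory is a placeholder, and it assumes the UNet already carries trained LoRA attention processors):

import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)

# Assuming pipe.unet already has LoRA attention processors set (e.g. after training):
# safe_serialization=True writes a safetensors file instead of a pickled .bin file.
pipe.unet.save_attn_procs("my-lora", safe_serialization=True)

# On load, if safetensors is installed and the folder/repo contains the
# safetensors weights, they are picked up by default.
pipe.unet.load_attn_procs("my-lora")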
What's technically missing is the option to choose the format on load, along with the weights_name. I didn't want to add it here for simplicity (since most users should be using the default anyway). But we could add that.
The documentation is not available anymore as the PR was closed or merged.
I don't think the failing test is linked to this PR, is it ?
Hi @Narsil! The code looks good to me, and the failing tests have nothing to do with it.
For additional context on top of the safetensors issue you mentioned, people are interested in converting LoRA weights generated with other tools. See for example: https://github.com/huggingface/diffusers/issues/2363, https://github.com/huggingface/diffusers/pull/2403. And also the other way around (being able to use diffusers LoRA weights in other tools): https://github.com/huggingface/diffusers/issues/2326.
Focusing on the first task (converting from other tools to diffusers), I'm not sure how hard the problem is. In addition to changes in key names, some tools seem to do pivotal tuning or textual inversion on top of the cross-attention layers in our implementation. This PR seems to save the full pipeline instead of attempting to convert the incremental weights.
TL;DR: I think this PR is quite useful and necessary, but not sure if it will help towards the issue you mentioned :) (But I may be wrong, I still have to find the time to test this sort of interoperability).
Shall I merge ?
Please merge.
I would like to have @patil-suraj also review it. But he is currently on leave and should be back early next week. If it can wait, I would like to wait for that while.
Done.
Hey all, thanks for all the amazing work here.
I need to spend a bit more time on this but I think this introduces a regression. I have an integration test that calls:
// https://huggingface.co/patrickvonplaten/lora_dreambooth_dog_example/resolve/main/pytorch_lora_weights.bin
pipe.unet.load_attn_procs("pytorch_lora_weights.bin")
and is now failing with:
File "/api/diffusers/src/diffusers/loaders.py", line 170, in load_attn_procs
state_dict = safetensors.torch.load_file(model_file, device="cpu")
File "/opt/conda/envs/xformers/lib/python3.9/site-packages/safetensors/torch.py", line 98, in load_file
with safe_open(filename, framework="pt", device=device) as f:
Exception: Error while deserializing header: HeaderTooLarge
I guess because it's trying to load a non-safetensors file with safetensors.torch.load_file? Here's the relevant code (last line is the failing line):
https://github.com/huggingface/diffusers/blob/1f4deb697f6204ae5da788b1e10c3074208ee57c/src/diffusers/loaders.py#L153-L170
There's no exception raised because the file indeed exists, it's just not in safetensors format. So I guess we need a safe_serialization or from_safetensors or some such kwarg, maybe?
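Something along those lines could look like this (only a sketch; use_safetensors is a hypothetical kwarg name here, not necessarily what the fix ends up exposing):

import torch
import safetensors.torch

def load_lora_state_dict(model_file, use_safetensors=None):
    # Hypothetical helper: pick the deserializer from an explicit flag,
    # falling back to the file extension when no flag is given.
    if use_safetensors is None:
        use_safetensors = model_file.endswith(".safetensors")
    if use_safetensors:
        return safetensors.torch.load_file(model_file, device="cpu")
    return torch.load(model_file, map_location="cpu")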
Looking at the surrounding code, the easy way around this is to call torch.load() myself and pass the state_dict, but I don't think that's a good approach. Let me know if I'm doing anything wrong, but I think I should be able to specify an exact file, right?
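For reference, that workaround looks roughly like this (a sketch; it assumes load_attn_procs also accepts an already-loaded state dict, and pipe is the pipeline from the integration test above):

import torch

# Deserialize the pickled .bin file manually...
state_dict = torch.load("pytorch_lora_weights.bin", map_location="cpu")
# ...and hand the state dict to load_attn_procs instead of a file path.
pipe.unet.load_attn_procs(state_dict)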
Looking back at the original PR intro I guess this is exactly what @Narsil says:
What's technically missing is the option to choose the format on load, along with the weights_name. I didn't want to add it here for simplicity (since most users should be using the default anyway). But we could add that.
So should I open a new issue for this?
Thanks!
Ooops ! Thanks for notifying. I created a fix here: https://github.com/huggingface/diffusers/pull/2551
So fast! Thanks, @Narsil! :pray:
Exception: Error while deserializing header: HeaderTooLarge
still appears for the lora workflow (train + test)
@Ir1d can you provide a reproducible workflow (ideally fast to execute) ?
@Narsil would something like this work now?
import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler

model = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
model.scheduler = DPMSolverMultistepScheduler.from_config(model.scheduler.config)
model.unet.load_attn_procs(model_path, use_safetensors=True)  # model_path = 'xxx.safetensors'
I tried this in my env but it's not working. I am also wondering how to load a .safetensors file instead of the pytorch_lora_weights.bin, any ideas?
Do you have links to the model_path you're referring to ?
Here is a modified version of your script that creates a proper LoRA safetensors file:
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler
import torch
from diffusers.models.attention_processor import LoRAAttnProcessor

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
model = pipe.unet

lora_attn_procs = {}
for name in model.attn_processors.keys():
    cross_attention_dim = None if name.endswith("attn1.processor") else model.config.cross_attention_dim
    if name.startswith("mid_block"):
        hidden_size = model.config.block_out_channels[-1]
    elif name.startswith("up_blocks"):
        block_id = int(name[len("up_blocks.")])
        hidden_size = list(reversed(model.config.block_out_channels))[block_id]
    elif name.startswith("down_blocks"):
        block_id = int(name[len("down_blocks.")])
        hidden_size = model.config.block_out_channels[block_id]

    lora_attn_procs[name] = LoRAAttnProcessor(
        hidden_size=hidden_size, cross_attention_dim=cross_attention_dim
    )
    lora_attn_procs[name] = lora_attn_procs[name].to(model.device)

    # add 1 to weights to mock trained weights
    with torch.no_grad():
        lora_attn_procs[name].to_q_lora.up.weight += 1
        lora_attn_procs[name].to_k_lora.up.weight += 1
        lora_attn_procs[name].to_v_lora.up.weight += 1
        lora_attn_procs[name].to_out_lora.up.weight += 1

model.set_attn_processor(lora_attn_procs)

model.save_attn_procs("./out", safe_serialization=True)
model.load_attn_procs("./out", use_safetensors=True)  # loads the safetensors file saved just above
This should work ? The HeaderTooLarge error seems to indicate that the file you have is corrupted in some way, or isn't a safetensors file to begin with.
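If it helps with debugging, a small check like this (just a sketch, with a placeholder path) tells a genuine safetensors file apart from anything else:

from safetensors import safe_open

path = "pytorch_lora_weights.safetensors"  # placeholder: file to inspect
try:
    with safe_open(path, framework="pt", device="cpu") as f:
        print("valid safetensors file, first keys:", list(f.keys())[:5])
except Exception as err:
    # A HeaderTooLarge (or similar) error here means the file is corrupted
    # or simply not in safetensors format.
    print("not a readable safetensors file:", err)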
So our current workflow is to use convert_lora_safetensor_to_diffusers.py to merge a LoRA into its base model, and then, if we want to separate it and use it like a native LoRA in diffusers, we use this script? @Narsil
Sorry, I'm not familiar with this workflow nor this particular script. Do you have some script+workflow I could run to try and reproduce the faulty file ? My guess is that something went wrong during conversion, leading to a bad file, since everything looks relatively straightforward in that script. That, or the safe_serialization=True wasn't working properly somehow.
Sorry for the misunderstanding. I'm trying to use a LoRA/ckpt from civitai with diffusers, and I wonder what the correct way is.
My attempt was to download a civitai weight and convert it with this script. That works perfectly, and I can load the result with StableDiffusionPipeline.from_pretrained.
Then I would like to use it with a LoRA; the suggested way seems to be
pipe.unet.load_attn_procs("lora_path", use_safetensors=True)
but that raised a KeyError: 'to_k_lora.down.weight'.
https://github.com/huggingface/diffusers/blob/main/scripts/convert_lora_safetensor_to_diffusers.py can merge it with its base model. This works but gives me a huge model, which cannot easily be used with other base models. I wonder if we need to fuse it with the base model first and save it in the diffusers format using the script above in order to obtain a lightweight LoRA replica (like it originally was).
but that raised a KeyError: 'to_k_lora.down.weight'.
This means the LoRA is still in SD format, and you need to change it to the diffusers format, I guess.
@pcuenca Might know more ?
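One quick way to see which format a LoRA file is in is to list its keys (a sketch with a placeholder path; the lora_unet_ / lora_te_ prefixes are the usual convention of the non-diffusers trainers, while diffusers expects keys like ...to_k_lora.down.weight):

from safetensors.torch import load_file

state_dict = load_file("lora_path.safetensors")  # placeholder path
for key in list(state_dict.keys())[:10]:
    print(key)
# Keys starting with "lora_unet_" / "lora_te_" usually indicate the SD/webui-style
# format; load_attn_procs expects the diffusers-style keys ending in
# "to_k_lora.down.weight" etc., hence the KeyError above.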
How can I do that?
Duplicate of https://github.com/huggingface/diffusers/pull/2551#issuecomment-1487134319 => let's make this a feature request