[LoRA] attempt at fixing onetrainer lora.
What does this PR do?
Fixes https://github.com/huggingface/diffusers/issues/8237.
# Reproduction for #8237: load the OneTrainer LoRA and run inference with it.
from diffusers import DiffusionPipeline
import torch

pipe_id = "stabilityai/stable-diffusion-xl-base-1.0"
pipe = DiffusionPipeline.from_pretrained(pipe_id, torch_dtype=torch.float16).to("cuda")
pipe.load_lora_weights("sayakpaul/diffusers-8237-lora", weight_name="lora.safetensors", adapter_name="engraving")

lora_scale = 0.9
prompt = "Engravings Henri Meyer"
image = pipe(
    prompt, num_inference_steps=30, cross_attention_kwargs={"scale": lora_scale}, generator=torch.manual_seed(0)
).images[0]
image
@BenjaminBossan I am seeing some unexpected warnings when running the above:
Loading adapter weights from None led to unexpected keys not found in the model: ['text_projection.lora.down.weight', 'text_projection.lora.up.weight'].
I have intentionally left some print statements in place to make debugging easier. Here are the logs, in order:
diffusers_name='text_projection.lora.down.weight', key='lora_te2_text_projection.lora_down.weight'
target_modules[:5]=['text_model.encoder.layers.3.self_attn.out_proj', 'text_model.encoder.layers.11.self_attn.v_proj', 'text_model.encoder.layers.6.self_attn.k_proj', 'text_model.encoder.layers.7.self_attn.k_proj', 'text_model.encoder.layers.6.mlp.fc1']
target_modules[:5]=['text_model.encoder.layers.16.mlp.fc1', 'text_model.encoder.layers.11.self_attn.v_proj', 'text_model.encoder.layers.7.self_attn.k_proj', 'text_model.encoder.layers.21.self_attn.v_proj', 'text_model.encoder.layers.26.mlp.fc2']
From load_lora_into_text_encoder: text_projection
Loading adapter weights from None led to unexpected keys not found in the model: ['text_projection.lora.down.weight', 'text_projection.lora.up.weight'].
I am on peft 0.11.1 and diffusers from the latest main. What am I missing here?
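For reference, a minimal sketch (not part of the original report) to dump the raw checkpoint keys and see where the text_projection entries come from; it reuses the repo and file name from the reproduction above:

from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

# Download the checkpoint and load the raw (unconverted) state dict.
path = hf_hub_download("sayakpaul/diffusers-8237-lora", "lora.safetensors")
state_dict = load_file(path)

# Prints e.g. 'lora_te2_text_projection.lora_down.weight'
for key in state_dict:
    if "text_projection" in key:
        print(key)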
I can replicate the results. When it comes to this message
Loading adapter weights from None led to unexpected keys not found in the model: ['text_projection.lora.down.weight', 'text_projection.lora.up.weight'].
at first I thought it came from diffusers or PEFT, but neither actually contains this warning. Instead, it seems to come from transformers:
https://github.com/huggingface/transformers/blob/5a74ae6dbe84da6017546ebd3765da6cd08dbc40/src/transformers/integrations/peft.py#L219-L222
Honestly, I'm not sure how exactly transformers comes into play here. Does diffusers use transformers under the hood to load the PEFT weights? Maybe ping @younesbelkada
When jumping into the transformers code, I checked for potentially missing keys there and found these two:
'text_projection.lora_A.engraving.weight', 'text_projection.lora_B.engraving.weight'
Note that engraving is the name of the adapter in this case. So I suspect that when remapping the keys from this checkpoint, we need to somehow map to these names.
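A quick sketch of the suspected mapping (hedged; the adapter name is the one from this run, and PEFT nests it between the lora_A/lora_B module and its weight):

# Suspected target naming: PEFT inserts the adapter name into the key.
adapter_name = "engraving"
suspected_remap = {
    "text_projection.lora.down.weight": f"text_projection.lora_A.{adapter_name}.weight",
    "text_projection.lora.up.weight": f"text_projection.lora_B.{adapter_name}.weight",
}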
Btw, I found a small typo while checking this:
https://github.com/huggingface/diffusers/blob/67b3fe0aaeddda19e8f61316a9b2fcceba4a4451/src/diffusers/loaders/lora_conversion_utils.py#L273
The following keys have not been correctly ~~be~~ renamed:
Honestly, I'm not sure how exactly transformers comes into play here. Does diffusers use transformers under the hood to load the PEFT weights?
It does. We rely solely on transformers for all things text. See here:
https://github.com/huggingface/diffusers/blob/67b3fe0aaeddda19e8f61316a9b2fcceba4a4451/src/diffusers/loaders/lora.py#L609
Note that engraving is the name of the adapter in this case. So I suspect that when remapping the keys from this checkpoint, we need to somehow map to these names.
Shouldn't that be handled by https://github.com/huggingface/diffusers/blob/67b3fe0aaeddda19e8f61316a9b2fcceba4a4451/src/diffusers/loaders/lora.py#L610 ?
I did some more digging and managed to make the unexpected keys disappear. For this, I had to use slightly different names for the state dict keys:
- te2_state_dict[diffusers_name] = state_dict.pop(key)
- te2_state_dict[diffusers_name.replace(".down.", ".up.")] = state_dict.pop(lora_name_up)
+ te2_state_dict["text_projection.lora_A.weight"] = state_dict.pop(key)
+ te2_state_dict["text_projection.lora_B.weight"] = state_dict.pop(lora_name_up)
So basically text_projection.lora.down.weight -> text_projection.lora_A.weight and text_projection.lora.up.weight -> text_projection.lora_B.weight. Not sure if this is the correct way of approaching things, but it seems to work.
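As a hedged sketch, the rename boils down to this (the helper is made up for illustration, not an actual diffusers function):

# Hypothetical helper capturing the rename that silences the warning.
def to_peft_naming(key: str) -> str:
    return key.replace(".lora.down.", ".lora_A.").replace(".lora.up.", ".lora_B.")

assert to_peft_naming("text_projection.lora.down.weight") == "text_projection.lora_A.weight"
assert to_peft_naming("text_projection.lora.up.weight") == "text_projection.lora_B.weight"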
Hmm. It's weird that we need to do this for this particular one. I will see if I can come up with something better.
Hmm. It's weird that we need to do this for this particular one. I will see if I can come up with something better.
Yes. IIUC, one is using (old?) diffusers nomenclature (lora.down, lora.up), whereas the other uses PEFT nomenclature (lora_A, lora_B). Is there another re-mapping step after this one that may somehow miss these particular keys?
There is. We decided not to touch this function because it's legacy, and instead worked on an intermediate remapping:
https://github.com/huggingface/diffusers/blob/edf5ba6a17d012411c1fe3ceaf24f71f1899bc48/src/diffusers/utils/state_dict_utils.py#L172
Then my best bet would be that some entries need to be added to this dict:
https://github.com/huggingface/diffusers/blob/edf5ba6a17d012411c1fe3ceaf24f71f1899bc48/src/diffusers/utils/state_dict_utils.py#L67
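Something along these lines, perhaps (a hedged sketch; the dict name is illustrative, not the actual constant in state_dict_utils.py):

# Candidate entries mapping old diffusers naming to PEFT naming.
EXTRA_DIFFUSERS_TO_PEFT = {
    ".lora.down.weight": ".lora_A.weight",
    ".lora.up.weight": ".lora_B.weight",
}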
That is what I am looking into at the moment.
@BenjaminBossan ready for review.
Thank you to all involved for prioritizing and resolving this issue superfast!