
LoRA load issue

D1-3105 opened this issue 5 months ago · 5 comments

Describe the bug

FluxPipeline fails to load LoRA weights (https://civitai.com/models/876464):

Reproduction

import torch
from diffusers import FluxPipeline

prompt_from_repo = "Beautiful female woodelf in shiny iron and silver armor, full body view, full body shot, feet and arms seen in the view, hyper photorealistic, ultra photorealistic, looking straight into camera, Elf, pointy ears, photorealistic, fighting pose, shoulder length wavy red hair, full body, athletic figure, beautiful green eyes, 8k, ultra realistic, hyper realistic, highly detailed, daylight, fantasy style, high contrast, colorful polychromatic, (Intricately Designed face: 1.3), dramatic lighting, wet and sweaty skin, dirty skin, Cinematic colorful lighting, realistic body and face, Photograph Taken on Nikon D750, Intricate, Elegant, Digital Illustration, Scenic, Hyper-Realistic, Unreal Engine, CryEngine, Octane Render, Artgerm, WLOP, Greg Rutkowski, 8k ultra high resolution concept art, hyper-defined, sharp focus, echoing, (hyperdetailed body and face: 1.3), asymmetric balance, Stunning and amazing.,Acrylian4,Lineart style:Charcoal lineart and white chalk lineart,Tannis, Tanis, Esmeralda , piercing multicolor colorful eye color,black ink lineart, dynamic pose, Amazing artwork, a Masterpiece painting"
negative_prompt_from_repo = "score_6,score_5,score_4, , poor artist, bad artist, ugly, poorly drawn, poorly detailed, warped, morphed, uncanny, horror, disfigured, score_4, score_5, score_6, source_furry, wrong hand, bad hands, bad anatomy, fewer digits, bad perspective, bad proportions, bad arm, extra arms, cross-eyed, logo, censored, blurry, lowres, artistic error, watermark, jaggy lines, deformed, monochrome, bad eyes embedding:EasyNegative"
lora_id = "D1-3105/lora_876464__issue11659__DEV-2308"
model_id = "black-forest-labs/FLUX.1-dev"

pipeline = FluxPipeline.from_pretrained(model_id, cache_dir="train_model/huggingface", torch_dtype=torch.bfloat16).to("cuda")

pipeline.load_lora_weights(lora_id)
assert pipeline.get_active_adapters()
image = pipeline(
    prompt=prompt_from_repo,
    negative_prompt=negative_prompt_from_repo,  # note: FluxPipeline only applies this when true_cfg_scale > 1
    width=768,
    height=1152,
    generator=torch.Generator("cuda").manual_seed(442779123),
    num_inference_steps=25,
    guidance_scale=25,
).images[0]

Logs

No LoRA keys associated to FluxTransformer2DModel found with the prefix='transformer'. This is safe to ignore if LoRA state dict didn't originally have any FluxTransformer2DModel related params. You can also try specifying `prefix=None` to resolve the warning. Otherwise, open an issue if you think it's unexpected: https://github.com/huggingface/diffusers/issues/new
No LoRA keys associated to CLIPTextModel found with the prefix='text_encoder'. This is safe to ignore if LoRA state dict didn't originally have any CLIPTextModel related params. You can also try specifying `prefix=None` to resolve the warning. Otherwise, open an issue if you think it's unexpected: https://github.com/huggingface/diffusers/issues/new

System Info

diffusers==0.33.1

Who can help?

@sayakpaul

D1-3105 avatar Jun 04 '25 22:06 D1-3105

Could you upload the file to a Hugging Face Hub repository and update the reproduction code to use it, please?

sayakpaul avatar Jun 05 '25 01:06 sayakpaul

Done

D1-3105 avatar Jun 05 '25 19:06 D1-3105

The underlying state dict appears to be LoKr, NOT LoRA. This is not supported yet.

@BenjaminBossan what would be the best way to support this through the existing functionality we have in diffusers and peft? I am guessing we will have to defer to LoKrConfig and do something similar to what we do for LoRA. In that case, I think we will need to think about the entrypoint for this, i.e., whether to go through load_lora_weights() or introduce load_lokr_weights(), etc.
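For reference, the adapter type of a checkpoint can usually be told apart from its key suffixes: PEFT-style LoRA uses `lora_A`/`lora_B` (Kohya-style uses `lora_down`/`lora_up`), while LoKr uses `lokr_w1`/`lokr_w2`. A minimal, hypothetical detection sketch (the helper name and the synthetic keys below are illustrative, not diffusers API):

```python
# Hypothetical helper: classify an adapter state dict by its key suffixes.
# Marker lists follow common PEFT/Kohya naming conventions; adjust as needed.
LORA_MARKERS = ("lora_A", "lora_B", "lora_down", "lora_up")
LOKR_MARKERS = ("lokr_w1", "lokr_w2")

def detect_adapter_type(state_dict: dict) -> str:
    keys = list(state_dict)
    if any(m in k for k in keys for m in LOKR_MARKERS):
        return "lokr"
    if any(m in k for k in keys for m in LORA_MARKERS):
        return "lora"
    return "unknown"

# Synthetic key names mimicking real checkpoints.
lora_sd = {"transformer.single_blocks.0.attn.to_q.lora_A.weight": None}
lokr_sd = {"lycoris_blocks_0_attn_to_q.lokr_w1": None}

print(detect_adapter_type(lora_sd))  # lora
print(detect_adapter_type(lokr_sd))  # lokr
```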

sayakpaul avatar Jun 06 '25 06:06 sayakpaul

From the PEFT side of things, LoRA, LoKr, LoHa, etc. mostly share the same interface, so hopefully that part is already well covered. As an example, code like this should just work:

https://github.com/huggingface/diffusers/blob/16c955c5fdff7dc427488eb691411bcb2bedd68d/src/diffusers/utils/peft_utils.py#L207-L216

Some methods like scale_layer would need to be added for LoKr and LoHa, but that should not be too difficult.

There is a lot of code in diffusers that is LoRA-specific, though, in particular nearly everything in lora_base.py. On top of that, there is the API question: whether we want load_lokr_weights() etc., or rather a more generic load_adapter like in transformers.

Overall, I think supporting this would be quite a lot of work, even if we don't aim for feature parity with LoRA. I can't say if it's worth it or if LoKr et al. are too niche.

BenjaminBossan avatar Jun 06 '25 09:06 BenjaminBossan

@DN6 what's your take? I don't mind the work but need to discuss the entrypoints for this a bit.

We could aim for load_adapter() on the pipeline level and delegate to public methods like load_lora_weights() and load_lokr_weights(), etc. for example.
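To make the entrypoint question concrete, the dispatch being discussed could look roughly like the following. This is an entirely hypothetical sketch: neither load_adapter() nor load_lokr_weights() exists in diffusers today, and the method bodies are stand-ins.

```python
# Hypothetical pipeline-level dispatcher; none of these method names are
# actual diffusers API. It only illustrates the proposed entrypoint shape:
# a generic load_adapter() that delegates to a type-specific public method.
class AdapterLoaderMixin:
    def load_adapter(self, state_dict: dict, **kwargs):
        if any("lokr_w1" in k or "lokr_w2" in k for k in state_dict):
            return self.load_lokr_weights(state_dict, **kwargs)
        return self.load_lora_weights(state_dict, **kwargs)

    def load_lora_weights(self, state_dict, **kwargs):
        return "lora"  # stand-in for the real LoRA loading logic

    def load_lokr_weights(self, state_dict, **kwargs):
        return "lokr"  # stand-in for hypothetical LoKr loading logic

loader = AdapterLoaderMixin()
print(loader.load_adapter({"x.lokr_w1": None}))       # lokr
print(loader.load_adapter({"x.lora_A.weight": None})) # lora
```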

sayakpaul avatar Jun 06 '25 09:06 sayakpaul