
Add B-Lora training option to the advanced dreambooth lora script

Open linoytsaban opened this issue 10 months ago • 4 comments

Adds an option to train a DreamBooth LoRA using a single reference image while targeting specific UNet blocks only, subsequently allowing for easy combination of style/content LoRAs, as proposed in Implicit Style-Content Separation using B-LoRA.
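For context, here's a minimal sketch of the block-restricted training setup, assuming the peft backend; the block names and config values are illustrative assumptions, not the exact code in the script:

```python
# Rough sketch: attach LoRA layers only to attention modules inside chosen UNet blocks.
from diffusers import UNet2DConditionModel
from peft import LoraConfig

unet = UNet2DConditionModel.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", subfolder="unet"
)

# Keep only the attention projections that live inside the chosen blocks
# (block names are illustrative).
target_blocks = ["up_blocks.0.attentions.0", "up_blocks.0.attentions.1"]
target_modules = [
    name
    for name, _ in unet.named_modules()
    if any(block in name for block in target_blocks)
    and name.endswith(("to_k", "to_q", "to_v", "to_out.0"))
]

unet_lora_config = LoraConfig(
    r=64,
    lora_alpha=64,
    init_lora_weights="gaussian",
    target_modules=target_modules,
)
unet.add_adapter(unet_lora_config)  # only the selected blocks get LoRA layers
```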

linoytsaban avatar Apr 22 '24 15:04 linoytsaban

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Some comparisons with InstantStyle: [image: Group 1-13]

While B-LoRA requires optimization, it can produce more robust results depending on the use case, and the optimization is very light (i.e., I used the default settings for all experiments, which consist of a single training image and 1000 training steps).

Another cool feature is that once a B-LoRA is trained, you can load it either for content or for style (based on the blocks loaded), as well as mix & match it with other LoRAs, for example: [image: Group 2-8]

[image: Group 3-6]
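To make the mix & match idea concrete, here's a rough sketch of how that could look at inference time (paths are hypothetical and the block name/filtering logic are assumptions, not the exact PR code):

```python
# Rough sketch: keep only the "content" block of a trained B-LoRA, then combine
# it with a separately trained style LoRA.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Filter the B-LoRA state dict down to the block of interest (illustrative name).
content_block = "unet.up_blocks.0.attentions.0"
state_dict, _ = pipe.lora_state_dict("path/to/b_lora")  # hypothetical path
content_sd = {k: v for k, v in state_dict.items() if content_block in k}

pipe.load_lora_weights(content_sd, adapter_name="content")
pipe.load_lora_weights("path/to/style_lora", adapter_name="style")  # hypothetical path
pipe.set_adapters(["content", "style"], adapter_weights=[1.0, 1.0])

image = pipe("a [v] dog in watercolor style", num_inference_steps=30).images[0]
```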

linoytsaban avatar Apr 24 '24 12:04 linoytsaban

I was thinking of doing a guide/blog about LoRA training and this would be really cool to add.

There's one thing on my mind though: I really wouldn't restrict it to just B-LoRA or specific layers, but since the training scripts are examples, maybe the simpler the better.

I read somewhere that you can test a trained LoRA by loading specific layers/blocks to find the best results, and then re-train it using those layers to make an even better LoRA. That's the use case I want to write about, but I can just modify this script to do that.
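For what it's worth, a minimal sketch of that test-then-retrain workflow (paths and block names are hypothetical, not code from this PR): load the same trained LoRA several times, each time keeping only one block, and compare the generations to decide which blocks to re-train on.

```python
# Rough sketch: compare the effect of individual UNet blocks from one trained LoRA.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

state_dict, _ = pipe.lora_state_dict("path/to/trained_lora")  # hypothetical path
candidate_blocks = [
    "unet.up_blocks.0.attentions.0",
    "unet.up_blocks.0.attentions.1",
    "unet.mid_block.attentions.0",
]

for block in candidate_blocks:
    name = block.replace(".", "_")  # adapter names must not contain dots
    subset = {k: v for k, v in state_dict.items() if block in k}
    pipe.load_lora_weights(subset, adapter_name=name)
    pipe.set_adapters([name])
    image = pipe("a [v] dog", num_inference_steps=30).images[0]
    image.save(f"{name}.png")
    pipe.delete_adapters(name)  # unload before testing the next block
```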

asomoza avatar Apr 24 '24 15:04 asomoza

@asomoza Nice! We could maybe follow up on some of the experiments/features we covered here: https://huggingface.co/blog/sdxl_lora_advanced_script

I think that could be a cool avenue to explore. Targeting specific blocks using the current script can be made possible quite easily (basically the same as here, but as you said, not restricted), and the same goes for re-training.

linoytsaban avatar Apr 24 '24 17:04 linoytsaban

Thanks for the cool feature @linoytsaban! Did you try anything with a few images or a larger set to see if it scales at all? Or does it require a single image to function?

bghira avatar Apr 28 '24 19:04 bghira

Made a few changes to the README and inference re: making LoRA UNet blocks generally configurable. If you don't mind taking a final look @apolinario @sayakpaul @asomoza, and then 🛳️

linoytsaban avatar Apr 29 '24 16:04 linoytsaban

LGTM! Thanks for eliminating the dependency altogether.

Will merge once the CI is green.

sayakpaul avatar Apr 30 '24 04:04 sayakpaul