
Create train_dreambooth_inpaint.py

[Open] thedarkzeno opened this issue 1 year ago • 8 comments

train_dreambooth.py adapted to work with the inpaint model, generating random masks during training.
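
For reference, a minimal sketch of what random mask generation along those lines might look like (the helper name and the ratio argument are illustrative, not necessarily what the script uses):

import random
import torch

def random_mask(im_shape, ratio=1.0):
    # Draw one random rectangle; 1 marks pixels to inpaint, 0 marks pixels to keep.
    # im_shape is (height, width); ratio caps how large the rectangle can get.
    h, w = im_shape
    mask = torch.zeros((1, h, w))
    mask_h = random.randint(1, int(h * ratio))
    mask_w = random.randint(1, int(w * ratio))
    top = random.randint(0, h - mask_h)
    left = random.randint(0, w - mask_w)
    mask[:, top : top + mask_h, left : left + mask_w] = 1.0
    return mask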

thedarkzeno avatar Nov 01 '22 00:11 thedarkzeno

The documentation is not available anymore as the PR was closed or merged.

Interesting! This would be a cool addition if it works well :-)

patrickvonplaten avatar Nov 02 '22 17:11 patrickvonplaten

Gentle ping here @patil-suraj

patrickvonplaten avatar Nov 16 '22 16:11 patrickvonplaten

Hey guys, I'm thinking of adding the option to create the mask with clipseg instead of just using random masks, what do you think? I believe it could improve training by masking the area of interest. @patil-suraj @patrickvonplaten
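
For concreteness, a mask along those lines could be produced with the CLIPSeg checkpoint on the Hub (a rough sketch; the threshold and checkpoint choice are assumptions, and the low-resolution output would still need resizing to the training resolution):

import torch
from PIL import Image
from transformers import CLIPSegProcessor, CLIPSegForImageSegmentation

processor = CLIPSegProcessor.from_pretrained("CIDAS/clipseg-rd64-refined")
model = CLIPSegForImageSegmentation.from_pretrained("CIDAS/clipseg-rd64-refined")

def clipseg_mask(image: Image.Image, prompt: str, threshold: float = 0.5):
    # Score every pixel against the text prompt, then binarize into a mask.
    inputs = processor(text=[prompt], images=[image], return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # low-resolution relevance heatmap
    return (torch.sigmoid(logits) > threshold).float()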

thedarkzeno avatar Nov 18 '22 19:11 thedarkzeno

Re-ping @patil-suraj

patrickvonplaten avatar Nov 20 '22 19:11 patrickvonplaten

Can you please adapt this to this Colab: https://github.com/TheLastBen/fast-stable-diffusion/blob/main/fast-DreamBooth.ipynb? That Colab trains based only on the names of the images, without class images and all that complicated stuff.

loboere avatar Nov 21 '22 20:11 loboere

Hey @loboere,

Note that this colab is not part of the diffusers repo - could you please leave an issue on https://github.com/TheLastBen/fast-stable-diffusion/ ?

patrickvonplaten avatar Nov 28 '22 12:11 patrickvonplaten

> Note that this colab is not part of the diffusers repo - could you please leave an issue on https://github.com/TheLastBen/fast-stable-diffusion/ ?

@TheLastBen FYI

0xdevalias avatar Nov 28 '22 20:11 0xdevalias

@patil-suraj ping again

patrickvonplaten avatar Dec 01 '22 16:12 patrickvonplaten

@thedarkzeno, @patil-suraj seems too busy to review the PR at the moment.

Let's just go for it :-) However, could you please add a section to https://github.com/huggingface/diffusers/tree/main/examples/dreambooth explaining how to use your script?

patrickvonplaten avatar Dec 01 '22 16:12 patrickvonplaten

Hey @patrickvonplaten, sure. I think I'll just have to make a few adjustments to support stable diffusion v2.

thedarkzeno avatar Dec 02 '22 01:12 thedarkzeno

Awesome let's merge it :-)

@williamberman @patil-suraj it would be great if you could give it a spin :-)

patrickvonplaten avatar Dec 02 '22 17:12 patrickvonplaten

Was just looking, and this doesn't seem to be available at the following:

  • https://github.com/huggingface/diffusers/tree/main/examples/dreambooth

Why not/where did it go?


Edit: Digging into the commit history, I see the following commits and PRs that seem to have touched it:

  • https://github.com/huggingface/diffusers/commits/main/examples/dreambooth/train_dreambooth_inpaint.py
    • https://github.com/huggingface/diffusers/pull/1549
    • https://github.com/huggingface/diffusers/pull/1529
    • https://github.com/huggingface/diffusers/pull/1553

Specifically, it seems that #1553 was the one that moved it, and it now lives at:

  • https://github.com/huggingface/diffusers/tree/main/examples/research_projects/dreambooth_inpaint

0xdevalias avatar Dec 08 '22 23:12 0xdevalias

Can you please create a Colab to test this, and have it work on a T4 GPU?

loboere avatar Dec 09 '22 00:12 loboere

I tried to train in Colab with the same cat toy photos, but the results are a disaster. It seems to corrupt the original inpainting model; I don't know what's wrong.

!accelerate launch train_dreambooth_inpaint.py \
  --pretrained_model_name_or_path="runwayml/stable-diffusion-inpainting" \
  --instance_data_dir="./my_concept" \
  --output_dir="dreambooth-concept" \
  --instance_prompt="skdfklsdlfksd" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=100 \
  --gradient_checkpointing \
  --use_8bit_adam \
  --mixed_precision="fp16" \
  --max_train_steps=600

After training, load the model:

# load the inpainting pipeline after training

from diffusers import StableDiffusionInpaintPipeline
import torch

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "/content/dreambooth-concept",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")
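
To exercise the fine-tuned pipeline, an inpainting call looks roughly like this (the file names here are hypothetical placeholders):

from PIL import Image

init_image = Image.open("cat_toy.png").convert("RGB").resize((512, 512))
mask_image = Image.open("cat_toy_mask.png").convert("RGB").resize((512, 512))

result = pipe(
    prompt="a skdfklsdlfksd photo",
    image=init_image,
    mask_image=mask_image,
).images[0]
result.save("inpainted.png")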

[result image for the prompt "a skdfklsdlfksd photo"]

Also with normal objects:

[result image for the prompt "dog photo"]

loboere avatar Jan 06 '23 19:01 loboere

Hello @loboere, I tried with --use_8bit_adam and got bad results as well, but with different params my results were better.

accelerate launch dreambooth_inpaint.py ^
  --pretrained_model_name_or_path="runwayml/stable-diffusion-inpainting" ^
  --instance_data_dir="./toy_cat" ^
  --output_dir="./dreambooth_ad_inpaint_toy_cat" ^
  --instance_prompt="toy cat" ^
  --resolution=512 ^
  --train_batch_size=1 ^
  --learning_rate=5e-6 ^
  --lr_scheduler="constant" ^
  --lr_warmup_steps=0 ^
  --max_train_steps=1000 ^
  --gradient_accumulation_steps=2 ^
  --gradient_checkpointing ^
  --train_text_encoder

Training with those params, this was my result: [result image]

Maybe something with 8-bit Adam is not working as intended.
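
For context, the optimizer switch that --use_8bit_adam toggles typically looks like this in the diffusers training scripts (a sketch, not the exact code from this script):

import torch

if args.use_8bit_adam:
    import bitsandbytes as bnb  # 8-bit optimizer; a likely suspect if results degrade
    optimizer_cls = bnb.optim.AdamW8bit
else:
    optimizer_cls = torch.optim.AdamW

optimizer = optimizer_cls(unet.parameters(), lr=args.learning_rate)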

thedarkzeno avatar Jan 07 '23 17:01 thedarkzeno

Hey @thedarkzeno, I tried using your script (and the related requirements), but ran into this error related to CrossAttnDownBlock2D:

Traceback (most recent call last):
  File "train_dreambooth_inpaint.py", line 799, in <module>
    main()
  File "train_dreambooth_inpaint.py", line 493, in main
    vae = AutoencoderKL.from_pretrained(args.pretrained_model_name_or_path, subfolder="vae")
  File "/usr/local/lib/python3.8/dist-packages/diffusers/modeling_utils.py", line 483, in from_pretrained
    model = cls.from_config(config, **unused_kwargs)
  File "/usr/local/lib/python3.8/dist-packages/diffusers/configuration_utils.py", line 210, in from_config
    model = cls(**init_dict)
  File "/usr/local/lib/python3.8/dist-packages/diffusers/configuration_utils.py", line 567, in inner_init
    init(self, *args, **init_kwargs)
  File "/usr/local/lib/python3.8/dist-packages/diffusers/models/vae.py", line 539, in __init__
    self.encoder = Encoder(
  File "/usr/local/lib/python3.8/dist-packages/diffusers/models/vae.py", line 94, in __init__
    down_block = get_down_block(
  File "/usr/local/lib/python3.8/dist-packages/diffusers/models/unet_2d_blocks.py", line 83, in get_down_block
    raise ValueError("cross_attention_dim must be specified for CrossAttnDownBlock2D")
ValueError: cross_attention_dim must be specified for CrossAttnDownBlock2D

Did you use a specific release of runwayml/stable-diffusion-inpainting by any chance?

Aldo-Aditiya avatar Jan 10 '23 08:01 Aldo-Aditiya

> Hello @loboere, I tried with --use_8bit_adam and got bad results as well, but with different params my results were better.
>
> accelerate launch dreambooth_inpaint.py ^ --pretrained_model_name_or_path="runwayml/stable-diffusion-inpainting" ^ --instance_data_dir="./toy_cat" ^ --output_dir="./dreambooth_ad_inpaint_toy_cat" ^ --instance_prompt="toy cat" ^ --resolution=512 ^ --train_batch_size=1 ^ --learning_rate=5e-6 ^ --lr_scheduler="constant" ^ --lr_warmup_steps=0 ^ --max_train_steps=1000 ^ --gradient_accumulation_steps=2 ^ --gradient_checkpointing ^ --train_text_encoder
>
> Training with those params, this was my result: [result image]
>
> Maybe something with 8-bit Adam is not working as intended.

@thedarkzeno I was following the same settings you mentioned, but my loss is not decreasing. Any help would be appreciated.

[screenshot: training loss curve]

kunalgoyal9 avatar Feb 05 '23 19:02 kunalgoyal9

Hello @kunalgoyal9, sometimes the loss doesn't decrease but you can still get good results. Did you check the outputs from your model?

thedarkzeno avatar Feb 05 '23 21:02 thedarkzeno

@thedarkzeno Thanks for your reply... The output is also not good. I used four toy_cat images and tested using the prompt "a toy cat sitting on a bench": [result image]

kunalgoyal9 avatar Feb 06 '23 10:02 kunalgoyal9

Can you try using just "toy cat" as the prompt?

thedarkzeno avatar Feb 06 '23 21:02 thedarkzeno

Hello @Aldo-Aditiya

If there is a file named "config.json" under the cloned "stable-diffusion-inpainting" repository, the following error will occur.

ValueError: cross_attention_dim must be specified for CrossAttnDownBlock2D

I removed "config.json", and the error disappeared.

Hope this helps.

dai-ichiro avatar Feb 07 '23 03:02 dai-ichiro

Hey folks! We're trying to encourage the forum for open-ended discussion :) It might be good to make a thread there for future DreamBooth inpainting discussion: https://discuss.huggingface.co/c/discussion-related-to-httpsgithubcomhuggingfacediffusers/63

williamberman avatar Feb 13 '23 01:02 williamberman

> Hello @loboere, I tried with --use_8bit_adam and got bad results as well, but with different params my results were better.
>
> accelerate launch dreambooth_inpaint.py ^ --pretrained_model_name_or_path="runwayml/stable-diffusion-inpainting" ^ --instance_data_dir="./toy_cat" ^ --output_dir="./dreambooth_ad_inpaint_toy_cat" ^ --instance_prompt="toy cat" ^ --resolution=512 ^ --train_batch_size=1 ^ --learning_rate=5e-6 ^ --lr_scheduler="constant" ^ --lr_warmup_steps=0 ^ --max_train_steps=1000 ^ --gradient_accumulation_steps=2 ^ --gradient_checkpointing ^ --train_text_encoder
>
> Training with those params, this was my result: [result image]
>
> Maybe something with 8-bit Adam is not working as intended.

Hi, I have tried following your script and it works. However, I would like to know: if I have many pairs of images and their captions, how can I train on that dataset correctly? You set 'instance_prompt' to just one string value, "toy cat", but I want to train with many prompts, for example "bed", "chair", "sofa", "wardrobe", ... one for each image. Do you have any idea for it?

JANGSOONMYUN avatar Dec 04 '23 09:12 JANGSOONMYUN

@JANGSOONMYUN you have to modify the code to support your data. I suggest you take a look at the text-to-image script here.
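
A minimal sketch of that kind of change (all names here are illustrative, not from the actual script): store each caption in a .txt file next to its image and read it per example, instead of using one fixed instance_prompt:

from pathlib import Path
from PIL import Image
from torch.utils.data import Dataset

class CaptionedInpaintDataset(Dataset):
    # Hypothetical dataset: expects pairs like bed.jpg + bed.txt (containing "bed").
    def __init__(self, data_root, tokenizer, image_transforms):
        self.image_paths = sorted(
            p for p in Path(data_root).iterdir() if p.suffix in {".jpg", ".png"}
        )
        self.tokenizer = tokenizer
        self.image_transforms = image_transforms

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        path = self.image_paths[idx]
        image = Image.open(path).convert("RGB")
        caption = path.with_suffix(".txt").read_text().strip()
        input_ids = self.tokenizer(
            caption,
            truncation=True,
            padding="max_length",
            max_length=self.tokenizer.model_max_length,
            return_tensors="pt",
        ).input_ids[0]
        return {
            "instance_images": self.image_transforms(image),
            "instance_prompt_ids": input_ids,
        }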

thedarkzeno avatar Dec 05 '23 14:12 thedarkzeno

> @JANGSOONMYUN you have to modify the code to support your data. I suggest you take a look at the text-to-image script here.

Ok thank you!

JANGSOONMYUN avatar Dec 06 '23 02:12 JANGSOONMYUN