diffusers
Create train_dreambooth_inpaint.py
An adaptation of train_dreambooth.py that works with the inpainting model and generates random masks during training.
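For readers unfamiliar with the idea, random masking can be sketched as below. This is a minimal, dependency-free illustration and not the code from the PR: the function names, the rectangular shape, and the size range are all assumptions, and the image is a plain nested list rather than a tensor.

```python
import random


def random_rect_mask(height, width, rng=None):
    """Return a height x width grid of 0/1 values with one random rectangle set to 1.

    1 marks pixels to be inpainted, 0 marks pixels to keep.
    (Hypothetical helper; the size range below is an arbitrary choice.)
    """
    rng = rng or random.Random()
    # Rectangle side lengths: between 1 pixel and half of each dimension.
    rect_h = rng.randint(1, max(1, height // 2))
    rect_w = rng.randint(1, max(1, width // 2))
    # Top-left corner chosen so the rectangle stays inside the image.
    top = rng.randint(0, height - rect_h)
    left = rng.randint(0, width - rect_w)
    mask = [[0] * width for _ in range(height)]
    for y in range(top, top + rect_h):
        for x in range(left, left + rect_w):
            mask[y][x] = 1
    return mask


def apply_mask(image, mask, fill=0):
    """Return a copy of image with masked pixels replaced by fill."""
    return [
        [fill if m else px for px, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]
```

Drawing a fresh mask for every training sample means the model sees a different hidden region each time, which is the behaviour the description refers to.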
The documentation is not available anymore as the PR was closed or merged.
Interesting! This would be a cool addition if it works well :-)
Gentle ping here @patil-suraj
Hey guys, I'm thinking of adding an option to create the mask with CLIPSeg instead of using only random masks. What do you think? I believe it could improve training by masking the area of interest. @patil-suraj @patrickvonplaten
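As a rough illustration of that proposal: a segmentation model such as CLIPSeg produces a per-pixel score for a text query, and that score map still has to be turned into a binary mask. The sketch below covers only that last step, on a plain nested list of logits. The function name and the 0.5 threshold are assumptions, and nothing here calls CLIPSeg itself.

```python
import math


def logits_to_mask(logits, threshold=0.5):
    """Convert a 2D grid of per-pixel logits into a 0/1 mask.

    A pixel is masked (1) when sigmoid(logit) is strictly greater than threshold.
    (Hypothetical helper; the default threshold is an arbitrary choice.)
    """

    def sigmoid(x):
        # Numerically stable form that avoids overflow for large |x|.
        if x >= 0:
            return 1.0 / (1.0 + math.exp(-x))
        e = math.exp(x)
        return e / (1.0 + e)

    return [[1 if sigmoid(v) > threshold else 0 for v in row] for row in logits]
```

A lower threshold gives a larger mask around the object of interest, which may or may not help training; that would need to be checked experimentally.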
Reping @patil-suraj
Can you please adapt this to this Colab: https://github.com/TheLastBen/fast-stable-diffusion/blob/main/fast-DreamBooth.ipynb ? This Colab trains based only on the names of the images, without class images and that complicated stuff.
Hey @loboere,
Note that this colab is not part of the diffusers repo - could you please leave an issue on https://github.com/TheLastBen/fast-stable-diffusion/ ?
@TheLastBen FYI
@patil-suraj ping again
@thedarkzeno @patil-suraj seems too busy to review the PR at the moment.
Let's just go for it :-) Could you however please add a section to https://github.com/huggingface/diffusers/tree/main/examples/dreambooth explaining how to use your script?
Hey @patrickvonplaten, sure. I think I'll just have to make a few adjustments to support stable diffusion v2.
Awesome let's merge it :-)
@williamberman @patil-suraj it would be great if you could give it a spin :-)
Was just looking, and this doesn't seem to be available at the following:
- https://github.com/huggingface/diffusers/tree/main/examples/dreambooth
Why not/where did it go?
Edit: Digging into the commit history, I see the following that seem to have touched it:
- https://github.com/huggingface/diffusers/commits/main/examples/dreambooth/train_dreambooth_inpaint.py
- https://github.com/huggingface/diffusers/pull/1549
- https://github.com/huggingface/diffusers/pull/1529
- https://github.com/huggingface/diffusers/pull/1553
Specifically, it seems that #1553 was the one that moved it, and it now lives at:
- https://github.com/huggingface/diffusers/tree/main/examples/research_projects/dreambooth_inpaint
Can you please create a Colab to test this and make it work on a T4 GPU?
I tried training in Colab with the same cat toy photos, but the results are a disaster. It seems to corrupt the original inpainting model, and I don't know what's wrong.
!accelerate launch train_dreambooth_inpaint.py \
  --pretrained_model_name_or_path="runwayml/stable-diffusion-inpainting" \
  --instance_data_dir="./my_concept" \
  --output_dir="dreambooth-concept" \
  --instance_prompt="skdfklsdlfksd" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=100 \
  --gradient_checkpointing \
  --use_8bit_adam \
  --mixed_precision="fp16" \
  --max_train_steps=600
After training, load the model:
# Load the fine-tuned inpainting pipeline from the training output directory
from diffusers import StableDiffusionInpaintPipeline
import torch

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "/content/dreambooth-concept",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")
Prompt: "a skdfklsdlfksd photo"
The same happens with ordinary objects.
Prompt: "dog photo"
Hello @loboere, I tried with --use_8bit_adam and got bad results as well, but with different params my results were better.
accelerate launch dreambooth_inpaint.py ^
  --pretrained_model_name_or_path="runwayml/stable-diffusion-inpainting" ^
  --instance_data_dir="./toy_cat" ^
  --output_dir="./dreambooth_ad_inpaint_toy_cat" ^
  --instance_prompt="toy cat" ^
  --resolution=512 ^
  --train_batch_size=1 ^
  --learning_rate=5e-6 ^
  --lr_scheduler="constant" ^
  --lr_warmup_steps=0 ^
  --max_train_steps=1000 ^
  --gradient_accumulation_steps=2 ^
  --gradient_checkpointing ^
  --train_text_encoder
This was my result after training with those params.
Maybe something with 8bit_adam is not working as intended.
Hey @thedarkzeno, I tried using your script (with the related requirements) but ran into this error related to CrossAttnDownBlock2D:
Traceback (most recent call last):
  File "train_dreambooth_inpaint.py", line 799, in <module>
    main()
  File "train_dreambooth_inpaint.py", line 493, in main
    vae = AutoencoderKL.from_pretrained(args.pretrained_model_name_or_path, subfolder="vae")
  File "/usr/local/lib/python3.8/dist-packages/diffusers/modeling_utils.py", line 483, in from_pretrained
    model = cls.from_config(config, **unused_kwargs)
  File "/usr/local/lib/python3.8/dist-packages/diffusers/configuration_utils.py", line 210, in from_config
    model = cls(**init_dict)
  File "/usr/local/lib/python3.8/dist-packages/diffusers/configuration_utils.py", line 567, in inner_init
    init(self, *args, **init_kwargs)
  File "/usr/local/lib/python3.8/dist-packages/diffusers/models/vae.py", line 539, in __init__
    self.encoder = Encoder(
  File "/usr/local/lib/python3.8/dist-packages/diffusers/models/vae.py", line 94, in __init__
    down_block = get_down_block(
  File "/usr/local/lib/python3.8/dist-packages/diffusers/models/unet_2d_blocks.py", line 83, in get_down_block
    raise ValueError("cross_attention_dim must be specified for CrossAttnDownBlock2D")
ValueError: cross_attention_dim must be specified for CrossAttnDownBlock2D
Did you use a specific release of runwayml/stable-diffusion-inpainting by any chance?
@thedarkzeno I followed the same settings you mentioned, but my loss is not decreasing. Any help would be appreciated.

Hello @kunalgoyal9, sometimes the loss doesn't decrease, but you can still get good results. Did you check the outputs from your model?
@thedarkzeno Thanks for your reply. The output is also not good. I used four toy_cat images and tested with the prompt "a toy cat sitting on a bench".
Can you try using just "toy cat" as the prompt?
Hello @Aldo-Aditiya
If there is a file named "config.json" under the cloned "stable-diffusion-inpainting" repository, the following error will occur.
ValueError: cross_attention_dim must be specified for CrossAttnDownBlock2D
I removed "config.json", and the error disappeared.
Hope this helps.
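For anyone who wants to script that workaround, here is a minimal sketch. It assumes a local clone of the model repository and only renames a top-level config.json instead of deleting it, so the change can be undone. The function name and the ".bak" suffix are hypothetical, and whether a given diffusers version is affected by this file is based only on the report above.

```python
from pathlib import Path


def sideline_root_config(model_dir, backup_suffix=".bak"):
    """Rename a top-level config.json in model_dir so it is no longer picked up.

    Returns the backup path if a file was renamed, otherwise None.
    Per-component configs such as vae/config.json are left untouched.
    (Hypothetical helper, not part of diffusers.)
    """
    root_config = Path(model_dir) / "config.json"
    if not root_config.is_file():
        return None
    backup = root_config.with_name(root_config.name + backup_suffix)
    root_config.rename(backup)
    return backup
```

Calling sideline_root_config("./stable-diffusion-inpainting") before launching training would reproduce the manual step described above, with the original file kept as config.json.bak.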
Hey folks! We're trying to encourage the forum for open ended discussion :) Might be good to make a thread there for future dreambooth inpainting discussion https://discuss.huggingface.co/c/discussion-related-to-httpsgithubcomhuggingfacediffusers/63
Hi, I followed your script and it works. However, if I have many image-caption pairs, how can I train on that dataset correctly? You set 'instance_prompt' to a single string value, "toy_cat", but I want to train with many prompts, for example "bed", "chair", "sofa", "wardrobe", ..., one for each image. Do you have any ideas?
@JANGSOONMYUN You have to modify the code to support your data. I suggest you take a look at the text-to-image script here
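One possible starting point for that modification is to pair each image with its own caption before building the dataset. The sketch below is an assumption-laden illustration, not code from the training script: the helper name, the metadata.jsonl file with "file_name" and "text" keys, and the file-name fallback are all hypothetical choices.

```python
import json
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def load_caption_pairs(data_dir, metadata_name="metadata.jsonl"):
    """Return a sorted list of (image_path, caption) pairs for data_dir.

    Captions are read from a JSON Lines file with the assumed keys
    "file_name" and "text". Images without an entry fall back to their
    file stem with underscores replaced by spaces.
    """
    data_dir = Path(data_dir)
    captions = {}
    metadata = data_dir / metadata_name
    if metadata.is_file():
        for line in metadata.read_text(encoding="utf-8").splitlines():
            if line.strip():
                record = json.loads(line)
                captions[record["file_name"]] = record["text"]
    pairs = []
    for path in sorted(data_dir.iterdir()):
        if path.suffix.lower() in IMAGE_SUFFIXES:
            fallback = path.stem.replace("_", " ")
            pairs.append((path, captions.get(path.name, fallback)))
    return pairs
```

The dataset class would then tokenize the caption of each pair instead of the single instance_prompt; how exactly that fits into the script depends on its current structure.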
Ok thank you!