
SD v2 and Dreambooth, concept not working

Open · antoinerrr opened this issue Nov 30 '22 · 7 comments

Hi, I tried training the v2 512px model on my face. Training completed without issue, but when I use the model with a basic prompt like "portrait of my_concept_keyword", nothing in the output looks like me. Everything works fine with the 1.5 version though. Any tricks to get it working? Thank you.

antoinerrr avatar Nov 30 '22 16:11 antoinerrr
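
A quick way to sanity-check a trained model outside the webui is a minimal diffusers inference run. A sketch, assuming the model was saved in diffusers format; the model directory and the "my_concept_keyword" token are placeholders for whatever was used during training:

```python
# Minimal inference check with diffusers; MODEL_DIR and the prompt token
# are hypothetical placeholders, not the actual paths from this thread.
import torch
from diffusers import StableDiffusionPipeline

MODEL_DIR = "/content/models/my-dreambooth-v2"  # hypothetical output dir
pipe = StableDiffusionPipeline.from_pretrained(MODEL_DIR, torch_dtype=torch.float16)
pipe = pipe.to("cuda")

image = pipe("portrait of my_concept_keyword",
             num_inference_steps=30, guidance_scale=7.5).images[0]
image.save("portrait_test.png")
```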

I had a similar problem. I haven't found a solution yet. It seems that the output does not follow my training images at all.

artificialguybr avatar Nov 30 '22 17:11 artificialguybr

The V2 is completely different from the V1. I'm currently working on finding the right learning rate for the V2, as the default one doesn't seem to be aggressive enough.

TheLastBen avatar Nov 30 '22 18:11 TheLastBen

> The V2 is completely different from the V1. I'm currently working on finding the right learning rate for the V2, as the default one doesn't seem to be aggressive enough.

For Kaliyuga db v2 I use these settings; I don't know if they're good:

!accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --pretrained_vae_name_or_path="stabilityai/sd-vae-ft-mse" \
  --output_dir=$OUTPUT_DIR \
  --revision="fp16" \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --seed=1337 \
  --resolution=512 \
  --train_batch_size=1 \
  --train_text_encoder \
  --mixed_precision="fp16" \
  --use_8bit_adam \
  --gradient_accumulation_steps=1 \
  --gradient_checkpointing \
  --learning_rate=4e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=50 \
  --sample_batch_size=4 \
  --max_train_steps=5000 \
  --save_interval=500 \
  --save_sample_prompt="a very good dog, art by [yourname]" \
  --concepts_list="concepts_list.json"

artificialguybr avatar Nov 30 '22 18:11 artificialguybr
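
For reference, the --concepts_list flag in ShivamShrirao-style forks of train_dreambooth.py (which the flags above appear to come from) expects a JSON list of concept dicts. A sketch of building one; every prompt and path is a placeholder, not a recommendation:

```python
# Hedged sketch of a concepts_list.json as consumed by ShivamShrirao-style
# train_dreambooth.py forks; prompts and directories below are placeholders.
import json

concepts_list = [
    {
        "instance_prompt": "photo of zwx dog",     # rare token + class noun
        "class_prompt": "photo of a dog",          # prior-preservation prompt
        "instance_data_dir": "/content/data/zwx",  # your training images
        "class_data_dir": "/content/data/dog",     # generated class images
    }
]

with open("concepts_list.json", "w") as f:
    json.dump(concepts_list, f, indent=4)
```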

> The V2 is completely different from the V1. I'm currently working on finding the right learning rate for the V2, as the default one doesn't seem to be aggressive enough.

From what users of other repos are reporting, the problem is in the text encoder and not in the learning rate. Some report that it's best to not train the text encoder at all, while others report that the ideal is to train it for 100% of the steps.

I believe one of these two is the solution we are looking for.

artificialguybr avatar Dec 01 '22 02:12 artificialguybr
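
The 0%/100% debate maps to how long the text encoder stays trainable during the run. A rough, self-contained sketch of the idea; the tiny stand-in modules and the empty loop body are hypothetical, not the notebook's actual training code:

```python
# Rough sketch: train the text encoder for only a fraction of the total
# steps, then freeze it so only the UNet keeps learning. The Linear modules
# are stand-ins for the real CLIP text encoder and UNet.
import torch

text_encoder = torch.nn.Linear(8, 8)  # stand-in for the CLIP text encoder
unet = torch.nn.Linear(8, 8)          # stand-in for the UNet

max_train_steps = 1500
text_encoder_ratio = 0.0  # 0.0 = never train it, 1.0 = train it the whole run
stop_step = int(max_train_steps * text_encoder_ratio)

for step in range(max_train_steps):
    if step == stop_step:
        text_encoder.requires_grad_(False)  # freeze; UNet training continues
        text_encoder.eval()
    # ... one Dreambooth forward/backward pass would go here ...
```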

Captioning the images helps a lot; I'm still experimenting.

TheLastBen avatar Dec 01 '22 12:12 TheLastBen
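
One lightweight captioning scheme is a sidecar .txt file per image, falling back to the filename as the caption (some notebooks use the filename convention). A hypothetical helper, not the repo's actual loader:

```python
# Hypothetical helper: pair each training image with a caption taken from a
# sidecar .txt file if present, otherwise derived from the filename itself.
from pathlib import Path

def load_captioned_images(data_dir: str):
    pairs = []
    for img in sorted(Path(data_dir).glob("*.[jp][pn]g")):  # .jpg / .png
        sidecar = img.with_suffix(".txt")
        if sidecar.exists():
            caption = sidecar.read_text().strip()
        else:
            caption = img.stem.replace("_", " ")  # "zwx_portrait" -> "zwx portrait"
        pairs.append((img, caption))
    return pairs
```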

@TheLastBen I just tested with the text encoder at 0% and the model started following the training images. Maybe that really was the problem. In my previous test it was at 15% and was not replicating the subject.

artificialguybr avatar Dec 03 '22 01:12 artificialguybr

I'm adding a new conversion script that will fix the problem.

TheLastBen avatar Dec 03 '22 11:12 TheLastBen