diffusers
Training LoRA fails after latest update
Describe the bug
After updating to the latest main branch, the script to train a LoRA fails. The error is attached below.
Reproduction
accelerate launch train_dreambooth_lora_sdxl_advanced.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --pretrained_vae_model_name_or_path=$VAE_PATH \
  --dataset_name='boby-set' \
  --instance_prompt="photo of a TOK dog" \
  --validation_prompt="a TOK dog in illustration style full body" \
  --output_dir='boby-sdxl-lora-650' \
  --caption_column="prompt" \
  --mixed_precision="bf16" \
  --resolution=1024 \
  --train_batch_size=1 \
  --repeats=1 \
  --gradient_accumulation_steps=1 \
  --gradient_checkpointing \
  --learning_rate=1.0 \
  --text_encoder_lr=1.0 \
  --optimizer="prodigy" \
  --train_text_encoder_ti \
  --train_text_encoder_ti_frac=0.5 \
  --snr_gamma=5.0 \
  --lr_scheduler="costant" \
  --lr_warmup_steps=0 \
  --rank=32 \
  --max_train_steps=650 \
  --checkpointing_steps=2000 \
  --seed="0"
Logs
03/15/2024 19:09:30 - INFO - __main__ - Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda
Mixed precision type: bf16
You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
{'dynamic_thresholding_ratio', 'variance_type', 'thresholding', 'clip_sample_range', 'rescale_betas_zero_snr'} was not found in config. Values will be initialized to default values.
Traceback (most recent call last):
File "/home/pedro/projects/diffusers/examples/advanced_diffusion_training/train_dreambooth_lora_sdxl_advanced.py", line 2366, in <module>
main(args)
File "/home/pedro/projects/diffusers/examples/advanced_diffusion_training/train_dreambooth_lora_sdxl_advanced.py", line 1275, in main
text_encoder_one = text_encoder_cls_one.from_pretrained(
File "/home/pedro/.local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2362, in from_pretrained
model = cls(config, *model_args, **model_kwargs)
TypeError: CLIPTextModel.__init__() got an unexpected keyword argument 'variant'
Traceback (most recent call last):
File "/home/pedro/.local/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/home/pedro/.local/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 47, in main
args.func(args)
File "/home/pedro/.local/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1017, in launch_command
simple_launcher(args)
File "/home/pedro/.local/lib/python3.10/site-packages/accelerate/commands/launch.py", line 637, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'train_dreambooth_lora_sdxl_advanced.py', '--pretrained_model_name_or_path=stabilityai/stable-diffusion-xl-base-1.0', '--pretrained_vae_model_name_or_path=madebyollin/sdxl-vae-fp16-fix', '--dataset_name=boby-set', '--instance_prompt=photo of a TOK dog', '--validation_prompt=a TOK dog in illustration style full body', '--output_dir=boby-sdxl-lora-650', '--caption_column=prompt', '--mixed_precision=bf16', '--resolution=1024', '--train_batch_size=1', '--repeats=1', '--gradient_accumulation_steps=1', '--gradient_checkpointing', '--learning_rate=1.0', '--text_encoder_lr=1.0', '--optimizer=prodigy', '--train_text_encoder_ti', '--train_text_encoder_ti_frac=0.5', '--snr_gamma=5.0', '--lr_scheduler=costant', '--lr_warmup_steps=0', '--rank=32', '--max_train_steps=650', '--checkpointing_steps=2000', '--seed=0']' returned non-zero exit status 1.
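The thread never pins down a root cause. One plausible reading of the first traceback, offered as an assumption and not confirmed anywhere in this issue, is that the loader did not recognize the `variant` keyword and forwarded it to the model constructor, which rejected it. The sketch below reproduces that mechanism with a toy class; `ToyTextModel`, its `from_pretrained`, and `try_load` are hypothetical names and are not the transformers implementation.

```python
class ToyTextModel:
    """Hypothetical stand-in for a model class whose __init__ takes only a config."""

    def __init__(self, config):
        self.config = config

    @classmethod
    def from_pretrained(cls, config, **kwargs):
        # Hypothetical loader: it consumes the keyword arguments it knows
        # about and forwards everything else to __init__.
        known_loader_kwargs = {"revision"}
        for name in known_loader_kwargs:
            kwargs.pop(name, None)
        return cls(config, **kwargs)


def try_load(**kwargs):
    """Return "ok" on success, or the TypeError message on failure."""
    try:
        ToyTextModel.from_pretrained({"hidden_size": 8}, **kwargs)
        return "ok"
    except TypeError as err:
        return str(err)


if __name__ == "__main__":
    print(try_load(revision="main"))  # consumed by the toy loader
    print(try_load(variant="fp16"))   # forwarded to __init__, which rejects it
```

If that reading is right, the failure would depend on which library versions are installed together, not on the training arguments themselves.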
System Info
diffusers main branch, peft main branch.
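Since "main branch" does not identify an exact commit or the versions of the other libraries involved, a short script like the following could be used to record what is actually installed. This is a minimal sketch using only the standard library; the package list is an assumption about what is relevant to this training script, and `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata

# Assumed list of packages relevant to this training script.
PACKAGES = ("diffusers", "transformers", "peft", "accelerate", "torch")


def installed_versions(packages=PACKAGES):
    """Map each package name to its installed version, or None if it is absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Pasting this output into the System Info section makes it easier to tell whether a failure comes from a mismatch between library versions.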
Who can help?
@sayakpaul
Cc: @linoytsaban
Hey @pedropaf! Could you please specify which base model and VAE you're using (i.e. the values of --pretrained_model_name_or_path=$MODEL_NAME and --pretrained_vae_model_name_or_path=$VAE_PATH)?
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Hi @pedropaf,
is this still an issue? If so, can you provide the additional info @linoytsaban asked for?
Thanks!
Sorry @linoytsaban @yiyixuxu, I haven't looked at this in a while. I used the base model and VAE:
export MODEL_NAME="stabilityai/stable-diffusion-xl-base-1.0"
export VAE_PATH="madebyollin/sdxl-vae-fp16-fix"
I can see there have been some updates since I last looked at this, so I'll give it a try and see whether I still get the issue.
I have tried with the latest main branch and I no longer get the error, so I'll close the issue.