
Dreambooth: enabling xformers and set_grads_to_none raises an "unrecognized arguments" error

Open · wjx008 opened this issue 1 year ago • 1 comment

Describe the bug

Using the train_dreambooth.py script, when I add the flags for enabling xformers and set_grads_to_none, the following error occurs: `train_dreambooth.py: error: unrecognized arguments: --enable_xformers_memory_efficient_attention --set_grads_to_none`
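
A quick way to confirm which flags a given copy of the script accepts (a sketch, assuming the standard argparse setup the script uses) is to ask it directly:

```bash
# Print the script's accepted arguments and filter for the two flags;
# grep exits non-zero when neither flag appears in the usage text.
python train_dreambooth.py --help | grep -E "enable_xformers|set_grads_to_none" \
    || echo "flags not defined in this copy of the script"
```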

Reproduction

Followed the instructions in the [dreambooth example readme](https://github.com/ShivamShrirao/diffusers/tree/main/examples/dreambooth):

```bash
pip install git+https://github.com/ShivamShrirao/diffusers.git
pip install -U -r requirements.txt
pip install bitsandbytes
# xformers installed from source:
pip install ninja
pip install -v -U git+https://github.com/facebookresearch/xformers.git@main#egg=xformers
accelerate config
```

Then followed the steps for the 12GB GPU setup, set all the variables, and executed:

```bash
accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of sks dog" \
  --class_prompt="a photo of dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 --gradient_checkpointing \
  --use_8bit_adam \
  --enable_xformers_memory_efficient_attention \
  --set_grads_to_none \
  --learning_rate=2e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=200 \
  --max_train_steps=800
```

Logs

```
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
usage: train_dreambooth.py [-h] --pretrained_model_name_or_path
                           PRETRAINED_MODEL_NAME_OR_PATH
                           [--pretrained_vae_name_or_path PRETRAINED_VAE_NAME_OR_PATH]
                           [--revision REVISION]
                           [--tokenizer_name TOKENIZER_NAME]
                           [--instance_data_dir INSTANCE_DATA_DIR]
                           [--class_data_dir CLASS_DATA_DIR]
                           [--instance_prompt INSTANCE_PROMPT]
                           [--class_prompt CLASS_PROMPT]
                           [--save_sample_prompt SAVE_SAMPLE_PROMPT]
                           [--save_sample_negative_prompt SAVE_SAMPLE_NEGATIVE_PROMPT]
                           [--n_save_sample N_SAVE_SAMPLE]
                           [--save_guidance_scale SAVE_GUIDANCE_SCALE]
                           [--save_infer_steps SAVE_INFER_STEPS] [--pad_tokens]
                           [--with_prior_preservation]
                           [--prior_loss_weight PRIOR_LOSS_WEIGHT]
                           [--num_class_images NUM_CLASS_IMAGES]
                           [--output_dir OUTPUT_DIR] [--seed SEED]
                           [--resolution RESOLUTION] [--center_crop]
                           [--train_text_encoder]
                           [--train_batch_size TRAIN_BATCH_SIZE]
                           [--sample_batch_size SAMPLE_BATCH_SIZE]
                           [--num_train_epochs NUM_TRAIN_EPOCHS]
                           [--max_train_steps MAX_TRAIN_STEPS]
                           [--gradient_accumulation_steps GRADIENT_ACCUMULATION_STEPS]
                           [--gradient_checkpointing]
                           [--learning_rate LEARNING_RATE] [--scale_lr]
                           [--lr_scheduler LR_SCHEDULER]
                           [--lr_warmup_steps LR_WARMUP_STEPS] [--use_8bit_adam]
                           [--adam_beta1 ADAM_BETA1] [--adam_beta2 ADAM_BETA2]
                           [--adam_weight_decay ADAM_WEIGHT_DECAY]
                           [--adam_epsilon ADAM_EPSILON]
                           [--max_grad_norm MAX_GRAD_NORM] [--push_to_hub]
                           [--hub_token HUB_TOKEN] [--hub_model_id HUB_MODEL_ID]
                           [--logging_dir LOGGING_DIR]
                           [--log_interval LOG_INTERVAL]
                           [--save_interval SAVE_INTERVAL]
                           [--save_min_steps SAVE_MIN_STEPS]
                           [--mixed_precision {no,fp16,bf16}]
                           [--not_cache_latents] [--hflip]
                           [--local_rank LOCAL_RANK]
                           [--concepts_list CONCEPTS_LIST]
                           [--read_prompts_from_txts]
train_dreambooth.py: error: unrecognized arguments: --enable_xformers_memory_efficient_attention --set_grads_to_none
╭────────────────────── Traceback (most recent call last) ───────────────────────╮
│ /home/test-gpu/anaconda3/envs/ldmclone/bin/accelerate:8 in <module>            │
│                                                                                │
│   5 from accelerate.commands.accelerate_cli import main                        │
│   6 if __name__ == '__main__':                                                 │
│   7 │   sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])       │
│ ❱ 8 │   sys.exit(main())                                                       │
│   9                                                                            │
│                                                                                │
│ /home/test-gpu/anaconda3/envs/ldmclone/lib/python3.10/site-packages/accelerate │
│ /commands/accelerate_cli.py:45 in main                                         │
│                                                                                │
│   42 │   │   exit(1)                                                           │
│   43 │                                                                         │
│   44 │   # Run                                                                 │
│ ❱ 45 │   args.func(args)                                                       │
│   46                                                                           │
│   47                                                                           │
│   48 if __name__ == "__main__":                                                │
│                                                                                │
│ /home/test-gpu/anaconda3/envs/ldmclone/lib/python3.10/site-packages/accelerate │
│ /commands/launch.py:918 in launch_command                                      │
│                                                                                │
│   915 │   elif defaults is not None and defaults.compute_environment == Comput │
│   916 │   │   sagemaker_launcher(defaults, args)                               │
│   917 │   else:                                                                │
│ ❱ 918 │   │   simple_launcher(args)                                            │
│   919                                                                          │
│   920                                                                          │
│   921 def main():                                                              │
│                                                                                │
│ /home/test-gpu/anaconda3/envs/ldmclone/lib/python3.10/site-packages/accelerate │
│ /commands/launch.py:580 in simple_launcher                                     │
│                                                                                │
│   577 │   process.wait()                                                       │
│   578 │   if process.returncode != 0:                                          │
│   579 │   │   if not args.quiet:                                               │
│ ❱ 580 │   │   │   raise subprocess.CalledProcessError(returncode=process.retur │
│   581 │   │   else:                                                            │
│   582 │   │   │   sys.exit(1)                                                  │
│   583                                                                          │
╰────────────────────────────────────────────────────────────────────────────────╯
CalledProcessError: Command '['/home/test-gpu/anaconda3/envs/ldmclone/bin/python',
'train_dreambooth.py',
'--pretrained_model_name_or_path=CompVis/stable-diffusion-v1-4',
'--instance_data_dir=/home/test-gpu/code/training_data/test_1/data/training',
'--class_data_dir=/home/test-gpu/code/training_data/test_1/data/regularization',
'--output_dir=/home/test-gpu/code/model_checkpoints/test1',
'--with_prior_preservation', '--prior_loss_weight=1.0', '--instance_prompt=a photo
of sks man', '--class_prompt=a photo of man', '--resolution=512',
'--train_batch_size=1', '--gradient_accumulation_steps=1',
'--gradient_checkpointing', '--use_8bit_adam',
'--enable_xformers_memory_efficient_attention', '--set_grads_to_none',
'--learning_rate=2e-6', '--lr_scheduler=constant', '--lr_warmup_steps=0',
'--num_class_images=200', '--max_train_steps=800']' returned non-zero exit status
2.
```

System Info

  • diffusers version: 0.15.0.dev0
  • Platform: Linux-4.15.0-196-generic-x86_64-with-glibc2.27
  • Python version: 3.10.4
  • PyTorch version (GPU?): 1.12.1+cu113 (True)
  • Huggingface_hub version: 0.14.1
  • Transformers version: 4.29.0
  • Accelerate version: 0.19.0
  • xFormers version: 0.0.20+6425fd0.d20230510
  • Using GPU in script?: yes (I assume this was set in `accelerate config`)
  • Using distributed or parallel set-up in script?: no

wjx008 · May 10 '23 19:05

@wjx008 those arguments are not present in this fork of diffusers; they were added in a later version of the original huggingface/diffusers repo. The train_dreambooth.py script in this fork enables xformers automatically with the following code:

```python
# is_xformers_available comes from diffusers' utility module
from diffusers.utils.import_utils import is_xformers_available

if is_xformers_available():
    pipeline.enable_xformers_memory_efficient_attention()
```
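
So for this fork the fix is simply to drop both flags from the launch command; xformers is picked up automatically when it is installed. `--set_grads_to_none` has no fork-side equivalent, but upstream the flag just forwards to PyTorch's standard optimizer API, so the same effect could be patched into the training loop by hand. A minimal standalone sketch (not this fork's code) of what the flag does:

```python
import torch

# Sketch of what upstream's --set_grads_to_none forwards to:
# zero_grad(set_to_none=True) drops gradient tensors entirely instead of
# zero-filling them, saving a little memory per step.
model = torch.nn.Linear(4, 4)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-6)

loss = model(torch.randn(1, 4)).sum()
loss.backward()
optimizer.step()
optimizer.zero_grad(set_to_none=True)  # grads are now None, not zero tensors
```

Upstream keeps this opt-in because a `None` gradient behaves slightly differently from a zero-filled tensor in some code paths.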

jag-ermeister · Aug 18 '23 00:08