
train_text_to_image_lora.py fails when enable_xformers_memory_efficient_attention is True

Open FBehrad opened this issue 1 year ago • 3 comments

Describe the bug

Hello, when I fine-tune the model using this code with enable_xformers_memory_efficient_attention enabled, I hit the following error:

    accelerator.backward(loss)
  File "C:\Users\Environments\StableDiffusion\lib\site-packages\accelerate\accelerator.py", line 1316, in backward
    loss.backward(**kwargs)
  File "C:\Users\Environments\StableDiffusion\lib\site-packages\torch\_tensor.py", line 488, in backward
    torch.autograd.backward(
  File "C:\Users\Environments\StableDiffusion\lib\site-packages\torch\autograd\__init__.py", line 197, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Has anyone fine-tuned Stable Diffusion with both LoRA and xFormers?
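
If it helps with debugging: the error means the loss tensor never got connected to any parameter with requires_grad=True, so there is nothing for autograd to differentiate. A quick check that can be dropped in right before accelerator.backward(loss) (a minimal sketch; lora_layers is the name the example script uses for the LoRA weights as far as I remember, so adjust it if your copy differs):

```python
# Sanity check just before accelerator.backward(loss).
# Assumption: the LoRA weights live in `lora_layers`, as in the diffusers
# example script; rename if your version differs.
trainable = [p for p in lora_layers.parameters() if p.requires_grad]
print("trainable LoRA params:", len(trainable))

# If loss.grad_fn is None, no trainable parameter contributed to the loss and
# backward() raises exactly this RuntimeError.
print("loss.requires_grad:", loss.requires_grad, "loss.grad_fn:", loss.grad_fn)
```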

Reproduction

I only ran this code while enable_xformers_memory_efficient_attention was True.
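
For context, as far as I can tell the flag simply calls enable_xformers_memory_efficient_attention() on the UNet before training starts. A minimal standalone sketch of that call (the model id below is only a placeholder):

```python
# Standalone sketch: load an SD UNet and switch it to xformers attention.
# The model id is just a placeholder; any checkpoint with a "unet" subfolder works.
import torch
from diffusers import UNet2DConditionModel
from diffusers.utils import is_xformers_available

unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet", torch_dtype=torch.float16
).to("cuda")

if is_xformers_available():
    unet.enable_xformers_memory_efficient_attention()
else:
    print("xformers is not importable in this environment")
```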

Logs

No response

System Info

  • diffusers version: 0.12.1

  • Platform: Windows-10-10.0.19045-SP0
  • Python version: 3.8.10
  • PyTorch version (GPU?): 1.13.1+cu116 (True)
  • Huggingface_hub version: 0.12.0
  • Transformers version: 0.15.0
  • Accelerate version: not installed
  • xFormers version: not installed
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

xFormers 0.0.17.dev449
memory_efficient_attention.cutlassF: available
memory_efficient_attention.cutlassB: available
memory_efficient_attention.flshattF: available
memory_efficient_attention.flshattB: available
memory_efficient_attention.smallkF: available
memory_efficient_attention.smallkB: available
memory_efficient_attention.tritonflashattF: unavailable
memory_efficient_attention.tritonflashattB: unavailable
swiglu.fused.p.cpp: available
is_triton_available: False
is_functorch_available: False
pytorch.version: 1.13.1+cu116
pytorch.cuda: available
gpu.compute_capability: 8.6
gpu.name: NVIDIA GeForce RTX 3090
build.info: available
build.cuda_version: 1107
build.python_version: 3.8.10
build.torch_version: 1.13.1+cu117
build.env.TORCH_CUDA_ARCH_LIST: 5.0+PTX 6.0 6.1 7.0 7.5 8.0 8.6
build.env.XFORMERS_BUILD_TYPE: Release
build.env.XFORMERS_ENABLE_DEBUG_ASSERTIONS: None
build.env.NVCC_FLAGS: None
build.env.XFORMERS_PACKAGE_FROM: wheel-main
source.privacy: open source

FBehrad avatar Feb 19 '23 12:02 FBehrad

Could you install `xformers` and `accelerate` and see if that resolves the issue? Running the efficient attention requires you to have `xformers` installed.

sayakpaul avatar Feb 20 '23 07:02 sayakpaul

> Could you install `xformers` and `accelerate` and see if that resolves the issue? Running the efficient attention requires you to have `xformers` installed.

Yes, during inference xformers accelerates the process significantly. The problem I mentioned only appears during fine-tuning.
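
For example, roughly how it gets enabled for inference (a minimal sketch; the model id and prompt are just placeholders):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# This works fine for inference; the failure only shows up when training with LoRA.
pipe.enable_xformers_memory_efficient_attention()

image = pipe("a photo of an astronaut riding a horse").images[0]
```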

FBehrad avatar Feb 20 '23 08:02 FBehrad

No, I meant do you have xformers and accelerate installed? Your system info shows:

[screenshot of the System Info above, showing "Accelerate version: not installed" and "xFormers version: not installed"]
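
A quick way to double-check from the environment the script actually runs in:

```python
# If either import fails, the training environment really is missing the package,
# regardless of what is installed elsewhere on the machine.
import accelerate
import xformers

print("accelerate:", accelerate.__version__)
print("xformers:", xformers.__version__)
```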

Cc: @patrickvonplaten

sayakpaul avatar Feb 20 '23 08:02 sayakpaul

This is similar to https://github.com/huggingface/diffusers/issues/2459; it should be resolved by the new PR https://github.com/huggingface/diffusers/pull/2464.

haofanwang avatar Feb 22 '23 18:02 haofanwang

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Mar 21 '23 15:03 github-actions[bot]