train_text_to_image_lora.py fails when enable_xformers_memory_efficient_attention is True
Describe the bug
Hello, when I fine-tune the model with this script and enable enable_xformers_memory_efficient_attention, I hit the following error:
```
accelerator.backward(loss)
  File "C:\Users\Environments\StableDiffusion\lib\site-packages\accelerate\accelerator.py", line 1316, in backward
    loss.backward(**kwargs)
  File "C:\Users\Environments\StableDiffusion\lib\site-packages\torch\_tensor.py", line 488, in backward
    torch.autograd.backward(
  File "C:\Users\Environments\StableDiffusion\lib\site-packages\torch\autograd\__init__.py", line 197, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
```
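This particular RuntimeError is raised whenever `backward()` is called on a loss whose graph contains no tensor with `requires_grad=True` (for example, if the xFormers attention path detaches the computation, or the LoRA parameters were never marked trainable). A minimal sketch, outside the training script, that reproduces the same error and shows the generic fix:

```python
import torch

# No parameter in the graph requires grad, so the loss has no grad_fn
# and backward() raises the same RuntimeError seen in the traceback.
w = torch.randn(4, 4, requires_grad=False)
x = torch.randn(4)
loss = (w @ x).sum()
try:
    loss.backward()
except RuntimeError as e:
    print(e)  # element 0 of tensors does not require grad ...

# Generic fix: make sure the parameters you intend to train (in a LoRA
# setup, the LoRA weights) actually require grad before calling backward().
w.requires_grad_(True)
loss = (w @ x).sum()
loss.backward()
print(w.grad is not None)
```

This does not pinpoint where the graph gets detached in the script, but it is the condition the error message describes.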
Has anyone fine-tuned Stable Diffusion with both LoRA and xFormers?
Reproduction
The error only occurs when I run this script with enable_xformers_memory_efficient_attention set to True.
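For context, a typical invocation of the example script with that flag enabled looks something like the following (model path, dataset, and output directory are placeholders; check the script's `--help` for the exact argument names in your diffusers version):

```shell
accelerate launch train_text_to_image_lora.py \
  --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
  --dataset_name="lambdalabs/pokemon-blip-captions" \
  --enable_xformers_memory_efficient_attention \
  --output_dir="sd-lora-out"
```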
Logs
No response
System Info
- diffusers version: 0.12.1
- Platform: Windows-10-10.0.19045-SP0
- Python version: 3.8.10
- PyTorch version (GPU?): 1.13.1+cu116 (True)
- Huggingface_hub version: 0.12.0
- Transformers version: 0.15.0
- Accelerate version: not installed
- xFormers version: not installed
- Using GPU in script?:
- Using distributed or parallel set-up in script?:
xFormers 0.0.17.dev449
- memory_efficient_attention.cutlassF: available
- memory_efficient_attention.cutlassB: available
- memory_efficient_attention.flshattF: available
- memory_efficient_attention.flshattB: available
- memory_efficient_attention.smallkF: available
- memory_efficient_attention.smallkB: available
- memory_efficient_attention.tritonflashattF: unavailable
- memory_efficient_attention.tritonflashattB: unavailable
- swiglu.fused.p.cpp: available
- is_triton_available: False
- is_functorch_available: False
- pytorch.version: 1.13.1+cu116
- pytorch.cuda: available
- gpu.compute_capability: 8.6
- gpu.name: NVIDIA GeForce RTX 3090
- build.info: available
- build.cuda_version: 1107
- build.python_version: 3.8.10
- build.torch_version: 1.13.1+cu117
- build.env.TORCH_CUDA_ARCH_LIST: 5.0+PTX 6.0 6.1 7.0 7.5 8.0 8.6
- build.env.XFORMERS_BUILD_TYPE: Release
- build.env.XFORMERS_ENABLE_DEBUG_ASSERTIONS: None
- build.env.NVCC_FLAGS: None
- build.env.XFORMERS_PACKAGE_FROM: wheel-main
- source.privacy: open source
Could you install `xformers` and `accelerate` and see if that resolves the issue? Running the efficient attention requires you to have `xformers` installed.
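One quick way to confirm whether both packages are visible to the Python environment the script runs in (a generic check, not specific to diffusers):

```python
import importlib.util

# Report whether each package the training path depends on is importable
# in the current environment; if missing, `pip install <name>` it.
for pkg in ("xformers", "accelerate"):
    found = importlib.util.find_spec(pkg) is not None
    print(f"{pkg}: {'installed' if found else 'NOT installed'}")
```

Note that the system-info dump above was produced while it reported both Accelerate and xFormers as "not installed", which is what prompted this question.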
Yes, during inference xformers accelerates the process significantly. The problem I mentioned emerges during fine-tuning.
No, I meant: do you have `xformers` and `accelerate` installed? Your system info shows:
![image](https://user-images.githubusercontent.com/22957388/220048106-64ce56fa-38b5-488a-8bfd-1ac8b1138f11.png)
Cc: @patrickvonplaten
It is similar to https://github.com/huggingface/diffusers/issues/2459; it should be fixed by the new PR https://github.com/huggingface/diffusers/pull/2464.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.