
No module named 'xformers' on AMD rx7800XT [fedora40]

Open TomTheDragon opened this issue 7 months ago • 0 comments

Hey there, I have a little problem and I am wondering whether something is missing in my settings or whether something is wrong with the dependencies.

My GPU is detected fine when I start the UI:

15:45:13-954607 INFO     Kohya_ss GUI version: v24.1.4                                                                                   
15:45:13-986823 INFO     Submodule initialized and updated.                                                                              
15:45:13-987814 INFO     AMD toolkit detected                                                                                            
15:45:16-129105 INFO     Torch 2.3.0+rocm6.0                                                                                             
15:45:16-130098 INFO     Torch backend: AMD ROCm HIP 6.0.32830-d62f6a171                                                                 
15:45:16-131366 INFO     Torch detected GPU: AMD Radeon RX 7800 XT VRAM 16368 Arch (11, 0) Cores 30                                      
15:45:16-133005 INFO     Python version is 3.10.14 (main, Apr 17 2024, 00:00:00) [GCC 14.0.1 20240411 (Red Hat 14.0.1-0)]                
15:45:16-133986 INFO     Verifying modules installation status from /home/tom/source/kohya_ss/requirements_linux_rocm.txt...             
15:45:16-135837 INFO     Verifying modules installation status from requirements.txt...

But every time I try training a LoRA, I get the following error:

INFO     UNet2DConditionModel: 64, 8, 768, False, False                                         original_unet.py:1387
2024-07-24 15:22:46 INFO     loading u-net: <All keys matched successfully>                                            model_util.py:1009
2024-07-24 15:22:47 INFO     loading vae: <All keys matched successfully>                                              model_util.py:1017
2024-07-24 15:22:48 INFO     loading text encoder: <All keys matched successfully>                                     model_util.py:1074
                    INFO     Enable xformers for U-Net                                                                 train_util.py:2660
Traceback (most recent call last):
  File "/home/tom/source/kohya_ss/sd-scripts/library/train_util.py", line 2662, in replace_unet_modules
    import xformers.ops
ModuleNotFoundError: No module named 'xformers'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/tom/source/kohya_ss/sd-scripts/train_network.py", line 1115, in <module>
    trainer.train(args)
  File "/home/tom/source/kohya_ss/sd-scripts/train_network.py", line 240, in train
    train_util.replace_unet_modules(unet, args.mem_eff_attn, args.xformers, args.sdpa)
  File "/home/tom/source/kohya_ss/sd-scripts/library/train_util.py", line 2664, in replace_unet_modules
    raise ImportError("No xformers / xformersがインストールされていないようです")
ImportError: No xformers / xformersがインストールされていないようです
Traceback (most recent call last):
  File "/home/tom/source/kohya_ss/venv/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/home/tom/source/kohya_ss/venv/lib64/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 47, in main
    args.func(args)
  File "/home/tom/source/kohya_ss/venv/lib64/python3.10/site-packages/accelerate/commands/launch.py", line 1017, in launch_command
    simple_launcher(args)
  File "/home/tom/source/kohya_ss/venv/lib64/python3.10/site-packages/accelerate/commands/launch.py", line 637, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/tom/source/kohya_ss/venv/bin/python', '/home/tom/source/kohya_ss/sd-scripts/train_network.py', '--config_file', '/home/tom/Nextcloud/lora_projekte/zorgoias/model/config_lora-20240724-152232.toml']' returned non-zero exit status 1.
15:22:50-575202 INFO     Training has ended.

As far as I know, it should not even try to use xformers on AMD GPUs, and I run everything with the "--use-rocm" switch. I already tried changing the ROCm version to 5.7, but nothing changed. I also have a Stable Diffusion installation on the same machine that uses ROCm 5.7 and runs really well, so I don't think ROCm itself is the problem.
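
For anyone reproducing this, here is a minimal sketch of how to confirm from inside the venv that xformers really is not importable, and what a fallback could look like. The helper names `module_available` and `choose_attention` are made up for illustration and are not part of kohya_ss or sd-scripts. The "sdpa" fallback is an assumption based only on the `args.sdpa` argument visible in the traceback above.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` can be found in this environment."""
    return importlib.util.find_spec(name) is not None


def choose_attention(xformers_requested: bool, available=module_available) -> str:
    """Hypothetical helper (NOT from sd-scripts): pick an attention backend.

    Uses xformers only when it was requested and is importable; otherwise
    falls back to "sdpa" (assumed alternative, inferred from `args.sdpa`).
    """
    if xformers_requested and available("xformers"):
        return "xformers"
    return "sdpa"


if __name__ == "__main__":
    # Run with the venv's interpreter, e.g. /home/tom/source/kohya_ss/venv/bin/python
    print("xformers importable:", module_available("xformers"))
    print("attention backend:", choose_attention(xformers_requested=True))
```

If the first line prints `False`, the training config is asking for xformers even though the package is not installed in the venv, which would match the ImportError in the log.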

TomTheDragon, Jul 24 '24 13:07