stable-diffusion-webui
[Bug]: Segmentation fault (core dumped) with docker pytorch and gfx803
Checklist
- [X] The issue exists after disabling all extensions
- [X] The issue exists on a clean installation of webui
- [X] The issue is caused by an extension, but I believe it is caused by a bug in the webui
- [X] The issue exists in the current version of the webui
- [X] The issue has not been reported before recently
- [X] The issue has been reported before but has not been fixed yet
What happened?
I heard that gfx803 should work with the AMD Docker images, so I followed these instructions: https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Install-and-Run-on-AMD-GPUs#install-on-amd-and-arch-linux. I fixed all the errors that came up during installation, but now I get an unexplained "Segmentation fault (core dumped)".
Steps to reproduce the problem
- Follow the instructions at https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Install-and-Run-on-AMD-GPUs#install-on-amd-and-arch-linux. The only difference is that I tried launching with ./webui.sh and added the command-line arguments to webui-user.sh.
- export HSA_OVERRIDE_GFX_VERSION=10.3.0
- I had to work around an error using this fix: https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/11458#issuecomment-1611586137
- Segmentation fault
What should have happened?
It should have launched. All the drivers are working, and rocminfo reports gfx803 both inside and outside the Docker image:
rocminfo | grep gfx
Name: gfx803
Name: amdgcn-amd-amdhsa--gfx803
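For anyone who wants to script the same check, here is a minimal Python sketch that pulls the gfx target names out of rocminfo-style text. The function name `parse_gfx_targets` and the sample string are made up for illustration. The sample mirrors the two `Name:` lines quoted above and is not full rocminfo output.

```python
import re


def parse_gfx_targets(rocminfo_output: str) -> list[str]:
    """Return the distinct gfx targets named on 'Name: gfx...' lines, in order."""
    # Matches lines such as "  Name:   gfx803", but not the ISA line
    # "Name: amdgcn-amd-amdhsa--gfx803", which does not start with "gfx".
    pattern = re.compile(r"^\s*Name:\s+(gfx[0-9a-f]+)\s*$", re.MULTILINE)
    targets = []
    for match in pattern.finditer(rocminfo_output):
        if match.group(1) not in targets:
            targets.append(match.group(1))
    return targets


# Hypothetical sample text, modelled on the grep output in this report.
sample = "  Name: gfx803\n  Name: amdgcn-amd-amdhsa--gfx803\n"
print(parse_gfx_targets(sample))
```

On a real system the text could come from `subprocess.run(["rocminfo"], capture_output=True, text=True).stdout`, assuming rocminfo is on the PATH.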
What browsers do you use to access the UI ?
No response
Sysinfo
The host runs Gentoo on a Ryzen 5600X with an RX 570 GPU and 16 GB of RAM.
Console logs
At first I got this:
RuntimeError: HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing HIP_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.
Stable diffusion model failed to load
Applying attention optimization: Doggettx... done.
rocBLAS error: Cannot read /opt/rocm/lib/rocblas/library/TensileLibrary.dat: No such file or directory for GPU arch : gfx803
List of available TensileLibrary Files :
"/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx941.dat"
"/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx90a.dat"
"/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1030.dat"
"/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1102.dat"
"/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx940.dat"
"/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx906.dat"
"/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1101.dat"
"/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1100.dat"
"/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx942.dat"
"/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx908.dat"
"/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx900.dat"
Aborted (core dumped)
but I got past it with export HSA_OVERRIDE_GFX_VERSION=10.3.0
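The rocBLAS message lists which architectures the image ships kernels for. A small sketch, assuming only the file-name pattern visible in that log (`TensileLibrary_lazy_<arch>.dat`), shows that gfx803 is absent while gfx1030 is present. That would explain why the 10.3.0 override silences this particular error. It does not mean the gfx1030 kernels can actually run on a gfx803 card, and that mismatch may well be related to the later segfault. The helper name `tensile_archs` is hypothetical.

```python
import re


def tensile_archs(paths):
    """Collect the gfx architecture embedded in each TensileLibrary file name."""
    pattern = re.compile(r"TensileLibrary_lazy_(gfx[0-9a-f]+)\.dat$")
    found = set()
    for path in paths:
        match = pattern.search(path)
        if match:
            found.add(match.group(1))
    return found


# File names copied from the rocBLAS error message in this report.
listed = [
    "/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx941.dat",
    "/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx90a.dat",
    "/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1030.dat",
    "/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1102.dat",
    "/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx940.dat",
    "/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx906.dat",
    "/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1101.dat",
    "/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx1100.dat",
    "/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx942.dat",
    "/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx908.dat",
    "/opt/rocm/lib/rocblas/library/TensileLibrary_lazy_gfx900.dat",
]

archs = tensile_archs(listed)
print("gfx803 shipped:", "gfx803" in archs)
print("gfx1030 shipped:", "gfx1030" in archs)
```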
After that:
REQS_FILE='requirements.txt' python launch.py --precision full --no-half --skip-torch-cuda-test
Python 3.9.18 (main, Sep 11 2023, 13:41:44)
[GCC 11.2.0]
Version: v1.7.0
Commit hash: cf2772fab0af5573da775e7437e6acdca424f26e
Launching Web UI with arguments: --precision full --no-half --skip-torch-cuda-test
Segmentation fault (core dumped)
root@gentoo:/dockerx/stable-diffusion-webui# REQS_FILE='requirements.txt' python launch.py --precision full --no-half --skip-torch-cuda-test
Python 3.9.18 (main, Sep 11 2023, 13:41:44)
[GCC 11.2.0]
Version: v1.7.0
Commit hash: cf2772fab0af5573da775e7437e6acdca424f26e
Launching Web UI with arguments: --precision full --no-half --skip-torch-cuda-test
Segmentation fault (core dumped)
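The log stops at "Segmentation fault" with no traceback, so it is hard to tell which import or call crashed. Python's standard `faulthandler` module can dump the Python-level stack when the process receives SIGSEGV. A minimal sketch follows. Putting it at the very top of launch.py is an assumption about where it would be most useful, not something the webui documents.

```python
import faulthandler
import sys

# Print the Python traceback of every thread to stderr if the interpreter
# crashes with SIGSEGV, SIGFPE, SIGABRT, SIGBUS or SIGILL.
faulthandler.enable(file=sys.stderr, all_threads=True)

print("faulthandler enabled:", faulthandler.is_enabled())
```

The same effect is available without editing any file, by running `python -X faulthandler launch.py ...` or setting `PYTHONFAULTHANDLER=1` in the environment.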
Additional information
No response
I've got the same issue in a different context.
HSA_OVERRIDE_GFX_VERSION=10.3.0 works well for Navi (RX 5000 and 6000 series), but your GPU is older. Try HSA_OVERRIDE_GFX_VERSION=9.0.0.
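To see how these override values relate to the gfx names in the log, here is a rough sketch. It assumes the usual naming convention: the last two characters of a gfx target are the minor version and the stepping, each a single hex digit, and everything before them is the major version. The function name `gfx_to_override` is made up. The output says only which target a value corresponds to, not whether that override will work on a given card.

```python
def gfx_to_override(target: str) -> str:
    """Convert a target such as 'gfx1030' into 'major.minor.stepping' form."""
    if not target.startswith("gfx") or len(target) < 6:
        raise ValueError(f"unexpected target name: {target!r}")
    digits = target[3:]
    major = int(digits[:-2])         # e.g. "10" in gfx1030
    minor = int(digits[-2], 16)      # single hex digit
    stepping = int(digits[-1], 16)   # single hex digit, e.g. "a" -> 10
    return f"{major}.{minor}.{stepping}"


for name in ("gfx1030", "gfx900", "gfx803"):
    print(name, "->", gfx_to_override(name))
```

Under that assumption, 10.3.0 and 9.0.0 correspond to gfx1030 and gfx900, both of which appear in the TensileLibrary list from the original report, while the card's own gfx803 would be 8.0.3, which has no library file in that list.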