
[Bug]: AMD + Pytorch 2.0 -> miopen::Exception

Open Chilluminati91 opened this issue 1 year ago • 13 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues and checked the recent builds/commits

What happened?

When using PyTorch 2.0 and generating an image, Automatic1111 crashes immediately. The same issue occurs on other forks.

i5-12600K, 32 GB DDR4, AMD RX 6650 XT

Error output:

MIOpen(HIP): Error [Compile] 'hiprtcCompileProgram(prog.get(), c_options.size(), c_options.data())' convolution_forward_implicit_gemm_v6r1_dlops_nchw_kcyx_nkhw.cpp: HIPRTC_ERROR_COMPILATION (6)
MIOpen(HIP): Error [BuildHip] HIPRTC status = HIPRTC_ERROR_COMPILATION (6), source file: convolution_forward_implicit_gemm_v6r1_dlops_nchw_kcyx_nkhw.cpp
MIOpen(HIP): Warning [BuildHip] In file included from /tmp/comgr-39ed63/input/convolution_forward_implicit_gemm_v6r1_dlops_nchw_kcyx_nkhw.cpp:1:
In file included from /tmp/comgr-39ed63/include/common_header.hpp:10:
/tmp/comgr-39ed63/include/data_type.hpp:14:10: fatal error: 'limits' file not found
#include <limits> // std::numeric_limits
         ^~~~~~~~
1 error generated when compiling for gfx1030.
terminate called after throwing an instance of 'miopen::Exception'
  what():  /long_pathname_so_that_rpms_can_package_the_debug_info/data/driver/MLOpen/src/hipoc/hipoc_program.cpp:304: Code object build failed. Source: convolution_forward_implicit_gemm_v6r1_dlops_nchw_kcyx_nkhw.cpp
Aborted (core dumped)

Steps to reproduce the problem

  1. Manually edit webui-user.sh to upgrade to Torch 2.0 (see the sketch after this list)
  2. Delete venv and let it rebuild
  3. Run SD, generate an image, and get the error above
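
Concretely, step 1 amounts to setting TORCH_COMMAND in webui-user.sh, and step 2 to removing the venv (a sketch; the exact pip line is the one from the Command Line Arguments section below):

export TORCH_COMMAND="pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.4.2"
rm -rf venv    # force launch.py to rebuild the virtualenv on next start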

What should have happened?

Proper image generation

Commit where the problem happens

22bcc7be428c94e9408f589966c2040187245d81

What platforms do you use to access the UI?

Linux

What browsers do you use to access the UI?

Mozilla Firefox

Command Line Arguments

export COMMANDLINE_ARGS="--autolaunch --opt-sdp-attention"
# install the ROCm 5.4.2 build of PyTorch 2.0
export TORCH_COMMAND="pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.4.2"
# present the RX 6650 XT (gfx1032) to ROCm as gfx1030
export HSA_OVERRIDE_GFX_VERSION=10.3.0
# tune the HIP caching allocator to reduce VRAM fragmentation
export PYTORCH_HIP_ALLOC_CONF=garbage_collection_threshold:0.6,max_split_size_mb:128

List of extensions

None

Console logs

liam@liam-ubuntu:~/stable-diffusion-webui$ bash webui.sh

################################################################
Install script for stable-diffusion + Web UI
Tested on Debian 11 (Bullseye)
################################################################

################################################################
Running on liam user
################################################################

################################################################
Repo already cloned, using it as install directory
################################################################

################################################################
Create and activate python venv
################################################################

################################################################
Launching launch.py...
################################################################
Python 3.10.6 (main, Mar 10 2023, 10:55:28) [GCC 11.3.0]
Commit hash: 22bcc7be428c94e9408f589966c2040187245d81
Installing torch and torchvision
Looking in indexes: https://download.pytorch.org/whl/rocm5.4.2
Collecting torch
  Using cached https://download.pytorch.org/whl/rocm5.4.2/torch-2.0.0%2Brocm5.4.2-cp310-cp310-linux_x86_64.whl (1536.4 MB)
Collecting torchvision
  Using cached https://download.pytorch.org/whl/rocm5.4.2/torchvision-0.15.1%2Brocm5.4.2-cp310-cp310-linux_x86_64.whl (62.4 MB)
Collecting torchaudio
  Using cached https://download.pytorch.org/whl/rocm5.4.2/torchaudio-2.0.1%2Brocm5.4.2-cp310-cp310-linux_x86_64.whl (4.1 MB)
Collecting pytorch-triton-rocm<2.1,>=2.0.0
  Using cached https://download.pytorch.org/whl/pytorch_triton_rocm-2.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (78.4 MB)
Collecting jinja2
  Using cached https://download.pytorch.org/whl/Jinja2-3.1.2-py3-none-any.whl (133 kB)
Collecting networkx
  Using cached https://download.pytorch.org/whl/networkx-3.0-py3-none-any.whl (2.0 MB)
Collecting sympy
  Using cached https://download.pytorch.org/whl/sympy-1.11.1-py3-none-any.whl (6.5 MB)
Collecting typing-extensions
  Using cached https://download.pytorch.org/whl/typing_extensions-4.4.0-py3-none-any.whl (26 kB)
Collecting filelock
  Using cached https://download.pytorch.org/whl/filelock-3.9.0-py3-none-any.whl (9.7 kB)
Collecting pillow!=8.3.*,>=5.3.0
  Using cached https://download.pytorch.org/whl/Pillow-9.3.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.2 MB)
Collecting numpy
  Using cached https://download.pytorch.org/whl/numpy-1.24.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.3 MB)
Collecting requests
  Using cached https://download.pytorch.org/whl/requests-2.28.1-py3-none-any.whl (62 kB)
Collecting cmake
  Using cached https://download.pytorch.org/whl/cmake-3.25.0-py2.py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (23.7 MB)
Collecting lit
  Using cached lit-15.0.7-py3-none-any.whl
Collecting MarkupSafe>=2.0
  Using cached https://download.pytorch.org/whl/MarkupSafe-2.1.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (25 kB)
Collecting idna<4,>=2.5
  Using cached https://download.pytorch.org/whl/idna-3.4-py3-none-any.whl (61 kB)
Collecting charset-normalizer<3,>=2
  Using cached https://download.pytorch.org/whl/charset_normalizer-2.1.1-py3-none-any.whl (39 kB)
Collecting certifi>=2017.4.17
  Using cached https://download.pytorch.org/whl/certifi-2022.12.7-py3-none-any.whl (155 kB)
Collecting urllib3<1.27,>=1.21.1
  Using cached https://download.pytorch.org/whl/urllib3-1.26.13-py2.py3-none-any.whl (140 kB)
Collecting mpmath>=0.19
  Using cached https://download.pytorch.org/whl/mpmath-1.2.1-py3-none-any.whl (532 kB)
Installing collected packages: mpmath, lit, cmake, urllib3, typing-extensions, sympy, pillow, numpy, networkx, MarkupSafe, idna, filelock, charset-normalizer, certifi, requests, jinja2, pytorch-triton-rocm, torch, torchvision, torchaudio
Successfully installed MarkupSafe-2.1.2 certifi-2022.12.7 charset-normalizer-2.1.1 cmake-3.25.0 filelock-3.9.0 idna-3.4 jinja2-3.1.2 lit-15.0.7 mpmath-1.2.1 networkx-3.0 numpy-1.24.1 pillow-9.3.0 pytorch-triton-rocm-2.0.1 requests-2.28.1 sympy-1.11.1 torch-2.0.0+rocm5.4.2 torchaudio-2.0.1+rocm5.4.2 torchvision-0.15.1+rocm5.4.2 typing-extensions-4.4.0 urllib3-1.26.13
Installing gfpgan
Installing clip
Installing open_clip
Cloning Stable Diffusion into /home/liam/stable-diffusion-webui/repositories/stable-diffusion-stability-ai...
Cloning Taming Transformers into /home/liam/stable-diffusion-webui/repositories/taming-transformers...
Cloning K-diffusion into /home/liam/stable-diffusion-webui/repositories/k-diffusion...
Cloning CodeFormer into /home/liam/stable-diffusion-webui/repositories/CodeFormer...
Cloning BLIP into /home/liam/stable-diffusion-webui/repositories/BLIP...
Installing requirements for CodeFormer
Installing requirements for Web UI
Launching Web UI with arguments: --autolaunch --opt-sdp-attention
No module 'xformers'. Proceeding without it.
/home/liam/stable-diffusion-webui/venv/lib/python3.10/site-packages/torchvision/transforms/functional_tensor.py:5: UserWarning: The torchvision.transforms.functional_tensor module is deprecated in 0.15 and will be **removed in 0.17**. Please don't rely on it. You probably just need to use APIs in torchvision.transforms.functional or in torchvision.transforms.v2.functional.
  warnings.warn(
Calculating sha256 for /home/liam/stable-diffusion-webui/models/Stable-diffusion/v1-5-pruned-emaonly.safetensors: 6ce0161689b3853acaa03779ec93eafe75a02f4ced659bee03f50797806fa2fa
Loading weights [6ce0161689] from /home/liam/stable-diffusion-webui/models/Stable-diffusion/v1-5-pruned-emaonly.safetensors
Creating model from config: /home/liam/stable-diffusion-webui/configs/v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Applying scaled dot product cross attention optimization.
Textual inversion embeddings loaded(0): 
Model loaded in 14.6s (calculate hash: 9.9s, load weights from disk: 0.2s, create model: 0.5s, apply weights to model: 0.5s, apply half(): 0.4s, load VAE: 2.8s, move model to device: 0.3s).
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 19.6s (import torch: 0.8s, import gradio: 0.7s, import ldm: 1.0s, other imports: 0.7s, load scripts: 0.4s, load SD checkpoint: 14.8s, create ui: 0.3s, gradio launch: 0.7s).
  0%|                                                    | 0/20 [00:00<?, ?it/s]MIOpen(HIP): Error [Compile] 'hiprtcCompileProgram(prog.get(), c_options.size(), c_options.data())' convolution_forward_implicit_gemm_v6r1_dlops_nchw_kcyx_nkhw.cpp: HIPRTC_ERROR_COMPILATION (6)
MIOpen(HIP): Error [BuildHip] HIPRTC status = HIPRTC_ERROR_COMPILATION (6), source file: convolution_forward_implicit_gemm_v6r1_dlops_nchw_kcyx_nkhw.cpp
MIOpen(HIP): Warning [BuildHip] In file included from /tmp/comgr-39ed63/input/convolution_forward_implicit_gemm_v6r1_dlops_nchw_kcyx_nkhw.cpp:1:
In file included from /tmp/comgr-39ed63/include/common_header.hpp:10:
/tmp/comgr-39ed63/include/data_type.hpp:14:10: fatal error: 'limits' file not found
#include <limits> // std::numeric_limits
         ^~~~~~~~
1 error generated when compiling for gfx1030.
terminate called after throwing an instance of 'miopen::Exception'
  what():  /long_pathname_so_that_rpms_can_package_the_debug_info/data/driver/MLOpen/src/hipoc/hipoc_program.cpp:304: Code object build failed. Source: convolution_forward_implicit_gemm_v6r1_dlops_nchw_kcyx_nkhw.cpp
Aborted (core dumped)

Additional information

No response

Chilluminati91 avatar Apr 22 '23 09:04 Chilluminati91

Can be solved by following this post.
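
For anyone who can't follow the link: judging by the later comments in this thread, the fix amounts to installing the C++ standard library headers that MIOpen's HIPRTC compile is missing. On Debian/Ubuntu that would be something like:

sudo apt install libstdc++-12-dev    # provides the <limits> header HIPRTC fails to find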

Chilluminati91 avatar Apr 22 '23 13:04 Chilluminati91

> Can be solved by following this post.

This did not (fully) solve it for me (I'm on NixOS... it's rough sometimes).

The issue also appears to be related to these ROCm/MIOpen issues, which seem to have been recently patched on MIOpen's dev branch (no idea when that will land in a release, though):

https://github.com/ROCmSoftwarePlatform/MIOpen/issues/2096#issuecomment-1511519936
https://github.com/ROCmSoftwarePlatform/MIOpen/pull/2050

https://github.com/ROCmSoftwarePlatform/MIOpen/issues/1921

wolfsprite avatar May 19 '23 11:05 wolfsprite

@wolfsprite I'm also running NixOS, and I've spent nearly a week trying to fix this. I didn't find any information on getting libstdc++-12-dev installed on NixOS, but I found a different way that works: libcxxStdenv provides whatever is needed, and I'm able to generate images.

What I'm currently doing is using a nix-shell to create the environment needed to run AI things:

{ pkgs ? import <nixpkgs> {} }:
with pkgs;
mkShell rec {
  name = "sd-env";
  buildInputs = [
        git # The program instantly crashes if git is not present, even if everything is already downloaded
        python310
        python310Packages.pip
        conda
        stdenv.cc.cc.lib
        stdenv.cc
        ncurses5
        binutils
        gitRepo
        gnupg
        autoconf
        curl
        procps
        gnumake
        util-linux
        m4
        gperf
        unzip
        libGLU
        libGL
        glib
        rocminfo
        rocm-smi
        rocm-runtime
        rocm-core
        rocm-device-libs
        rocm-cmake
        rocm-opencl-icd
        rocm-opencl-runtime
        rocblas
        libcxxStdenv ]; # provides the C++ standard headers ('limits') that the MIOpen HIPRTC compile was missing
  LD_LIBRARY_PATH = pkgs.lib.makeLibraryPath buildInputs;
}

Save it as shell.nix, put it in your home folder, and then run nix-shell. This will put you into an environment with the needed dependencies, and then you can just run the programs as usual.

There are probably a lot of packages that aren't needed, but these work, and I'm too lazy to trim them.
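
If it helps, a minimal usage sketch (assuming the webui is checked out at ~/stable-diffusion-webui, as in the logs above):

cd ~
nix-shell                      # enters the environment from ~/shell.nix
cd stable-diffusion-webui
bash webui.sh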

HiroseKoichi avatar Jul 19 '23 05:07 HiroseKoichi

> @wolfsprite I'm also running NixOS, and I've spent nearly a week trying to fix this. [...] Save it as shell.nix, put it in your home folder, and then run nix-shell.

Which channel/commit are you using where this works? I still get the aforementioned issue with this expression.

meutraa avatar Aug 12 '23 16:08 meutraa

23.05. It was working consistently until a few days after I posted about it, but now it gives the same error again. I also tried using PyTorch nightly and got a different version of the same error. I've stopped trying to get NixOS to work with this and just swapped to Docker/Podman containers instead.

HiroseKoichi avatar Aug 13 '23 10:08 HiroseKoichi

@meutraa I got it to work. This issue has already been fixed in PyTorch nightly and will be included in the next stable release. The reason it didn't work for me before is that you have to delete ~/.cache/miopen first. So for now, just use nightly until the next stable release.

So basically you just use:

pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm5.6

instead of:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.4.2

and delete ~/.cache/miopen.
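
For the webui specifically, that maps onto webui-user.sh roughly like this (a sketch reusing the TORCH_COMMAND variable from the original report):

export TORCH_COMMAND="pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm5.6"
rm -rf ~/.cache/miopen    # clear MIOpen's kernel cache so earlier failed kernel builds aren't reused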

This is the shell I'm using. Channel: 23.05

{ pkgs ? import <nixpkgs> {} }:
with pkgs;
mkShell rec {
  name = "sd-env";
  buildInputs = [
        gcc-unwrapped
        git
        curl
        gnumake
        util-linux
        binutils
        unzip
        libGLU
        libGL
        glib
        zlib
        python310
        python310Packages.pip
        virtualenv
        rocminfo
        rocm-smi
        rocm-runtime
        rocm-device-libs
        rocm-opencl-runtime ];
  LD_LIBRARY_PATH = pkgs.lib.makeLibraryPath buildInputs;
}

HiroseKoichi avatar Aug 22 '23 00:08 HiroseKoichi

Yeah, I did notice it started working around last week.

meutraa avatar Aug 22 '23 05:08 meutraa

I tried with the ROCm 5.6 pre-release and the env vars, and I get:

docker-compose up stablediff-rocm 
Starting stablediff-rocm-runner ... done
Attaching to stablediff-rocm-runner
stablediff-rocm-runner | 
stablediff-rocm-runner | 
stablediff-rocm-runner | ========================= ROCm System Management Interface =========================
stablediff-rocm-runner | =================================== Concise Info ===================================
stablediff-rocm-runner | ====================================================================================
stablediff-rocm-runner | ERROR: GPU[0]	: sclk clock is unsupported
stablediff-rocm-runner | GPU[0]		: get_power_cap, Not supported on the given system
stablediff-rocm-runner | GPU  Temp (DieEdge)  AvgPwr   SCLK  MCLK     Fan  Perf  PwrCap       VRAM%  GPU%  
stablediff-rocm-runner | 0    46.0c           10.069W  None  2600Mhz  0%   auto  Unsupported    3%   0%    
stablediff-rocm-runner | ====================================================================================
stablediff-rocm-runner | =============================== End of ROCm SMI Log ================================
stablediff-rocm-runner | launch.py --listen --no-half --skip-torch-cuda-test
stablediff-rocm-runner | Python 3.10.6 (main, May 29 2023, 11:10:38) [GCC 11.3.0]
stablediff-rocm-runner | Version: v1.5.1
stablediff-rocm-runner | Commit hash: 68f336bd994bed5442ad95bad6b6ad5564a5409a
stablediff-rocm-runner | Launching Web UI with arguments: --listen --no-half --skip-torch-cuda-test
stablediff-rocm-runner | no module 'xformers'. Processing without...
stablediff-rocm-runner | no module 'xformers'. Processing without...
stablediff-rocm-runner | No module 'xformers'. Proceeding without it.
stablediff-rocm-runner | Loading weights [6ce0161689] from /stablediff-web/models/Stable-diffusion/Stable-diffusion/v1-5-pruned-emaonly.safetensors
stablediff-rocm-runner | Creating model from config: /stablediff-web/configs/v1-inference.yaml
stablediff-rocm-runner | LatentDiffusion: Running in eps-prediction mode
stablediff-rocm-runner | Segmentation fault (core dumped)
stablediff-rocm-runner exited with code 139

PS: it worked with PyTorch 1.x

grigio avatar Aug 28 '23 10:08 grigio

Does anyone in this thread have a currently working nix-shell or flake to get started on? Many rocm packages appear to have been reorganized on NixOS unstable. Maybe @HiroseKoichi or @meutraa ?

ic4-y avatar Nov 02 '23 11:11 ic4-y

> Does anyone in this thread have a currently working nix-shell or flake to get started on? Many rocm packages appear to have been reorganized on NixOS unstable. Maybe @meutraa ?

let
  pkgs = import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/nixos-unstable.tar.gz") {};
in
  pkgs.mkShell rec {
    buildInputs = with pkgs; [
      python39
      git
      python39Packages.pip
      #      vulkan-headers
      #      vulkan-loader
      #      vulkan-tools
      #      libdrm
      #      libglvnd

      zlib
      stdenv.cc.cc.lib
      stdenv.cc
      ncurses5
      binutils
      gitRepo
      gnupg
      autoconf
      curl
      procps
      gnumake
      util-linux
      m4
      gperf
      unzip
      libGLU
      libGL
      glib
      #      rocminfo
      #      rocm-smi
      #      rocm-runtime
      #      rocm-core
      #      rocm-device-libs
      #      rocm-cmake
      #      rocm-opencl-icd
      #      rocm-opencl-runtime
      #      rocblas
      libcxxStdenv
      llvmPackages.openmp
    ];
    shellHook = ''
      # rm -rf .venv
      python3.9 -m venv .venv/
      source .venv/bin/activate
      export PYTORCH_ROCM_ARCH="gfx1100"
      export HSA_OVERRIDE_GFX_VERSION=11.0.0
      export TORCH_COMMAND='pip3 install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/rocm5.7/'
      export USE_VULKAN=1 USE_VULKAN_SHADERC_RUNTIME=1 USE_VULKAN_WRAPPER=0
      export AMD_VULKAN_ICD=RADV VK_ICD_FILENAMES=${pkgs.amdvlk}/share/vulkan/icd.d/amd_icd64.json
      bash -c "TMPDIR=/tmp $TORCH_COMMAND"
    '';
    LD_LIBRARY_PATH = pkgs.lib.makeLibraryPath buildInputs;
  }

I did play around with the rocm-* packages, but at the time they were not at 5.7 and didn't support my GPU.

meutraa avatar Nov 05 '23 11:11 meutraa

Thanks for the reply! I went full Docker in the meantime and got it up and running using AMD's official rocm/pytorch container. It would be nice, though, to have it working on NixOS too.

However, in your shell you commented out all the ROCm-related packages? How would that provide a shell to run ROCm?

Judging from the USE_VULKAN=1 env variable, did you choose the Vulkan backend for PyTorch instead?

ic4-y avatar Nov 05 '23 17:11 ic4-y

I just left in a bunch of stuff I was experimenting with over time. This shell creates a venv (which is dirty); you just run webui.sh inside the shell, and it uses ROCm, not Vulkan, at least for the gfx1100 card I use.

meutraa avatar Nov 05 '23 17:11 meutraa

> Does anyone in this thread have a currently working nix-shell or flake to get started on? Many rocm packages appear to have been reorganized on NixOS unstable. Maybe @HiroseKoichi or @meutraa ?

I completely forgot that I even posted on this issue. If you haven't already figured something out, this is the shell.nix that I use for AI stuff on NixOS 23.11:

{ pkgs ? import <nixpkgs> {} }:
with pkgs; mkShell rec {
  buildInputs = [
        # Basic Utilities
        gcc-unwrapped
        git
        ninja
        ncurses
        curl
        gnumake
        cmake
        util-linux
        binutils
        p7zip
        libGLU
        libGL
        glib
        zlib

        # Python
        python3
        python311Packages.pip
        python311Packages.wheel
        conda

        # Rocm
        rocmPackages.llvm.openmp
        rocmPackages.rocm-core
        rocmPackages.clr
        rocmPackages.rccl
        rocmPackages.miopen
        rocmPackages.miopengemm
        rocmPackages.rocrand
        rocmPackages.rocblas
        rocmPackages.rocsparse
        rocmPackages.hipsparse
        rocmPackages.rocthrust
        rocmPackages.rocprim
        rocmPackages.hipcub
        rocmPackages.roctracer
        rocmPackages.rocfft
        rocmPackages.rocsolver
        rocmPackages.hipfft
        rocmPackages.hipsolver
        rocmPackages.hipblas
        rocmPackages.rocminfo
        rocmPackages.rocm-smi
        rocmPackages.rocm-cmake
        rocmPackages.rocm-thunk
        rocmPackages.rocm-comgr
        rocmPackages.rocm-device-libs
        rocmPackages.rocm-runtime
        rocmPackages.hipify ];
  LD_LIBRARY_PATH = pkgs.lib.makeLibraryPath buildInputs;
}
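
If you're on an RDNA2 card like the RX 6650 XT from the original report, you'll probably still want the override from earlier in the thread set inside this shell before launching (a sketch, reusing the variables already shown above):

export HSA_OVERRIDE_GFX_VERSION=10.3.0   # present gfx103x cards to ROCm as gfx1030
rm -rf ~/.cache/miopen                   # drop any stale failed kernel builds first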

HiroseKoichi avatar Mar 03 '24 18:03 HiroseKoichi