
AMD GPU not supported?

Open tankhead200 opened this issue 2 years ago • 30 comments

When running SD I get runtime errors saying that no NVIDIA GPU or driver is installed on my system. Is there a workaround for AMD owners, or is AMD unsupported?

tankhead200 avatar Aug 21 '22 21:08 tankhead200

I also have this issue, if anyone knows how to get around it.

macery12 avatar Aug 21 '22 22:08 macery12

Instead of installing the way the repo says, try installing the ROCm version of PyTorch using pip. It may still not work, but there's a chance it does. https://pytorch.org/get-started/locally/

mallorbc avatar Aug 22 '22 00:08 mallorbc

r_Sh4d0w linked to this on the Discord: https://rentry.org/tqizb

It's a set of instructions for running stable diffusion with an AMD GPU.

recurrence avatar Aug 22 '22 15:08 recurrence

I'm able to run the public release just fine on a Radeon 6800 XT using an Ubuntu docker container, ROCm 5.1.1, and torch 1.12.1.

gururise avatar Aug 22 '22 21:08 gururise

Can someone share the Dockerfile? @gururise

JeremiasGiglio avatar Aug 22 '22 23:08 JeremiasGiglio

Let me know if you have further issues. This works on Linux, not Windows.

# AMD Driver installation: https://docs.amd.com/bundle/ROCm-Installation-Guide-v5.1/page/How_to_Install_ROCm.html
# command would be something like this after installing amdgpu-install
# sudo amdgpu-install --rocmrelease=5.2.3 --usecase=dkms,graphics,rocm,lrt,hip,hiplibsdk
# if it's already installed, running the rocm-smi command will show the available GPUs

cd stable-diffusion/
conda env create -f environment.yaml
conda activate ldm
conda remove cudatoolkit -y
pip3 uninstall torch torchvision -y 
# Install PyTorch ROCm
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/rocm5.1.1
pip3 install transformers==4.19.2 scann kornia==0.6.4 torchmetrics==0.6.0

# Place the model as model.ckpt in the models/ldm/stable-diffusion-v1/ folder
python scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms

harishanand95 avatar Aug 23 '22 16:08 harishanand95

Or just use this: https://huggingface.co/spaces/stabilityai/stable-diffusion

breadbrowser avatar Aug 23 '22 19:08 breadbrowser

I've created a detailed tutorial on how I got stable diffusion working on my AMD 6800XT GPU.

gururise avatar Aug 24 '22 00:08 gururise

I've created a detailed tutorial on how I got stable diffusion working on my AMD 6800XT GPU.

cool

breadbrowser avatar Aug 24 '22 00:08 breadbrowser

Thanks! @gururise

harishanand95 avatar Aug 24 '22 00:08 harishanand95

Does anyone have a step-by-step tutorial for a fresh, clean Ubuntu 20.04.4 LTS installation? I also have an RX 6800 XT but wasn't able to get it running with the tutorials provided. Is it even possible without Docker? Normally I'm a Windows user and not very experienced with Linux, sorry about that.

stull1 avatar Aug 24 '22 17:08 stull1

I've created a detailed tutorial on how I got stable diffusion working on my AMD 6800XT GPU.

I am currently trying to get it running on Windows through pytorch-directml, but am stuck. Hopefully your tutorial will point me in a direction for Windows.

jdluzen avatar Aug 24 '22 22:08 jdluzen

Is anyone else getting rick rolls most of the time? I'm running SD on an RX 6800 XT with params that don't even fill all of the available VRAM. As this happens on SFW prompts like an astronaut riding a horse, I've removed the safety checker, but then I still get pure black images.

I'm print()ing samples_ddim here. If the resulting image is black, samples_ddim is a tensor filled with NaNs.

xeaon avatar Aug 27 '22 09:08 xeaon
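
A quick way to confirm the black-image/NaN diagnosis above is to check the sample array before decoding it to an image. A minimal sketch using NumPy for illustration (the variable names and shapes here are illustrative, not from the repo's scripts):

```python
import numpy as np

def has_nans(samples: np.ndarray) -> bool:
    """Return True if any element of the sample array is NaN."""
    return bool(np.isnan(samples).any())

# A healthy latent batch contains only finite values:
healthy = np.zeros((1, 4, 64, 64), dtype=np.float32)
print(has_nans(healthy))  # False

# A soft GPU crash mid-sampling tends to poison the tensor with NaNs,
# which then decodes to a pure black image:
broken = healthy.copy()
broken[0, 0, 0, 0] = np.nan
print(has_nans(broken))  # True
```

If this reports NaNs before the decoder runs, the problem is in sampling (driver, clocks, or precision), not in the NSFW filter.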

Is anyone else getting rick rolls most of the time? I'm running SD on an RX 6800 XT with params that don't even fill all of the available VRAM. As this happens on SFW prompts like an astronaut riding a horse, I've removed the safety checker, but then I still get pure black images.

I'm print()ing samples_ddim here. If the resulting image is black, samples_ddim is a tensor filled with NaNs.

The rick-roll images are the NSFW filter kicking in. I'm not exactly sure how it detects NSFW images, but it seems to flag a lot of human skin-tone colors and then tag the image as NSFW. You can disable the NSFW filter by tweaking a few lines of the Python code; on my 6800 XT I now get no rick-rolls. The black images are a sign that you probably didn't remove the NSFW filter properly. Try using this method to remove the filter.

gururise avatar Aug 28 '22 18:08 gururise
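
For reference, the usual way people disable the filter in a diffusers pipeline is to swap the safety checker for a pass-through. A hedged sketch (the `safety_checker` attribute name comes from diffusers' `StableDiffusionPipeline`; the stub itself is illustrative):

```python
def null_safety_checker(images, **kwargs):
    """Pass-through replacement for the NSFW checker: returns the images
    untouched and flags every one of them as safe (False = not NSFW)."""
    return images, [False] * len(images)

# With a diffusers pipeline loaded as `pipe`, you would then assign:
#   pipe.safety_checker = null_safety_checker
```

Replacing the checker (rather than deleting the call site) keeps the pipeline's return signature intact, which is why black images usually mean the removal was done some other, incomplete way.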

@jdluzen https://github.com/harishanand95/diffusers/blob/dml/examples/inference/readme.md#instructions-for-onnx-dml-execution-on-windows Please try this and let me know; I got 512x512 image generation working on Windows with DirectML.

harishanand95 avatar Aug 30 '22 17:08 harishanand95

@jdluzen https://github.com/harishanand95/diffusers/tree/main/examples/inference#instructions-for-onnx-dml-execution-on-windows Please try this and let me know; I got 512x512 image generation working on Windows with DirectML.

It works for me! Thanks! Windows 10 v1904, RX 5500 XT 8 GB, 16 GB RAM.

num_inference_steps=75; guidance_scale=7.5; render time 5 min

A long text prompt throws an error, though.

example: prompt = "Anthropomorphic d 2 0 triangle head in opal muscular danny devito holding spinach, intricate, elegant, highly detailed homer popey, digital painting, artstation, concept art, sharp focus, illustration, art by artgerm, bob eggleton, michael whelan, stephen hickman, richard corben, wayne barlowe, greg rutkowski, alphonse mucha, 8 k"

The error:

\inference>python dml_onnx.py
2022-08-31 23:23:14.1598946 [E:onnxruntime:, sequential_executor.cc:369 onnxruntime::SequentialExecutor::Execute] Non-zero status code returned while running Add node. Name:'Add_221' Status Message: C:\onnx\onnxruntime\onnxruntime\core\providers\dml\DmlExecutionProvider\src\MLOperatorAuthorImpl.cpp(2049)\onnxruntime_pybind11_state.pyd!00007FFD8E887E68: (caller: 00007FFD8E85E380) Exception(2) tid(14bc) 80070057

Traceback (most recent call last):
  File "dml_onnx.py", line 213, in <module>
    image = pipe(prompt, height=512, width=512, guidance_scale=7.5, num_inference_steps=75, execution_provider="DmlExecutionProvider")["sample"][0]
  File "C:\Users\Kuler\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "dml_onnx.py", line 97, in __call__
    if onnx: text_embeddings = encoder_sess.run(None, {"text_input": text_input.input_ids.numpy()})[0]
  File "C:\Users\Kuler\AppData\Local\Programs\Python\Python38\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 200, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException

KulerRuler avatar Aug 31 '22 20:08 KulerRuler
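
The long-prompt failure above is consistent with CLIP's fixed 77-token context: the ONNX text encoder is exported with a fixed input shape, so any prompt tokenizing to more than 77 ids breaks the `Add` node's shape expectations. A hedged workaround is to truncate the token ids before running the encoder (the 77-token limit is CLIP's; the helper itself is illustrative):

```python
MAX_CLIP_TOKENS = 77  # CLIP text encoder context length, incl. BOS/EOS tokens

def truncate_token_ids(token_ids, limit=MAX_CLIP_TOKENS):
    """Clip a token-id sequence to the encoder's fixed context length."""
    return token_ids[:limit]

# e.g. a 100-token prompt gets cut down to its first 77 tokens
print(len(truncate_token_ids(list(range(100)))))  # 77
```

In practice the transformers tokenizer can do this directly via `truncation=True, max_length=77` when encoding the prompt, at the cost of silently dropping everything past the limit.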

@jdluzen https://github.com/harishanand95/diffusers/tree/main/examples/inference#instructions-for-onnx-dml-execution-on-windows Please try this and let me know; I got 512x512 image generation working on Windows with DirectML.

It works! I'm getting around 3.5 s/it on a 5700 XT. I also appreciate that the models are in ONNX, as I'm more familiar with that ecosystem.

jdluzen avatar Sep 01 '22 00:09 jdluzen

Thanks! FYI, I just updated the URL to point to the dml branch rather than main. The new link is https://github.com/harishanand95/diffusers/blob/dml/examples/inference/readme.md#instructions-for-onnx-dml-execution-on-windows. I had previously committed to the main branch :/. Anyway, I'll try to get some of these changes into the diffusers library rather than maintaining them in a fork.

harishanand95 avatar Sep 01 '22 00:09 harishanand95

@jdluzen

Getting around 3.5s/it on a 5700XT

I wonder what the corresponding inference times are for a 3060 Ti or 3070 Ti.

Also, is this leveraging the FP16/FP8 hardware on AMD?

LifeIsStrange avatar Sep 01 '22 17:09 LifeIsStrange

I've made a guide on how to run it on a Windows machine with an AMD GPU, enjoy!

Thanks for the guide! It's the only one I managed to get to work. I have a 6700 XT and am running at around 1.6 s/it.

Now it's time I learn a bit of Python. XD

sleepless-ninja avatar Sep 03 '22 03:09 sleepless-ninja

How would one do this, but using the Stable Diffusion web UI instead of the command line?

LinuxForEveryone avatar Sep 03 '22 14:09 LinuxForEveryone

I've made a guide on how to run it on a Windows machine with an AMD GPU, enjoy!

This worked for me, thanks. I optimized the script a little for multiple outputs and random seeds. I get around 1.0 s/it with a 6800 XT. Is it possible to change the model, for example to 1.5 when it's released? And what about img2img?

stull1 avatar Sep 06 '22 15:09 stull1

Is anyone else getting rick rolls most of the time? I'm running SD on an RX 6800 XT with params that don't even fill all of the available VRAM. As this happens on SFW prompts like an astronaut riding a horse, I've removed the safety checker, but then I still get pure black images. I'm print()ing samples_ddim here. If the resulting image is black, samples_ddim is a tensor filled with NaNs.

The rick-roll images are the NSFW filter kicking in. I'm not exactly sure how it detects NSFW images, but it seems to flag a lot of human skin-tone colors and then tag the image as NSFW. You can disable the NSFW filter by tweaking a few lines of the Python code; on my 6800 XT I now get no rick-rolls. The black images are a sign that you probably didn't remove the NSFW filter properly. Try using this method to remove the filter.

This wasn't the issue. As I stated in my comment, the safety checker was disabled. My card was slightly undervolted/overclocked and, as it turns out, it crashed softly during image creation. Turning the settings back to normal eliminated the black images.

xeaon avatar Sep 08 '22 13:09 xeaon

I've made a guide on how to run it on a Windows machine with an AMD GPU, enjoy!

Works for me as well, on a 2018 MacBook Pro running Boot Camp with a Radeon VII in a Razer Core X. I'm getting about 2 s per iteration. I ran into a memory error when trying to run save_onnx.py at 1024x1024. I have 32 GB RAM, so I can only generate at lower resolutions.

RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:81] data. DefaultCPUAllocator: not enough memory: you tried to allocate 17179869184 bytes.

The highest so far is 768x768, but my s/it increases to 45. That's a massive slowdown for a small bump in resolution; it feels like something is wrong.

Also, if you get an error like this when running save_onnx.py:

OSError: It looks like the config file at 'C:\Users\rapha/.cache\huggingface\transformers\9c24e6cd9f499d02c4f21a033736dabd365962dc80fe3aeb57a8f85ea45a20a3.26fead7ea4f0f843f6eb4055dfd25693f1a71f3c6871b184042d4b126244e142' is not a valid JSON file.

... just open that file, run it through a JSON validator, and try again. Looks like someone doesn't know how to dump JSON properly :P

raphaelmatto avatar Sep 09 '22 16:09 raphaelmatto
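
The jump from ~2 s/it at 512x512 to ~45 s/it at 768x768 is indeed worse than compute scaling alone predicts, which supports the something-is-wrong feeling (most likely memory pressure rather than raw compute). A rough back-of-the-envelope sketch, assuming SD's 8x latent downsampling and quadratic self-attention cost over latent tokens (a simplification; the UNet attends at several internal resolutions):

```python
def latent_tokens(h, w, downsample=8):
    """Number of latent positions the UNet's self-attention operates over,
    assuming Stable Diffusion's 8x VAE downsampling."""
    return (h // downsample) * (w // downsample)

t512 = latent_tokens(512, 512)  # 64 * 64 = 4096 tokens
t768 = latent_tokens(768, 768)  # 96 * 96 = 9216 tokens

# Self-attention cost grows roughly with tokens**2:
ratio = (t768 / t512) ** 2
print(t512, t768, round(ratio, 2))  # 4096 9216 5.06
```

So roughly a 5x slowdown would be expected from attention scaling alone; an observed 22x suggests the run is spilling out of available memory and thrashing.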

Guys, I'm getting a "post_quant_conv.onnx does not exist" error following this tutorial: https://rentry.co/ayymd-stable-diffustion-v1_4-guide

Can anyone help me get it running on a Radeon RX 6500 XT?

Thanks.

chaos4455 avatar Sep 09 '22 21:09 chaos4455

Hi @chaos4455, @raphaelmatto, and others: the diffusers folks have updated their code to support ONNX, so you no longer need my branch/repo. Here are the steps to install it on Windows: https://gist.github.com/harishanand95/75f4515e6187a6aa3261af6ac6f61269 and on Linux: https://huggingface.co/CompVis/stable-diffusion-v1-4/discussions/29#630e49a583f64e3516785431

harishanand95 avatar Sep 13 '22 16:09 harishanand95

First off, thanks to everyone on this thread for getting Stable Diffusion working on AMD. I took the work of harishanand95, a blog post at https://www.travelneil.com/stable-diffusion-windows-amd.html, and my own ideas to create a script that replicates some of the argument options found in this repo's txt2img.py script. Not all the options are available, but the basics work: image size, number of samples, number of steps, seed, and guidance scale. I also added options to render on the CPU and to request a random seed. Check it out at the link below, and follow harishanand95's instructions above for getting everything set up.

https://github.com/agizmo/ONNX-txt2img

agizmo avatar Sep 18 '22 01:09 agizmo
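
For a sense of what replicating those options looks like, here is a minimal sketch of such a CLI surface. The flag names (`--prompt`, `--H`, `--W`, `--n_samples`, `--ddim_steps`, `--seed`, `--scale`) follow this repo's txt2img.py; the script body is illustrative, not agizmo's actual code:

```python
import argparse

def build_parser():
    """Argument parser mirroring a subset of txt2img.py's options."""
    p = argparse.ArgumentParser(description="txt2img-style options for an ONNX pipeline")
    p.add_argument("--prompt", type=str, required=True, help="text prompt to render")
    p.add_argument("--H", type=int, default=512, help="image height in pixels")
    p.add_argument("--W", type=int, default=512, help="image width in pixels")
    p.add_argument("--n_samples", type=int, default=1, help="images per prompt")
    p.add_argument("--ddim_steps", type=int, default=50, help="number of sampling steps")
    p.add_argument("--seed", type=int, default=None, help="RNG seed; omit for a random one")
    p.add_argument("--scale", type=float, default=7.5, help="classifier-free guidance scale")
    p.add_argument("--cpu", action="store_true", help="render on the CPU provider instead")
    return p

args = build_parser().parse_args(["--prompt", "an astronaut riding a horse", "--seed", "42"])
print(args.prompt, args.seed, args.scale)  # an astronaut riding a horse 42 7.5
```

Keeping the flag names identical to txt2img.py means existing command lines can be pointed at the ONNX script with minimal changes.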

Does anybody know how to run save_onnx.py with an already-downloaded model checkpoint?

maikelsz avatar Oct 03 '22 15:10 maikelsz

harishanand95's instructions didn't work for me:

(ldm) bla@blaLT:~/projects/stable-diffusion$ python3 scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms

/home/bla/miniconda3/envs/ldm/lib/python3.8/site-packages/torch/cuda/__init__.py:83: UserWarning: HIP initialization: Unexpected error from hipGetDeviceCount(). Did you run some cuda functions before calling NumHipDevices() that might have already set an error? Error 101: hipErrorInvalidDevice (Triggered internally at ../c10/hip/HIPFunctions.cpp:110.)
  return torch._C._cuda_getDeviceCount() > 0
Some weights of the model checkpoint at openai/clip-vit-large-patch14 were not used when initializing CLIPTextModel: (...)

RuntimeError: Unexpected error from hipGetDeviceCount(). Did you run some cuda functions before calling NumHipDevices() that might have already set an error? Error 101: hipErrorInvalidDevice

maniac-0s avatar Oct 04 '22 14:10 maniac-0s