
[Bug]: GPU 3060 not used, error at startup

Open · elsucht opened this issue 1 year ago • 14 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues and checked the recent builds/commits

What happened?

When generating an image, the CPU is used even though an NVIDIA 3060 12GB is present. A different Python project (based on fastai and torch) can use the GPU.

Steps to reproduce the problem

bash <(wget -qO- https://raw.githubusercontent.com/AUTOMATIC1111/stable-diffusion-webui/master/webui.sh)

What should have happened?

The image should be generated using the GPU.

Commit where the problem happens

ea9bd9f

What platforms do you use to access the UI?

Linux

What browsers do you use to access the UI?

Google Chrome

Command Line Arguments

No

List of extensions

No

Console logs

Traceback (most recent call last):
  File "/home/christian/stable-diffusion-webui/launch.py", line 360, in <module>
    prepare_environment()
  File "/home/christian/stable-diffusion-webui/launch.py", line 272, in prepare_environment
    run_python("import torch; assert torch.cuda.is_available(), 'Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check'")
  File "/home/christian/stable-diffusion-webui/launch.py", line 129, in run_python
    return run(f'"{python}" -c "{code}"', desc, errdesc)
  File "/home/christian/stable-diffusion-webui/launch.py", line 105, in run
    raise RuntimeError(message)
RuntimeError: Error running command.
Command: "/home/christian/stable-diffusion-webui/venv/bin/python3" -c "import torch; assert torch.cuda.is_available(), 'Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check'"
Error code: 1
stdout: <empty>
stderr: /home/christian/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/cuda/__init__.py:88: UserWarning: HIP initialization: Unexpected error from hipGetDeviceCount(). Did you run some cuda functions before calling NumHipDevices() that might have already set an error? Error 101: hipErrorInvalidDevice (Triggered internally at ../c10/hip/HIPFunctions.cpp:110.)
  return torch._C._cuda_getDeviceCount() > 0
Traceback (most recent call last):
  File "<string>", line 1, in <module>
AssertionError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check

Additional information

Tested in the Python virtual environment (after source ./venv/bin/activate):

python -c "import torch; print(torch.cuda.is_available())"
/home/christian/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/cuda/__init__.py:88: UserWarning: HIP initialization: Unexpected error from hipGetDeviceCount(). Did you run some cuda functions before calling NumHipDevices() that might have already set an error? Error 101: hipErrorInvalidDevice (Triggered internally at ../c10/hip/HIPFunctions.cpp:110.)
  return torch._C._cuda_getDeviceCount() > 0
False

but

python -c "import torch; print(torch.cuda.device_count())"
1

Another project using torch (via fastai) can use the CUDA GPU successfully.
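
Given that device_count() reports 1 while is_available() returns False and the warning mentions HIP, it looks like the venv ended up with the ROCm (AMD) build of torch instead of the CUDA build. A quick way to confirm which build is installed, as a sketch run inside the venv (torch.version.hip should be None on CUDA builds):

source ./venv/bin/activate
python -c "import torch; print(torch.__version__)"
python -c "import torch; print('cuda:', torch.version.cuda, 'hip:', torch.version.hip)"

A ROCm build typically shows a version suffix like +rocm5.2 and a non-empty hip value.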

System info:

  • Ubuntu 22.04, Ryzen 7700X, 64GB RAM
  • RTX 3060 12GB
  • nvidia-smi info: Driver Version: 510.108.03, CUDA Version: 11.6

elsucht avatar Feb 06 '23 09:02 elsucht

Could it be related to the Ryzen built-in GPU? Technically I have two GPUs in my system, one AMD and one NVIDIA. Pretty sure that lines 107ff in webui.sh do not cover the case of mixed-GPU systems, do they? https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob/master/webui.sh#L107-L122
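
For context, the detection in that range is roughly of the following shape (a sketch of the pattern, not the verbatim webui.sh code). Because lspci lists both the Ryzen iGPU and the discrete card, a plain "AMD" match can win even though an NVIDIA GPU is present:

# sketch only: the exact grep and line numbers in webui.sh may differ
gpu_info=$(lspci 2>/dev/null | grep -E "VGA|3D")
if echo "$gpu_info" | grep -q "AMD"
then
    # the ROCm wheel is selected even though an NVIDIA card is also installed
    export TORCH_COMMAND="pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/rocm5.2"
fi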

elsucht avatar Feb 06 '23 10:02 elsucht

I'm also getting this error. Everything was fine last week. Now I can't start the project without the --skip-torch-cuda-test option, which seems to disable the GPU and makes everything ultra slow.

Please locate the recent change that broke the tool and revert it to the previous implementation.

System info: Windows 10, Ryzen 9 3900X, 32GB RAM, RTX 3080 10GB, NVIDIA-SMI 528.24, Driver Version: 528.24, CUDA Version: 12.0

ahernandezmiro avatar Feb 06 '23 16:02 ahernandezmiro

@ahernandezmiro Please open the webui.sh file and comment out line 121 so that it reads:

...
then
    #export TORCH_COMMAND="pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/rocm5.2"
fi 
...

That fixed it for me. After that, the correct PyTorch version (with NVIDIA CUDA support) was installed and the GPU was working for Stable Diffusion.
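
If the ROCm wheel already landed in the venv during an earlier failed launch, editing webui.sh alone may not be enough; reinstalling torch inside the venv (or deleting the venv, as noted further down) is the follow-up. A sketch, assuming the default venv location inside the repository:

source ./venv/bin/activate
# remove the ROCm build and pull the default (CUDA) build instead
pip uninstall -y torch torchvision
pip install torch torchvision --no-cache-dir
deactivate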

elsucht avatar Feb 06 '23 16:02 elsucht

@ahernandezmiro Please open the webui.sh file and comment out line 121 so that it reads:

...
then
    #export TORCH_COMMAND="pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/rocm5.2"
fi 
...

That fixed it for me. After that, the correct PyTorch version (with NVIDIA CUDA support) was installed and the GPU was working for Stable Diffusion.

This might fix the issue on Linux but I'm using SD on Windows, and since the launch script is different your fix cannot be applied to the bat script directly.

Good to know that the issue is indeed caused by the pytorch library, will try to dig deeper with that in mind.

ahernandezmiro avatar Feb 06 '23 18:02 ahernandezmiro

Same issue with a 3060 and Ryzen on Ubuntu 22.10.

Line 121 in webui.sh already contained "pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/rocm5.2".

derVedro avatar Feb 11 '23 20:02 derVedro

@derVedro

Same issue with a 3060 and Ryzen on Ubuntu 22.10.

Line 121 in webui.sh already contained "pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/rocm5.2".

you need to delete/deactivate that line. It is only relevant if you want to use an AMD GPU, because it installs AMD Radeon-specific torch packages.

elsucht avatar Feb 13 '23 08:02 elsucht

@elsucht you need to delete/deactivate that line. It is only relevant if you want to use an AMD GPU, because it installs AMD Radeon-specific torch packages.

Yes, got it so far; I just changed the value of $gpu_info by deleting the AMD line.

derVedro avatar Feb 13 '23 23:02 derVedro

An important thing to note here if you are doing a fresh install: you still need torch to be installed, and you need to be sure you don't reuse the broken cached torch from a failed install.

This fixed it for me, replacing lines 120-122 in webui.sh (fresh install, Ubuntu 22.04, GPU is a 3060):

then
    export TORCH_COMMAND="pip install torch torchvision --no-cache-dir"
fi 

--no-cache-dir prevents pip from accidentally reusing a bad/broken cached wheel, so the broken AMD (ROCm) torch build left over from a failed install doesn't get picked up again.

A simple 'if' around the GPU check could fix this; it might be worth a PR if someone wanted to do that.
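
One possible shape for that guard, as a sketch only (not a tested patch against webui.sh): only fall back to the ROCm wheel when no NVIDIA GPU is detected.

gpu_info=$(lspci 2>/dev/null | grep -E "VGA|3D")
# skip the ROCm wheel if an NVIDIA card is present alongside the AMD iGPU
if echo "$gpu_info" | grep -q "AMD" && ! echo "$gpu_info" | grep -qi "NVIDIA"
then
    export TORCH_COMMAND="pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/rocm5.2"
fi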

WyattAutomation avatar Feb 17 '23 18:02 WyattAutomation

Strangely, I'm running into this myself now. It was working fine this morning (3090 Ti), but I updated some extensions and restarted the webui, and now it fails with the same error as the OP.

jwvanderbeck avatar Mar 03 '23 02:03 jwvanderbeck

Running into the same issue:

I'm running an AMD Ryzen 79xx with a 4090 and it just doesn't start up.

I will try disabling the internal AMD GPU, as that seems easier than analysing the startup script to see when and how the GPU is selected.

None of the tips have worked yet, though.

sigi-tw avatar Mar 12 '23 11:03 sigi-tw

I disabled the internal GPU and there was no change.

sigi-tw avatar Mar 12 '23 11:03 sigi-tw

Sorry for the spam, but deleting the venv folder led to a re-download and now it works.
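
For anyone repeating this, the recovery amounts to removing the venv so the next launch reinstalls the dependencies from scratch (paths assume the default repository layout):

cd stable-diffusion-webui
rm -rf venv
./webui.sh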

sigi-tw avatar Mar 12 '23 13:03 sigi-tw

then
    export TORCH_COMMAND="pip install torch torchvision --no-cache-dir"
fi

This helped me with an AMD Ryzen 5 5600G and a GTX 1660 Super.

sakralbar avatar Mar 13 '23 09:03 sakralbar

My CPU is also AMD and the GPU is NVIDIA; forgot to note that.

WyattAutomation avatar Mar 13 '23 14:03 WyattAutomation

I got the same error and resolved it with pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118

Windows 10 / GeForce 3060 Ti
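
After reinstalling, a quick way to confirm that the CUDA build is active (a one-liner sketch, run with the same Python/venv that the webui uses) is:

python -c "import torch; print(torch.cuda.is_available(), torch.version.cuda)"

It should print True together with the CUDA version the wheel was built against.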

tasuken avatar Sep 18 '23 06:09 tasuken