stable-diffusion-webui

[Bug]: problems with SD after installing CUDA?

Open atimogus opened this issue 6 months ago • 1 comments

Checklist

  • [X] The issue exists after disabling all extensions
  • [X] The issue exists on a clean installation of webui
  • [ ] The issue is caused by an extension, but I believe it is caused by a bug in the webui
  • [ ] The issue exists in the current version of the webui
  • [X] The issue has not been reported before recently
  • [ ] The issue has been reported before but has not been fixed yet

OS

Windows 10

What version did you experience this issue on?

SD 1.5

What happened?

I had these arguments before and SD was working fine:

set COMMANDLINE_ARGS=--medvram --xformers --force-enable-xformers --always-batch-cond-uncond --opt-channelslast --no-hashing --disable-nan-check --api --xformers-flash-attention --opt-split-attention --no-half-vae
set PYTORCH_CUDA_ALLOC_CONF=garbage_collection_threshold:0.6, max_split_size_mb:32
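The two settings above are easier to inspect when broken into individual tokens. The following is a minimal sketch in Python; `parse_alloc_conf` and `duplicate_flags` are hypothetical helper names, not part of webui or PyTorch, and the parsing shown is only an illustration of the `option:value,option:value` layout, not PyTorch's own parser. Whether PyTorch tolerates the space after the comma in the value as posted is not verified here.

```python
def parse_alloc_conf(value):
    """Split an 'option:value,option:value' string into a dict of strings."""
    options = {}
    for item in value.split(","):
        item = item.strip()  # drops the space after the comma in the posted value
        if not item:
            continue
        key, _, val = item.partition(":")
        options[key.strip()] = val.strip()
    return options


def duplicate_flags(args):
    """Return flags that occur more than once in a COMMANDLINE_ARGS string."""
    seen, dupes = set(), []
    for token in args.split():
        if token.startswith("--"):
            if token in seen and token not in dupes:
                dupes.append(token)
            seen.add(token)
    return dupes


# Values copied from the post above.
ARGS = (
    "--medvram --xformers --force-enable-xformers --always-batch-cond-uncond "
    "--opt-channelslast --no-hashing --disable-nan-check --api "
    "--xformers-flash-attention --opt-split-attention --no-half-vae"
)
ALLOC_CONF = "garbage_collection_threshold:0.6, max_split_size_mb:32"

print(parse_alloc_conf(ALLOC_CONF))
print(duplicate_flags(ARGS))
```

Listing the flags one by one like this makes it easier to remove them in groups when bisecting which argument triggers an error.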

I was downloading different versions of CUDA because I needed it for a TensorFlow project (I realised that TF doesn't work on Windows anymore), so I started running TF on WSL, and now I have the newest CUDA on my PC. After 2 months of not using SD I opened it again and tried to generate a few pictures, and I was getting this error (with the args I mentioned above):

RuntimeError: mat1 and mat2 must have the same dtype

NO ARGUMENTS

So I tried removing all arguments. Then I can see in the preview that the image is generating, but I get this output:

NansException: A tensor with all NaNs was produced in Unet. This could be either because there's not enough precision to represent the picture, or because your video card does not support half type. Try setting the "Upcast cross attention layer to float32" option in Settings > Stable Diffusion or using the --no-half commandline argument to fix this. Use --disable-nan-check commandline argument to disable this check.
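For context on the "not enough precision" wording in that message, here is a small sketch of how half precision (fp16) can turn ordinary arithmetic into NaN. It uses only Python's standard `struct` module, whose `e` format is IEEE 754 half precision; `to_half` is a hypothetical helper written for this illustration and says nothing about what actually went wrong on this machine.

```python
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_half(x):
    """Round-trip x through half precision; map overflow to +/-inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses to pack values beyond the fp16 range;
        # GPU hardware would produce infinity here instead.
        return float("inf") if x > 0 else float("-inf")


big = to_half(70000.0)      # beyond the fp16 range
print(to_half(FP16_MAX))    # 65504.0
print(big)                  # inf
print(big - big)            # nan (inf minus inf is undefined)
print(to_half(0.1))         # close to 0.1 but not equal: fp16 keeps about 3 decimal digits
```

Once a single NaN appears in a tensor it propagates through every later operation, which is why the check reports a tensor made entirely of NaNs. Running in float32 (the `--no-half` argument mentioned in the message) widens the range, at the cost of more VRAM.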

ONLY --no-half IN ARGUMENTS

Then I added --no-half:

Screenshot 2023-12-17 003601

After this picture I tried other sampling methods, still the same (but with Euler sampling I get: NansException: A tensor with all NaNs was produced in Unet. Use --disable-nan-check commandline argument to disable this check.)

--no-half-vae --no-half --disable-nan-check

Now I try these arguments: set COMMANDLINE_ARGS= --no-half-vae --no-half --disable-nan-check

Screenshot 2023-12-17 001557

I tried changing models, which did not fix the problem. I enabled the "Upcast cross attention layer to float32" option in Settings and I get the same generated pictures.

GPU: RTX 3060 12GB VRAM. Every time I changed arguments I deleted venv. I also tried installing new NVIDIA drivers, but now I think --xformers doesn't work, though that is not the main problem. I also tried updating from 1.6.0 to 1.7.0, but it did not help.

What browsers do you use to access the UI ?

Brave. I have tried with multiple browsers but I get the same issue.

atimogus avatar Dec 17 '23 13:12 atimogus

I guess I solved my problem; I have generated about 60 pics with great consistency.

I went to safe mode and ran DDU. After booting back into Windows I uninstalled any NVIDIA software I had on my PC (through Control Panel), then restarted my PC and installed the newest GPU drivers. After the restart I tried Stable Diffusion and it is working, I hope.

edit1:

After I thought I had fixed the problem I downloaded TensorRT and ControlNet (while installing TensorRT I got some errors about cuDNN), so I reinstalled the NVIDIA drivers again, but still the same issue, and now I sometimes get black/gray images.

I get this error: NansException: A tensor with all NaNs was produced in Unet. Use --disable-nan-check commandline argument to disable this check.

edit2:

So my theory is that TensorRT is downloading some cursed CUDA. Now that I have downloaded CUDA, cleaned the drivers with DDU, and uninstalled all NVIDIA software through Control Panel, I am able to generate 20+ images without any fault: https://developer.nvidia.com/cuda-toolkit-archive
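One way to check a theory like this is to ask PyTorch which CUDA build it is actually using. The sketch below is an assumption-laden diagnostic, not something from the webui itself: `cuda_report` is a hypothetical helper, and it has to be run with the Python interpreter inside webui's venv to describe the install that webui uses. Note that `torch.version.cuda` is the CUDA version the PyTorch build was compiled against, which can differ from the system-wide toolkit.

```python
import shutil


def cuda_report():
    """Collect basic facts about the PyTorch/CUDA setup seen by this interpreter."""
    report = {
        "torch": None,             # PyTorch version string, if importable
        "torch_cuda_build": None,  # CUDA version PyTorch was built against
        "cuda_available": None,    # whether PyTorch can reach a GPU
        "device": None,            # name of GPU 0, if any
        "nvidia_smi": shutil.which("nvidia-smi"),  # path to the driver tool, if on PATH
    }
    try:
        import torch
    except (ImportError, OSError):
        # torch is missing, or its native libraries failed to load
        return report
    report["torch"] = torch.__version__
    report["torch_cuda_build"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["device"] = torch.cuda.get_device_name(0)
    return report


for key, value in cuda_report().items():
    print(f"{key}: {value}")
```

Comparing this output before and after installing something like TensorRT would show whether the PyTorch build or its view of the GPU changed.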

Args I used when writing this comment: set COMMANDLINE_ARGS= --no-half

Screenshot 2023-12-18 023923

edit3:

I put back my old args:

set COMMANDLINE_ARGS=--medvram --xformers --force-enable-xformers --always-batch-cond-uncond --opt-channelslast --no-hashing --disable-nan-check --api --xformers-flash-attention --opt-split-attention --no-half-vae
set PYTORCH_CUDA_ALLOC_CONF=garbage_collection_threshold:0.6, max_split_size_mb:32

And everything is working fine. I just realised that xformers doesn't give any more speed while generating images. I guess the new xformers needs a newer version of PyTorch; I will try to downgrade.

atimogus avatar Dec 17 '23 20:12 atimogus

Does restarting solve your issue? Because it generally does for me. I personally suspect it's some sort of weird memory glitch. Sometimes models will work fine, then I switch from model to model, go back to the original one, and get a NansException. It's inconsistent, and only appears to go away after a long time or after restarting the system.

agentx3 avatar Jan 01 '24 05:01 agentx3

Yes, sometimes restarting did resolve the issue, but not for long (after maybe 50 generated images it appears again). But after I installed the right CUDA and everything, it is working perfectly fine. I generated about 1000 images and I am not getting any errors.

atimogus avatar Jan 01 '24 19:01 atimogus

Installing new CUDA didn't fix it for me sadly.

Piprian avatar Jan 06 '24 15:01 Piprian