stable-diffusion-webui

Possible Nvidia driver issues

Open w-e-w opened this issue 1 year ago • 9 comments

Discussed in https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/11062

Originally posted by w-e-w June 7, 2023

Some users have reported issues related to the latest Nvidia drivers:

nVidia drivers change in memory management vladmandic#1285
https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/11050#issuecomment-1578731478

If you have been experiencing generation slowdowns or getting stuck, consider downgrading to driver version 531 or below (NVIDIA Driver Downloads).
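To see which side of the suggested 531 cutoff a machine is on, the installed driver version can be read with nvidia-smi. This is a minimal sketch, not part of the original post: the helper names and the `LAST_GOOD_MAJOR` constant are made up for illustration, and it assumes `nvidia-smi` is on the PATH.

```python
import subprocess

# Assumption: 531 is the newest "known good" major version, per the post above.
LAST_GOOD_MAJOR = 531


def parse_driver_version(text):
    """Turn a string such as '536.23' into a tuple of ints, e.g. (536, 23)."""
    first_line = text.strip().splitlines()[0]
    return tuple(int(part) for part in first_line.strip().split("."))


def is_newer_than_last_good(version, last_good_major=LAST_GOOD_MAJOR):
    """True when the driver's major version is above the suggested ceiling."""
    return version[0] > last_good_major


def query_driver_version():
    """Ask nvidia-smi for the installed driver version (first GPU only)."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    return parse_driver_version(result.stdout)


if __name__ == "__main__":
    try:
        version = query_driver_version()
    except (FileNotFoundError, subprocess.CalledProcessError, IndexError, ValueError):
        print("nvidia-smi not available; cannot read the driver version")
    else:
        label = "newer than" if is_newer_than_last_good(version) else "at or below"
        print(f"Driver {'.'.join(map(str, version))} is {label} {LAST_GOOD_MAJOR}.x")
```

The check only compares the major version, since the advice in the post is given at that granularity.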

w-e-w avatar Jun 06 '23 16:06 w-e-w

Funny, I am randomly getting the issue where an output is stuck at 50% for an hour, and I am on 531.41 with an NVIDIA 3060 12GB model

tusharbhutt avatar Jun 07 '23 14:06 tusharbhutt

Strangely mine seems to go at normal speed for the first gen on a checkpoint, or if I change the clip on a checkpoint, but subsequent gens go muuuch slower. Annoyingly Diablo won't run on 531.

chaewai avatar Jun 07 '23 19:06 chaewai

I can confirm this bug. I was getting results (as expected) before I installed the latest Titan RTX drivers. I will try installing a previous build.

designborg avatar Jun 11 '23 15:06 designborg

Strangely mine seems to go at normal speed for the first gen on a checkpoint, or if I change the clip on a checkpoint, but subsequent gens go muuuch slower. Annoyingly Diablo won't run on 531.

Yeah, that's exactly how it is for me. When I tried inpainting, the first gen runs through just fine, but any subsequent ones have massive hang-ups, necessitating a restart of the commandline window and rerunning webui-user.bat.

AIDungeonTester2 avatar Jun 17 '23 00:06 AIDungeonTester2

I wasn't sure if there was a problem with the drivers so I reinstalled WebUI, but the problem didn't go away. Everything generates fine like before, but once the High Res Fix starts and finishes, there is what looks like a minute-long pause. Edit: confirming. Downgraded to 531.68. Now everything works as it did before.

younyokel avatar Jun 19 '23 14:06 younyokel

If you are stuck with a newer Nvidia driver version, downgrading to Torch 1.13.1 seems to work too.

  1. Add the following to webui-user.bat: set TORCH_COMMAND=pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117
  2. Remove <webui-root>/venv directory
  3. (Re)start WebUI
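After the restart, it may be worth confirming that the rebuilt venv really picked up the pinned build. The sketch below is only an illustration: the helper names are hypothetical, and the expected version string is copied from the `TORCH_COMMAND` in step 1. Run it with the Python interpreter inside the WebUI venv.

```python
import importlib.util

EXPECTED_TORCH = "1.13.1+cu117"  # taken from the TORCH_COMMAND in step 1


def split_torch_version(version):
    """Split '1.13.1+cu117' into ('1.13.1', 'cu117'); the tag is '' if absent."""
    release, _, local_tag = version.partition("+")
    return release, local_tag


def matches_expected(installed, expected=EXPECTED_TORCH):
    """Compare release and CUDA tag of the installed build with the expected one."""
    return split_torch_version(installed) == split_torch_version(expected)


if __name__ == "__main__":
    if importlib.util.find_spec("torch") is None:
        print("torch is not installed in this environment")
    else:
        import torch
        print("installed:", torch.__version__)
        print("matches expected:", matches_expected(torch.__version__))
```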

hearmeneigh avatar Jun 25 '23 20:06 hearmeneigh

I am having the opposite issue where on the newer drivers my first image generation is slow because of some clogged memory on my GPU which frees itself as soon as it gets to the second one.

Downgrading Torch didn't seem to help at all. Downgrading from 536.23 to 531.79 fixes the problem instantly.

Shawdooow avatar Jun 25 '23 22:06 Shawdooow

Anyone, is this problem still relevant?

younyokel avatar Jul 03 '23 12:07 younyokel

I haven't tried with the latest drivers, so I don't know if this issue is still ongoing.

designborg avatar Jul 03 '23 15:07 designborg

Extremely slow for me. Downgraded PyTorch and got a whole lot of new problems. What usually took 4h is taking 10+

PsychoGarlic avatar Jul 03 '23 23:07 PsychoGarlic

Please tell me there is a fix in the pipeline?

invaderxan1 avatar Jul 05 '23 07:07 invaderxan1

For pro graphics cards (at least for my A4000), 531 is not going to eliminate the issue. You need to downgrade to at least 529 to get rid of the shared memory usage. And 529 / 531 / 535 / 536 in the production branch all work way worse than 531 in the new feature branch (which uses shared VRAM, but with a much smaller footprint for some reason)

LabunskyA avatar Jul 05 '23 22:07 LabunskyA

Can confirm this is still an issue. I have an RTX 3080 Ti and downgrading to 531.68 solved it for me.

RobotsHoldingHands avatar Jul 08 '23 08:07 RobotsHoldingHands

I'm using a 3070, torch: 2.0.1+cu118, and can confirm that this is still an issue with the 536.40 driver. Using highres.fix in particular makes everything break once you reach 98% progress on an image.

Detrian avatar Jul 09 '23 10:07 Detrian

It got a tiny bit better here. torch 1.13.1+cu117. 531.79. Cuda compilation tools release 12.0, V12.0.76

Still having issues with the duration of the generations. Usually, 200 frames took 4h, and now it is taking 10 (720x1280, 30 steps, 2~3 controlNets). Don't know how to fix it properly. Every other fix I tried severely damaged the quality of the images. I now know that I was using the 1.2.1 version of the webUI and that torch was not 2.0. I do not remember any of the other settings. Now I have everything written down somewhere hahahah

PsychoGarlic avatar Jul 09 '23 10:07 PsychoGarlic

536.67 fixed this? or not?

dajusha avatar Jul 18 '23 23:07 dajusha

I did not try it. A lot of wasted time already hahjaja

PsychoGarlic avatar Jul 19 '23 23:07 PsychoGarlic

536.67 fixed it for me.

WhiteX avatar Jul 20 '23 12:07 WhiteX

536.67 also worked for me somewhat, meaning it still seems to drop to shared memory but not as aggressively (the latest versions seem to start using shared memory at 10GB rather than fully maxing out all available 12GB, which matters).

The 536.67 driver release notes still reference shared memory, and I started getting the "hanging at 50%" bug again today after updating some plugins, which prompted me to dig a bit deeper for some solutions.

I often use 2 or 3 ControlNet 1.1 models + Hi-res Fix upscaling on a 12GB card, which is what triggers it: when I watch my Performance tab, I can see the GPU begin to use shared CPU memory.
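As a scripted alternative to watching the Performance tab, dedicated VRAM usage can be polled with nvidia-smi. This is a rough sketch under stated assumptions: the helper names and the 0.8 warning fraction are invented for illustration (loosely based on the roughly 10GB of 12GB reported above), and nvidia-smi reports dedicated memory only, so this flags that spill-over is getting close rather than detecting shared memory use itself.

```python
import subprocess

# Hypothetical warning level: the comment above reports spill-over starting
# at roughly 10GB on a 12GB card, i.e. about 0.83 of dedicated VRAM.
WARN_FRACTION = 0.8


def parse_memory_line(line):
    """Parse 'used, total' in MiB, e.g. '10240, 12288' -> (10240, 12288)."""
    used, total = (int(field.strip()) for field in line.split(","))
    return used, total


def is_near_limit(used_mib, total_mib, warn_fraction=WARN_FRACTION):
    """True when dedicated VRAM use is at or above the warning fraction."""
    return used_mib / total_mib >= warn_fraction


def query_memory():
    """Read used/total VRAM of the first GPU via nvidia-smi."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_memory_line(result.stdout.strip().splitlines()[0])


if __name__ == "__main__":
    try:
        used, total = query_memory()
    except (FileNotFoundError, subprocess.CalledProcessError, IndexError, ValueError):
        print("nvidia-smi not available; cannot read VRAM usage")
    else:
        state = "near the limit" if is_near_limit(used, total) else "ok"
        print(f"{used} MiB of {total} MiB in use: {state}")
```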

The ideal fix would be finding some way to create a --never-use-migraine-inducing-shared-memory flag, but after some light research I assume this would rely on a driver or operating system API becoming available, as there doesn't seem to be a way to "block" a specific process from using shared memory.

However, for the good news - I was able to massively reduce this >12GB memory usage without resorting to --medvram with the following steps:

Initial environment baseline

  1. Check your CLI to make sure you don't have any "using old xformers" WARN message (not sure if this is actually related but it was part of the process, so makes sense to include it)
  2. Add set PYTORCH_CUDA_ALLOC_CONF=garbage_collection_threshold:0.9,max_split_size_mb:512 to webui-user.bat
  3. I assume here, 12GB users are already running the flags --xformers and --opt-split-attention.
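For anyone launching from a Python script rather than webui-user.bat, the same allocator setting from step 2 can be applied through the environment. This is a minimal sketch: `build_alloc_conf` and `ALLOC_OPTIONS` are illustrative names, the values are simply the ones quoted above, and it assumes the variable is set before torch is imported, which is the safe ordering.

```python
import os


def build_alloc_conf(options):
    """Join allocator options into the 'key:value,key:value' form PyTorch reads."""
    return ",".join(f"{key}:{value}" for key, value in options.items())


# Values copied from step 2 above; they are that comment's choice, not tuned defaults.
ALLOC_OPTIONS = {
    "garbage_collection_threshold": 0.9,
    "max_split_size_mb": 512,
}

# Set this before importing torch so the CUDA caching allocator sees it.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = build_alloc_conf(ALLOC_OPTIONS)
print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

Keeping the options in a dict makes it easier to try other values one at a time than editing the combined string by hand.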

Biggest improvement

Assuming your environment already looks similar to the above, by far the biggest VRAM drop I found was switching from the 1.4GB unpruned .pth ControlNet 1.1 models to these 750MB pruned .safetensors versions https://civitai.com/models/38784

Hope this helps anyone in a similar frustrating position 😁

prescience-data avatar Jul 24 '23 01:07 prescience-data

From my understanding ComfyUI might've done something with CUDA's malloc to fix this. https://github.com/comfyanonymous/ComfyUI/commit/1679abd86d944521cad8a94a09d30fd5e238ae22

Looks like a lot of cards also don't support this though: https://github.com/search?q=repo%3Acomfyanonymous%2FComfyUI+malloc&type=commits&s=author-date&o=desc

catboxanon avatar Aug 03 '23 01:08 catboxanon

536.67 also did not fix this, according to the release notes.

https://us.download.nvidia.com/Windows/536.67/536.67-win11-win10-release-notes.pdf

3.2 Open Issues in Version 536.67 WHQL

This driver implements a fix for creative application stability issues seen during heavy memory usage. We’ve observed some situations where this fix has resulted in performance degradation when running Stable Diffusion and DaVinci Resolve. This will be addressed in an upcoming driver release. [4172676]

catboxanon avatar Aug 06 '23 19:08 catboxanon

I updated the drivers without thinking this might happen and now I can't go back. I have tried removing the drivers with "Display Driver Uninstaller" and then installing v531.68 and v528.49, but it still doesn't go as fast as before. RTX 4080 (Laptop) 12GB. I seem to be missing something.

Edit: in the end, my problem seems to be with the laptop itself. Yesterday I was testing 536.67 and 536.99 on my desktop using an RTX 3080 with no problems.

david-trigo avatar Aug 07 '23 17:08 david-trigo

After downgrading to 531.79 I noticed that it was using very slightly less RAM, but was slower. So I downgraded to 531.18 but can't see any difference from 536.67 other than the aforementioned lower RAM usage.

TeKett avatar Aug 08 '23 18:08 TeKett

win10 latest, sd-webui 1.5.1, model: sdxl 1.0, image size 1024x1024

my experience yesterday with nvidia 531.79: gen 4 images in under a minute on a 3090

my experience today with nvidia 536.57:
1 image: 23 sec
2 images: 8 minutes
3 images: 8 minutes
4 images: 8 minutes

going to uninstall 536.57 and install 531.79

FilipeF12 avatar Aug 08 '23 21:08 FilipeF12

536.99 just released, with the open issue mentioned prior still there, but the mention of Stable Diffusion seemingly vanished. (It was given a reference number of 4172676 as mentioned here)

https://us.download.nvidia.com/Windows/536.99/536.99-win11-win10-release-notes.pdf

3.2 Open Issues in Version 536.99 WHQL

[DaVinci Resolve] This driver implements a fix for creative application stability issues seen during heavy memory usage. We’ve observed some situations where this fix has resulted in performance degradation when running DaVinci Resolve. This will be addressed in an upcoming driver release. [4172676]

catboxanon avatar Aug 08 '23 22:08 catboxanon

Has anyone tried 536.99?

jieran233 avatar Aug 09 '23 04:08 jieran233

I have tried 536.99. It has the same higher VRAM usage. That's all I can say, since I don't share you people's issues with newer versions. After rolling back to 531 I am getting the freezing, but it's pretty sporadic. 536 works just fine; I do XYZ plots of several hundred generations. I have an RTX 4070 Ti.

TeKett avatar Aug 09 '23 07:08 TeKett

Has anyone tried 536.99?

I just installed 536.99 using RTX 3080 and so far it's working fine.

david-trigo avatar Aug 09 '23 08:08 david-trigo

I have tried 536.99. It has the same higher VRAM usage. That's all I can say, since I don't share you people's issues with newer versions. After rolling back to 531 I am getting the freezing, but it's pretty sporadic. 536 works just fine; I do XYZ plots of several hundred generations. I have an RTX 4070 Ti.

you mention 'rolling back to 531'. nvidia advises against using roll back. they say to uninstall the updated driver, and then download and install the old driver.

FilipeF12 avatar Aug 09 '23 14:08 FilipeF12

win10 latest, sd-webui 1.5.1, model: sdxl 1.0, image size 1024x1024

my experience yesterday with nvidia 531.79: gen 4 images in under a minute on a 3090

my experience today with nvidia 536.57:
1 image: 23 sec
2 images: 8 minutes
3 images: 8 minutes
4 images: 8 minutes

going to uninstall 536.57 and install 531.79

after uninstalling 536 and installing 531 I am back to the speeds I had before

FilipeF12 avatar Aug 09 '23 14:08 FilipeF12