
[Feature Request]: The greatest feature request in history.

Open aifartist opened this issue 1 year ago • 4 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues and checked the recent builds/commits

What would your feature do ?

Please, for the love of all that is holy, print the darn torch version, the cu1xx version, the cuDNN version, and perhaps the xformers version if it is used.

A significant amount of effort has been expended day after day, particularly on Windows, with people not getting the cuDNN fix quite right and not seeing any perf improvement. I started a thread here about Torch 2.0 being released, which I hoped would solve that problem once and for all. I also started a thread on reddit about this, and there are so many questions regarding whether the cudnn libraries are in the right place, are being used, or got undone by the installation of xformers.

Yes, there are ways to check, but some people aren't command line savvy, and it is easy to make the change I propose below. As part of this, please move the most important line (ready to connect to http://127.0.0.1:7860) back to being the last line printed before A1111 is ready to use.

Yes, there are other problems, but the class of problems I've described comes up over and over again. Don't ask a naive user to run some command to check; do the check for them within the actual python process that is running, to guarantee the correct answer.

Proposed workflow

Instead of:

...
Textual inversion embeddings loaded(0): 
Model loaded in 2.2s (create model: 0.2s, apply weights to model: 0.9s, apply half(): 0.1s, load VAE: 0.7s, move model to device: 0.2s).
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 4.8s (import gradio: 1.1s, import ldm: 0.4s, other imports: 0.8s, load scripts: 0.1s, load SD checkpoint: 2.3s).

Do this instead:

...
Textual inversion embeddings loaded(0): 

Model loaded in 2.2s (create model: 0.2s, apply weights to model: 0.9s, apply half(): 0.1s, load VAE: 0.7s, move model to device: 0.2s).
Startup time: 4.8s (import gradio: 1.1s, import ldm: 0.4s, other imports: 0.8s, load scripts: 0.1s, load SD checkpoint: 2.3s).

Using torch=2.0.0+cu118, cuda=11.8, cudnn=8700, xformers=NA

To create a public link, set `share=True` in `launch()`.
Ready for requests at URL:  http://127.0.0.1:7860

Additional information

This data is obtainable from `torch.__version__`, `torch.version.cuda`, `torch.backends.cudnn.version()`, and xformers ???
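The proposed banner line could be assembled from those attributes in a few lines. A minimal sketch (the function name `format_runtime_banner` is hypothetical, not part of webui; the commented-out calls are the torch attributes listed above):

```python
def format_runtime_banner(torch_version, cuda_version, cudnn_version, xformers_version):
    """Build the one-line runtime summary proposed in this issue."""
    return (f"Using torch={torch_version}, cuda={cuda_version}, "
            f"cudnn={cudnn_version}, xformers={xformers_version or 'NA'}")

# In the live process these would come from torch itself, e.g.:
#   import torch
#   banner = format_runtime_banner(torch.__version__, torch.version.cuda,
#                                  torch.backends.cudnn.version(), xformers_ver)
print(format_runtime_banner("2.0.0+cu118", "11.8", 8700, None))
# → Using torch=2.0.0+cu118, cuda=11.8, cudnn=8700, xformers=NA
```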

aifartist avatar Mar 17 '23 19:03 aifartist

It might be nice to have that in the command line output indeed, but versions are already available at the bottom of the webUI page: (screenshot)

Cykyrios avatar Mar 17 '23 20:03 Cykyrios

> It might be nice to have that in the command line output indeed, but versions are already available at the bottom of the webUI page: (screenshot)

That is good too. Where is the cuDNN version, though? It seems to be the major issue, with a huge perf impact for 4090's and a significant difference for other GPUs. Just today someone else finally sorted this issue out the hard way, when he found his process was loading the libs from somewhere else; that simple piece of info solved the problem. This issue comes up over and over again.

I wish I could get not only the cuDNN version, which is easy, but also the DLL load path. Process Explorer from MS Sysinternals can easily fetch who is using what DLL and from where, but I don't know if there is an easy python wrapper for it. Just confirming whether you are or are NOT using cuDNN 8.7 is the single most important thing to check first: if you are not, then you did NOT apply the fix correctly. This issue has generated far too much noise, and I'm tired of helping Windows users.
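One way to approximate what Process Explorer shows from inside Python (a sketch, not part of webui): enumerate the process's loaded modules, e.g. with the third-party `psutil` via `psutil.Process().memory_maps()` (an assumption about the deployment, not shown here), then filter the paths for cuDNN. The helper `find_cudnn_modules` and the example paths below are hypothetical:

```python
import os

def find_cudnn_modules(loaded_paths):
    """Given an iterable of loaded-module paths, return those that look like
    cuDNN libraries, so the user can see WHERE cudnn was loaded from."""
    return [p for p in loaded_paths if "cudnn" in os.path.basename(p).lower()]

# Hypothetical module list; on Windows something like
# [m.path for m in psutil.Process().memory_maps()] could supply it.
paths = [
    r"C:\Windows\System32\kernel32.dll",
    r"C:\venv\Lib\site-packages\torch\lib\cudnn_ops_infer64_8.dll",
    r"C:\old_cuda\bin\cudnn64_8.dll",  # a stale copy shadowing the fix
]
print(find_cudnn_modules(paths))
```

If two different cudnn paths show up, the process is likely picking up a stale copy ahead of the fixed one, which is exactly the failure mode described above.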

aifartist avatar Mar 17 '23 21:03 aifartist

I'm not using CUDA: I upgraded my old GTX1060 to an AMD 7900 XT a couple of months ago, only to find out that these new cards are not yet supported by ROCm... so I currently run on CPU, which is about 3x slower than the GTX1060 was. The cuda version would appear instead of rocm5.2 in my screenshot.

Cykyrios avatar Mar 17 '23 21:03 Cykyrios

At the end of the day there is a lot of noise printed by the webui startup scripts. I'd like to reduce that AND get the versions of the primary packages printed near the end of startup where they'll be visible. They can also be displayed on the web page itself.

aifartist avatar Mar 17 '23 22:03 aifartist