stable-diffusion-webui
update torch base environment
this pr is a single-step update of the pytorch base environment:
- from torch 1.13.1 with cuda 11.7 and cudnn 8.5.0
- to torch 2.0.0 with cuda 11.8 and cudnn 8.7.0

this allows usage of sdp cross-attention optimization, better multi-gpu support with accelerate, and avoids a large number of performance issues due to broken cudnn in some environments
it updates all required packages, but avoids any prereleases:
- torchvision (plus silences a future deprecation warning)
- xformers (update follows torch)
- accelerate (required to support new torch)
- numpy (update of numpy is required by new accelerate)
note:
- since accelerate changed the format of its config file, run accelerate config once to avoid (non-critical) warnings
- colab updated to torch 2.0, so having webui still use the older torch is causing issues for users running webui in hosted environments

yes, updating torch is a major step, but it will have to be done sooner or later as there are more and more reports of issues installing the old torch version. a rough sketch of the resulting base environment is below.
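for reference, the updated base environment boils down to roughly the following (an illustrative sketch, not the exact code in launch.py; the precise pins for xformers/accelerate/numpy are set by this PR itself):

```bash
# sketch only: torch 2.0.0 + cu118 wheels from pytorch's own index
pip install torch==2.0.0 torchvision --index-url https://download.pytorch.org/whl/cu118

# companion packages updated to match the new torch (exact pins come from the PR)
pip install -U xformers accelerate numpy

# accelerate changed its config file format, so regenerate the config once
accelerate config
```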
Testing these changes out - things seem to work "out of the box", but I still get the "No module 'xformers'. Proceeding without it" message when starting up. Not sure if this can be ignored, as it's seemingly included in torch 2.
but I still get the "No module 'xformers'. Proceeding without it" message when starting up
its not "included", its just not necessary given new sdp
is available
(depending on the use-case, low-end gpus are still better with xformers
).
the remaining message comes from external repo - repositories/stable-diffusion-stability-ai
and removing warning would cause that repo to get out-of-sync. and unfortunately, its not posted with logger so it can be filtered out, but a simple print statement.
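to illustrate the two paths side by side (a minimal sketch using the existing webui command-line flags; pick one per launch):

```bash
# torch 2.0 sdp attention - no xformers wheel required
./webui.sh --opt-sdp-attention

# or keep xformers, which can still be the better choice on low-end gpus
./webui.sh --xformers
```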
have you tested these changes on unix? runpod?
runpod
linux yes. runpod no. there are thousands of gpu cloud providers, cannot test each one like that.
Yes, at some point will have to migrate to torch 2.0 since newer xformer wheels require it.
runpod
linux yes. runpod no. there are thousands of gpu cloud providers, cannot test each one like that.
ok list me 20 :)
anyway, i am just saying that covering as many widely used scenarios as possible is a good thing
Yes, at some point will have to migrate to torch 2.0 since newer xformer wheels require it.
correct, and i solved this problem by downloading and re-uploading the torch 1 wheel of version 0.0.18dev489. they are also still compiling them, thankfully. i think automatic1111 can do it the same way; the wheel and such things could be hosted on hugging face, i think. currently they have removed all 0.0.14 and 0.0.17 builds for torch 1 from pip installation.
Yes, at some point will have to migrate to torch 2.0 since newer xformer wheels require it.
This is true only for wheels posted to pypi. You can find a wide range of pre-built xformers wheel builds in their Github action artifacts, if you still need a wheel for older torch. Not as simple as keeping up to date via pypi, but useful in a pinch.
Screenshot example of available builds below:
Just keep in mind you need to be logged into Github to download artifacts.
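Once downloaded, installing such an artifact is just a local pip install (the filename below is purely a placeholder - use whichever wheel matches your python version and platform):

```bash
# placeholder filename: substitute the wheel you actually downloaded from the actions artifacts
pip install ./xformers-0.0.18.dev489-cp310-cp310-manylinux2014_x86_64.whl
```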
Yes, at some point will have to migrate to torch 2.0 since newer xformer wheels require it.
Just keep in mind you need to be logged into Github to download artifacts.
sorry, can we use discussions for this and keep pr comments as pr comments? i'd love to collect/implement anything that's required, but this is not pr-related at all.
I'm running into a really strange problem. Any advice on how I should narrow down the root cause?
Edit: Oops, forgot to say my startup arguments: --xformers --opt-channelslast --no-half-vae
I was just trying this PR out (as-is, plus exporting xformers==0.0.18 in launch.py).
Everything upgraded and ran smoothly for the most part, but when I tried to generate a larger image (e.g. 1024x1024), I realized there is a problem for me -- it seemed to hang at 100% GPU for 4 minutes! The sampler steps had completed, but the image had not been saved to file yet. Then after 4 mins, the image finally completed and saved.
After this, the problem goes away if I generate more at the same resolution (until the WebUI process is restarted).
However, if I change the image resolution to anything different, e.g. to 1024x1088, the very same delay happens, and again only for the first run at that resolution.
After investigating, I realized there were also delays for smaller images, but the delay grows exponentially as resolution scales up.
Here is a quick table showing the times I measured. Note: These measurements also reflect the typical 'warm-up' time before the steps start progressing, which was already a little annoying. After warm-up, the gen time for small images is much faster.
First Runs w/ 5 steps (Euler a):
Gen Time = time for all steps completed
| Size | Gen Time | Total Time |
|---|---|---|
| 512x512 | 0:03 | 0:04 |
| 640x640 | 0:01 | 0:06 |
| 704x704 | 0:02 | 0:08 |
| 768x768 | 0:02 | 0:09 |
| 832x832 | 0:03 | 0:11 |
| 896x832 | 0:03 | 0:12 |
| 896x896 | 0:03 | 2:23 |
| 1024x1024 | 0:04 | 4:14 |
(screenshots of the console output and version info attached)
System: Win10, RTX 2070S 8GB, Intel 3770k
@EfourC Very strange problem. Since the recent update is not stable, you can try to see if this problem is reproduced on an older version, for example git checkout a9fed7c.
With that commit, and with the current master, the problem doesn't happen -- I only see it after everything is upgraded for Torch 2.
I did some more permutations of testing, especially to see if --opt-sdp-attention instead of xformers made a difference with the Torch 2 venv.
What I found out is that the problem is actually --opt-channelslast causing the massive delay for me with Torch 2.
Both of these startup args work ok:
--xformers --no-half-vae
--opt-sdp-attention --no-half-vae
Using --opt-channelslast with either of the above creates the problem delay for me.
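To spell out the combinations (an illustrative sketch written as webui.sh launches; on Windows the same flags go into COMMANDLINE_ARGS in webui-user.bat):

```bash
# fine on the torch 2 venv
./webui.sh --xformers --no-half-vae
./webui.sh --opt-sdp-attention --no-half-vae

# adding --opt-channelslast to either triggers the multi-minute first-run delay
./webui.sh --xformers --opt-channelslast --no-half-vae
./webui.sh --opt-sdp-attention --opt-channelslast --no-half-vae
```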
I haven't looked at (or used previously) any of the other performance optimization switches, but it's probably worth people trying them out on different types of systems (since I blundered into an issue with this one). https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Command-Line-Arguments-and-Settings
Honestly if there is going to be a move to Torch 2.0.0, it should wait until after Torch 2.0.1 is released as there is currently a major bug that made it into GA that breaks compatibility with WebUI when using torch.compile. See: https://github.com/pytorch/pytorch/pull/97862 and https://github.com/pytorch/pytorch/issues/93405
I'm aware of that issue, but WebUI does not use torch.compile on its own, and anyone who is experienced enough to use it would hand-pick the torch version manually anyhow.
Torch 2.1 has no benefits for the normal WebUI user. And the existing Torch 1.13 is showing its teeth with quite a few install issues lately.
The whole point of the PR is not to enable experimental use, but to make it simpler for normal users.
fyi, i initially updated xformers to 0.0.18, but there are frequent reports of NaN values, especially during hires operations, so i've downgraded it to 0.0.17. performance-wise, i don't see any major difference, so this is not a big loss. like i said before, the goal of this PR is to get the cleanest out-of-the-box environment where the least number of users have issues, not just to go with the latest & greatest.
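for anyone pinning locally in the meantime, that amounts to simply installing the settled-on build into the webui venv (a sketch; 0.0.17 is the version this PR ends up with):

```bash
# pin the xformers build this PR settled on, built against torch 2.0
pip install xformers==0.0.17
```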
Torch 2.1 has no benefits for the normal WebUI user. And the existing Torch 1.13 is showing its teeth with quite a few install issues lately.
The whole point of the PR is not to enable experimental use, but to make it simpler for normal users.
It isn't 2.1, we aren't waiting a whole major release, Torch 2.0.1 came out of phase 0 yesterday. I still believe that Torch 2.0 should not be merged until the blocking issue upstream is resolved in the next minor update as I believe PyTorch botched the initial GA release of 2.0, and we shouldn't be running that version of Pytorch until it is more mature.
Torch 2.1 has no benefits for the normal WebUI user. And the existing Torch 1.13 is showing its teeth with quite a few install issues lately. The whole point of the PR is not to enable experimental use, but to make it simpler for normal users.
It isn't 2.1, we aren't waiting a whole major release, Torch 2.0.1 came out of phase 0 yesterday. I still believe that Torch 2.0 should not be merged until the blocking issue upstream is resolved in the next minor update as I believe PyTorch botched the initial GA release of 2.0, and we shouldn't be running that version of Pytorch until it is more mature.
and then we'd have to wait for xformers to publish new wheels, etc... again, torch.compile is not used by webui, so there is no benefit for a standard user in waiting for torch 2.0.1.
and colab upgraded to torch 2.0, and so did many other hosted environments, so right now running webui requires additional manual steps - which is far more important to resolve than "wait for the ideal version".
and then we'd have to wait for xformers to publish new wheels, etc... again, torch.compile is not used by webui, so there is no benefit for a standard user in waiting for torch 2.0.1. and colab upgraded to torch 2.0, and so did many other hosted environments, so right now running webui requires additional manual steps - which is far more important to resolve than "wait for the ideal version".
As it stands right now, the only people you are claiming are affected are people using cloud setups, which most likely have already done the manual work to support PyTorch 2.0.0. There is no reason for PyTorch to be upgraded to 2.0.0 when it is very clearly NOT stable. It is not worth risking adding even more bugs to the code base as it currently stands.
very clearly NOT stable
that is a very strong statement. can you substantiate this? all errors i've seen so far have been related to torch.compile - and yes, that feature is pretty much broken.
on the other hand, there are hundreds of users using torch 2.0 with webui without issues.
that is a very strong statement. can you substantiate this?
To name a few: https://github.com/pytorch/pytorch/issues/97031 https://github.com/pytorch/pytorch/issues/97041 https://github.com/pytorch/pytorch/issues/97226 https://github.com/pytorch/pytorch/issues/97576 https://github.com/pytorch/pytorch/issues/97021
And not only that, I disagree with moving to 2.0.0 on principle as .0.0 software is generally never stable. Waiting for 2.0.1 has no downsides whereas 2.0.0 is an unstable mess that they are still trying to get stable. The last thing this repo needs is more instability which causes more issues to flood in.
that is a very strong statement. can you substantiate this?
To name a few: pytorch/pytorch#97031 pytorch/pytorch#97041 pytorch/pytorch#97226 pytorch/pytorch#97576 pytorch/pytorch#97021
Bugs relevant to WebUI are what matters - why list random things? this is going in the wrong direction.
- For example, I don't think anyone is trying to run it on a RaspberryPi, so let's stay on topic?
- And WebUI uses venv, not conda, so conda on osx-64 also doesn't really apply.
- Or does torchvision have debug symbols or not? I'd consider that a cosmetic issue at best.
- Yes, the libnvrtc packaging bug is relevant, but there is no PR associated with it, so should we wait indefinitely?
And not only that, I disagree with moving to 2.0.0 on principle as .0.0 software is generally never stable. Waiting for 2.0.1 or even 2.0.2 has no downsides whereas 2.0.0 is an unstable mess that they are still trying to get stable. The last thing this repo needs is more instability which causes more issues to flood in.
That is a question of personal preference and risk vs reward. The issue is that Torch 1.13 wheels are getting obsoleted in many packages and/or environments, causing failures to install. So what's the solution? Ignore current issues until some unknown time in the future?
The PR stands as-is, and I've been using Torch 2.0 on my branch for a while now (and users on my branch are not reporting issues relevant to WebUI). We can agree to disagree here.
Convolutions being broken for CUDA 11.8 builds specifically affects users who use PyTorch 2.0 and --opt-channelslast. It basically negates any possible performance benefits from that option.
Recently PyTorch changed its install command; it now uses --index-url instead of --extra-index-url, as mentioned in #9483
Also, i noticed your code doesn't cover AMD cards (in that case TORCH_COMMAND is set in webui.sh). But it's fine, i had some problems on my 5700XT on pytorch 2 (see #8139) and already covered that part in #9404; i feel it's better to stay on 1.13.1 a while longer, at least on AMD.
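for reference, the change to pytorch's published install command looks roughly like this (cu118 shown; the rocm variants below follow the same pattern):

```bash
# old style: pypi as the primary source, pytorch's index as an extra
pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/cu118

# new style recommended by pytorch: use its index as the primary source
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
```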
Been having a lot of issues trying to get things working with my 5700XT. The new torch 2.0 version failed to generate any images.
Using the latest rocm as below failed to generate images or any console output:
export TORCH_COMMAND="pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.4.2"
Using the torch command from the pr works:
export TORCH_COMMAND="pip install torch==1.13.1+rocm5.2 torchvision==0.14.1+rocm5.2 --index-url https://download.pytorch.org/whl/rocm5.2"
At least it is working on Commit hash: a9fed7c364061ae6efb37f797b6b522cb3cf7aa2
rocm 5.6.0 alpha is out and it brings torch 2.0 compatibility, i'd be curious if that works.
5.6.0? maybe that will support my 7900xtx
rocm 5.6.0 alpha is out and it brings torch 2.0 compatibility, i'd be curious if that works.
Really? Where? I see 5.4.3 as the last release on https://github.com/RadeonOpenCompute/ROCm/releases
rocm 5.6.0 alpha is out and it brings torch 2.0 compatibility, i'd be curious if that works.
Really? Where? I see 5.4.3 as the last release on https://github.com/RadeonOpenCompute/ROCm/releases
https://rocmdocs.amd.com/projects/alpha/en/develop/deploy/install.html
rocm 5.6.0 alpha is out and it brings torch 2.0 compatibility, i'd be curious if that works.
Really? Where? I see 5.4.3 as the last release on https://github.com/RadeonOpenCompute/ROCm/releases
https://rocmdocs.amd.com/projects/alpha/en/develop/deploy/install.html
uhm... it doesn't seem to be publicly available
5.6.0? maybe that will support my 7900xtx
There was indeed a docker image for rocm 5.6.0 with 7900xtx support around, but it's now offline, so i guess that code was intended for internal testing and not supposed to be released yet. Anyway, there was a discussion here: #9591
I'm not sure if that works on other gpus like the 5700xt too, but i wouldn't be surprised if pytorch 2.0 starts to work when the next rocm version is released.
I guess for 5700xt users the better choice is sticking to the old 1.13.1 version and waiting for an official rocm release
I'm pretty sure --extra-index-url https://download.pytorch.org/whl/cu118 for OSX is wrong, but I don't have a mac to try it on.
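for what it's worth, macOS shouldn't touch the cuda index at all; a sketch of the usual split (assuming the plain pypi wheels are what macOS needs for its cpu/mps backend):

```bash
# linux/windows with an nvidia gpu: cu118 wheels from pytorch's index
pip install torch==2.0.0 torchvision --index-url https://download.pytorch.org/whl/cu118

# macOS: plain pypi wheels (cpu/mps), no cuda index needed
pip install torch==2.0.0 torchvision
```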