stable-diffusion-webui
[BUG] Optimized mode is broken
Describe the bug Optimized mode is not working. I'm getting an error during generation after downloading the latest version from the master branch.
To Reproduce Steps to reproduce the behavior:
- Add "--optimized" flag in relauncher.py
- Start webui
- Specify prompt and click "Generate"
- See error in logs
0%| | 0/50 [00:00<?, ?it/s]
!!Runtime error (txt2img)!!
Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
exiting...calling os._exit(0)
Relauncher: Process is ending. Relaunching in 0.5s...
Desktop:
- OS: Windows 10
- Browser: Chrome
- GPU: RTX 2060 Super
- RAM: 16 GB
- CPU: Intel i5-9400F
It's happening after commit fe17340
I believe with optimized mode the model isn't getting transferred to the GPU at all now, hence the multiple devices error
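The diagnosis above can be sketched in a few lines. This is not the actual webui.py code, just a hypothetical stand-in (names like `load_model` and `DummyModel` are assumptions) showing how skipping `model.cuda()` in optimized mode leaves the weights on the CPU while inputs are created on cuda:0, producing the OP's error:

```python
# Hypothetical sketch of the control flow around fe17340 (simplified,
# names assumed; not the real webui.py). After the commit, optimized
# mode never moves the model to the GPU.

class DummyModel:
    """Stand-in for the SD model; just tracks which device it lives on."""
    def __init__(self):
        self.device = "cpu"

    def cuda(self):
        self.device = "cuda:0"
        return self


def load_model(optimized: bool) -> DummyModel:
    model = DummyModel()
    if not optimized:  # after fe17340: the optimized path skips .cuda()
        model.cuda()
    return model


def generate(model: DummyModel, input_device: str = "cuda:0") -> str:
    # Inputs are created on the GPU regardless, so a CPU-resident model
    # reproduces the "two devices" RuntimeError from the OP's log.
    if model.device != input_device:
        raise RuntimeError(
            "Expected all tensors to be on the same device, but found "
            f"at least two devices, {input_device} and {model.device}!"
        )
    return "image"
```

With `optimized=False` generation succeeds; with `optimized=True` the mismatch raises immediately, matching the traceback in the report.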
Thanks so much, I changed the code myself and now it works like a charm! Anyway, waiting for a fix by @hlky
@athu16 @DenkingOfficial This was added a couple of hours ago; it was supposed to be a fix for an OOM issue someone reported. Will revert it, sorry. I can only go off what optimized users are saying, since I don't have a 4GB card.
Works for me; I could not reproduce this bug. If model.cuda() is called, I get OOM and the script doesn't even launch.
@oobabooga are you using the optimised switch?
@athu16 yes
model.cuda() is also never called on the basujindal version: https://github.com/basujindal/stable-diffusion/blob/main/optimizedSD/optimized_txt2img.py
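For context, the basujindal optimizedSD fork avoids a whole-model `model.cuda()` by splitting the model into stages and moving each stage into VRAM only while it runs. The sketch below is a simplified, hypothetical illustration of that pattern (the `Stage` class and names are assumptions, not the real optimizedSD code):

```python
# Rough sketch of the on-demand stage-loading strategy used by
# optimizedSD (simplified; hypothetical names). Each stage is moved to
# the GPU only for its forward pass, then evicted to free VRAM.

class Stage:
    """Stand-in for one model stage (e.g. text encoder, UNet, VAE)."""
    def __init__(self, name: str):
        self.name = name
        self.device = "cpu"

    def to(self, device: str) -> "Stage":
        self.device = device
        return self

    def forward(self, x: str) -> str:
        assert self.device == "cuda:0", f"{self.name} must be on the GPU to run"
        return f"{self.name}({x})"


def run_pipeline(stages: list[Stage], x: str) -> str:
    for stage in stages:
        stage.to("cuda:0")   # load only this stage into VRAM
        x = stage.forward(x)
        stage.to("cpu")      # evict it before loading the next stage
    return x
```

The trade-off is extra host-to-device transfer time per stage, in exchange for never needing the whole model resident at once, which is what lets 4GB cards run at all.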
It's happening after commit fe17340
I believe with optimized mode the model isn't getting transferred to the GPU at all now, hence the multiple devices error
I tried reverting that change locally too, but I'm sharing @oobabooga's experience in that it doesn't even launch then.
Same error occurs per the OP when using either the optimized flag or optimized-turbo.
GTX 1070 8gb
I've taken another look at that section, and on slightly closer inspection that commit seems to make very little sense.
That "if" could never succeed.
(I haven't done much programming in a long time, so I hope I'm not being totally stupid.)
OK, so it's working for @DenkingOfficial and @athu16 with fe17340 reverted. @throwaway-mezzo-mix @AscendedGravity, have you updated since I reverted that commit?
@oobabooga It doesn't work for you without fe17340, i.e. with .cuda() never called, as @throwaway-mezzo-mix pointed out.
So, what card do you have?
could everyone state their graphics card?
could everyone state their graphics card?
GTX 960 4GB
GTX 1650 4GB
Linux Mint, running with --precision full --no-half
Same error occurs per the OP when using either the optimized flag or optimized-turbo.
GTX 1070 8gb
With 8GB you can fit the entire model into VRAM at once. Optimized version exists to allow 4GB GPUs to run SD.
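A rough back-of-the-envelope calculation shows why (the parameter count is approximate, and activations add several more GiB on top of the weights):

```python
# Hedged VRAM estimate: Stable Diffusion v1 has roughly 1.1 billion
# parameters across the UNet, text encoder, and VAE (approximate figure).
params = 1.1e9

fp32_gib = params * 4 / 1024**3  # 4 bytes per float32 weight -> GiB
fp16_gib = params * 2 / 1024**3  # 2 bytes per float16 weight -> GiB

print(f"fp32 weights: ~{fp32_gib:.1f} GiB")  # ~4.1 GiB
print(f"fp16 weights: ~{fp16_gib:.1f} GiB")  # ~2.0 GiB
```

Even at half precision, weights plus activations don't fit comfortably in 4 GiB, which is why the optimized path swaps stages in and out instead of keeping everything resident.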
Also @hlky, I've updated my webui.py now, and reverting seemed to work, at least for me.
RTX 2060 Mobile 6GB
@AscendedGravity have you updated since I reverted that commit?
Did a manual revert of https://github.com/hlky/stable-diffusion-webui/commit/fe173407fe08cf496ef1607bbce3100f21bf4b3e and yes, the optimized-turbo flag functions.
With 8GB you can fit the entire model into VRAM at once. Optimized version exists to allow 4GB GPUs to run SD.
While true, I can run larger image sizes and/or larger batch sizes with optimized.
If it works for everyone but me this issue can be closed and reopened later if someone else experiences the same issue. Maybe I'm doing something wrong.
Linux mint, running with
--precision full --no-half
@oobabooga Just to make sure, were you running with --optimized when testing this change? Because that's the only time it would affect anything. That might explain why your experience is different.
were you running with --optimized when testing this change
Yes
Yes
I'm out of ideas, then. Maybe someone else still has an idea?
Is it possible that some of you are running code from https://github.com/hlky/stable-diffusion and not https://github.com/hlky/stable-diffusion-webui ?
@oobabooga Possibly; I hadn't synced it to main, but I have now. Just have to wait and see what people say.
Just adding my experience here.
This repo works fine on Windows; no issues with any functionality when running optimised mode on a 1050 Ti 4GB.
On Arch, once again in optimised mode, I get an OOM error on startup. With the "if not opt.optimised: model.cuda()" block in webui.py, the application starts, but I get the "found at least two devices, cuda:0 and cpu!" error from the OP when I try to run any prompt.
Thought the OS factor might be relevant to someone more knowledgeable than myself.