stable-diffusion-webui
[Bug]: Expected all tensors to be on the same device
Is there an existing issue for this?
- [X] I have searched the existing issues and checked the recent builds/commits
What happened?
When generating an image for the first time, I get a CUDA out-of-memory error. After that, I try to lower my image settings and generate again, but I get a different error instead:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument index in method wrapper__index_select)
(The full traceback is in the console logs below.)
I already tried to search for this issue, since it has been around for quite a long time, but since I'm not someone who knows how coding works, I don't really know what to do.
My webui-user.bat does indeed have --medvram, which may contribute to this problem, but I can't use --lowvram since I want to try for better images, and I can't remove --medvram since my hardware isn't enough to run without it.
Steps to reproduce the problem
- Open your webui-user.bat and edit it.
- In COMMANDLINE_ARGS, put "--xformers --precision full --no-half --medvram --always-batch-cond-uncond --opt-split-attention --opt-sub-quad-attention".
- Run the .bat.
- Trigger a CUDA out-of-memory error.
- Generate another image and you should get the error.
What should have happened?
Well, it should have just been able to generate the image, or told me that it was again out of memory. Instead, I got the other error.
Commit where the problem happens
I don't know where the problem happens.
What platforms do you use to access the UI ?
Windows
What browsers do you use to access the UI ?
Google Chrome
Command Line Arguments
COMMANDLINE_ARGS = --xformers --precision full --no-half --medvram --always-batch-cond-uncond --opt-split-attention --opt-sub-quad-attention
List of extensions
LDSR Lora ScuNET SwinIR prompt-bracket-checker
Console logs
Traceback (most recent call last):
File "D:\Stable Diffusion 2\stable-diffusion-webui\modules\call_queue.py", line 56, in f
res = list(func(*args, **kwargs))
File "D:\Stable Diffusion 2\stable-diffusion-webui\modules\call_queue.py", line 37, in f
res = func(*args, **kwargs)
File "D:\Stable Diffusion 2\stable-diffusion-webui\modules\txt2img.py", line 56, in txt2img
processed = process_images(p)
File "D:\Stable Diffusion 2\stable-diffusion-webui\modules\processing.py", line 486, in process_images
res = process_images_inner(p)
File "D:\Stable Diffusion 2\stable-diffusion-webui\modules\processing.py", line 636, in process_images_inner
samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=seeds, subseeds=subseeds, subseed_strength=p.subseed_strength, prompts=prompts)
File "D:\Stable Diffusion 2\stable-diffusion-webui\modules\processing.py", line 836, in sample
samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
File "D:\Stable Diffusion 2\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 351, in sample
samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
File "D:\Stable Diffusion 2\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 227, in launch_sampling
return func()
File "D:\Stable Diffusion 2\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 351, in <lambda>
samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
File "D:\Stable Diffusion 2\stable-diffusion-webui\venv\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "D:\Stable Diffusion 2\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\sampling.py", line 594, in sample_dpmpp_2m
denoised = model(x, sigmas[i] * s_in, **extra_args)
File "D:\Stable Diffusion 2\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "D:\Stable Diffusion 2\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 138, in forward
x_out[a:b] = self.inner_model(x_in[a:b], sigma_in[a:b], cond={"c_crossattn": c_crossattn, "c_concat": [image_cond_in[a:b]]})
File "D:\Stable Diffusion 2\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "D:\Stable Diffusion 2\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\external.py", line 112, in forward
eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
File "D:\Stable Diffusion 2\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\external.py", line 138, in get_eps
return self.inner_model.apply_model(*args, **kwargs)
File "D:\Stable Diffusion 2\stable-diffusion-webui\modules\sd_hijack_utils.py", line 17, in <lambda>
setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
File "D:\Stable Diffusion 2\stable-diffusion-webui\modules\sd_hijack_utils.py", line 28, in __call__
return self.__orig_func(*args, **kwargs)
File "D:\Stable Diffusion 2\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 858, in apply_model
x_recon = self.model(x_noisy, t, **cond)
File "D:\Stable Diffusion 2\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1201, in _call_impl
result = hook(self, input)
File "D:\Stable Diffusion 2\stable-diffusion-webui\modules\lowvram.py", line 35, in send_me_to_gpu
module.to(devices.device)
File "D:\Stable Diffusion 2\stable-diffusion-webui\venv\lib\site-packages\pytorch_lightning\core\mixins\device_dtype_mixin.py", line 113, in to
return super().to(*args, **kwargs)
File "D:\Stable Diffusion 2\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 989, in to return self._apply(convert)
File "D:\Stable Diffusion 2\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 641, in _apply
module._apply(fn)
File "D:\Stable Diffusion 2\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 641, in _apply
module._apply(fn)
File "D:\Stable Diffusion 2\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 641, in _apply
module._apply(fn)
[Previous line repeated 2 more times]
File "D:\Stable Diffusion 2\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 664, in _apply
param_applied = fn(param)
File "D:\Stable Diffusion 2\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 987, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 14.00 MiB (GPU 0; 2.00 GiB total capacity; 1.66 GiB already allocated; 0 bytes free; 1.71 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
------------------------------------------------------------------------------------------------------
Traceback (most recent call last):
File "D:\Stable Diffusion 2\stable-diffusion-webui\modules\call_queue.py", line 56, in f
res = list(func(*args, **kwargs))
File "D:\Stable Diffusion 2\stable-diffusion-webui\modules\call_queue.py", line 37, in f
res = func(*args, **kwargs)
File "D:\Stable Diffusion 2\stable-diffusion-webui\modules\txt2img.py", line 56, in txt2img
processed = process_images(p)
File "D:\Stable Diffusion 2\stable-diffusion-webui\modules\processing.py", line 486, in process_images
res = process_images_inner(p)
File "D:\Stable Diffusion 2\stable-diffusion-webui\modules\processing.py", line 625, in process_images_inner
uc = get_conds_with_caching(prompt_parser.get_learned_conditioning, negative_prompts, p.steps, cached_uc)
File "D:\Stable Diffusion 2\stable-diffusion-webui\modules\processing.py", line 570, in get_conds_with_caching
cache[1] = function(shared.sd_model, required_prompts, steps)
File "D:\Stable Diffusion 2\stable-diffusion-webui\modules\prompt_parser.py", line 140, in get_learned_conditioning
conds = model.get_learned_conditioning(texts)
File "D:\Stable Diffusion 2\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 669, in get_learned_conditioning
c = self.cond_stage_model(c)
File "D:\Stable Diffusion 2\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "D:\Stable Diffusion 2\stable-diffusion-webui\modules\sd_hijack_clip.py", line 229, in forward
z = self.process_tokens(tokens, multipliers)
File "D:\Stable Diffusion 2\stable-diffusion-webui\modules\sd_hijack_clip.py", line 254, in process_tokens
z = self.encode_with_transformers(tokens)
File "D:\Stable Diffusion 2\stable-diffusion-webui\modules\sd_hijack_clip.py", line 302, in encode_with_transformers
outputs = self.wrapped.transformer(input_ids=tokens, output_hidden_states=-opts.CLIP_stop_at_last_layers)
File "D:\Stable Diffusion 2\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1212, in _call_impl
result = forward_call(*input, **kwargs)
File "D:\Stable Diffusion 2\stable-diffusion-webui\venv\lib\site-packages\transformers\models\clip\modeling_clip.py", line 811, in forward
return self.text_model(
File "D:\Stable Diffusion 2\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "D:\Stable Diffusion 2\stable-diffusion-webui\venv\lib\site-packages\transformers\models\clip\modeling_clip.py", line 708, in forward
hidden_states = self.embeddings(input_ids=input_ids, position_ids=position_ids)
File "D:\Stable Diffusion 2\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "D:\Stable Diffusion 2\stable-diffusion-webui\venv\lib\site-packages\transformers\models\clip\modeling_clip.py", line 223, in forward
inputs_embeds = self.token_embedding(input_ids)
File "D:\Stable Diffusion 2\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "D:\Stable Diffusion 2\stable-diffusion-webui\modules\sd_hijack.py", line 234, in forward
inputs_embeds = self.wrapped(input_ids)
File "D:\Stable Diffusion 2\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "D:\Stable Diffusion 2\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\sparse.py", line 160, in forward
return F.embedding(
File "D:\Stable Diffusion 2\stable-diffusion-webui\venv\lib\site-packages\torch\nn\functional.py", line 2210, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument index in method wrapper__index_select)
Additional information
It is possible, I think, that it loses track of the device I'm using after the second generation because of the first error, but I'm not really sure about that.
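For what it's worth, the error itself just means that one tensor is still on the CPU while another is on the GPU. A minimal, standalone sketch (not webui code, purely illustrative) that reproduces the same class of error:

import torch
import torch.nn as nn

# The embedding weights are created on the CPU; the token ids are moved to cuda:0.
# Calling the layer then raises the same "Expected all tensors to be on the same device" error.
embedding = nn.Embedding(49408, 768)        # parameters stay on the CPU
token_ids = torch.tensor([[101, 202, 303]])
if torch.cuda.is_available():
    token_ids = token_ids.to("cuda:0")      # inputs moved to the GPU, weights were not
    embedding(token_ids)                    # RuntimeError: Expected all tensors to be on the same device ...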
Try this: set CUDA_VISIBLE_DEVICES=0 (the environment variable), not --device-id.
Where should I set CUDA_VISIBLE_DEVICES=0?
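In case it helps: CUDA_VISIBLE_DEVICES has to be set in the environment of the process before PyTorch initializes CUDA, for example with "set CUDA_VISIBLE_DEVICES=0" in webui-user.bat on Windows. A minimal sketch of the same idea from Python, assuming it is done at the very top of the launch script before torch is imported:

import os

# Must run before anything imports torch / initializes CUDA, otherwise it is ignored.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"   # expose only the first GPU

import torch
print(torch.cuda.device_count())           # should now report 1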
I had this error immediately after downloading the openpose editor extension and fixing my controlnet config issues.
Once I ran the webui with the command-line arg "--lowvram", I was able to run ControlNet with OpenPose with "Low VRAM" checked just fine (albeit extremely slowly), after running both ControlNet and A1111 in lowvram mode. In my case this was a GPU memory issue.
Same issue here. I have two GPUs running two instances, but once it happens, it shows the same error messages.
I'm not sure, but it seems this happens when a LoRA model made for SDXL is mixed with non-SDXL models, or when diffusers for SDXL are used with non-SDXL ones and vice versa.
I have this issue too. I've tried three communities and nobody is really providing any help to fix it.
I have this same problem. In the vladmandic fork they solved it by adding the "--cuda" flag, so I used their project for a while, but I came back to automatic1111 to see how things were going and this problem persisted. So I chatted with GPT, and the solution that worked for me was adding this to webui.py and also stable-diffusion-webui/modules/call_queue.py:
import torch
# use the second GPU ('cuda:1') when CUDA is available, otherwise fall back to the CPU
device = torch.device('cuda:1' if torch.cuda.is_available() else 'cpu')
My problem is that I want to use my 2nd GPU dedicated to SDXL, so I run:
CUDA_VISIBLE_DEVICES=1 ./webui.sh --no-half --no-half-vae
But that throws the error about two tensor devices, because my CPU has a GPU built in (I think that's the issue, at least). Anyway, this got the program loaded and working for me, as the CUDA_VISIBLE_DEVICES environment variable by itself wasn't solving the problem, since it ignores the existence of the CPU / the CPU's integrated GPU.
edit: I've seen a few random errors kicking back the same tensors error, so I just throw those two lines into that .py file and it seems to work. I'm sure there is a more universal place to put this information, but my ignorance is infinite.
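As a quick sanity check (a generic sketch, nothing webui-specific), you can list which CUDA devices PyTorch actually sees once CUDA_VISIBLE_DEVICES is applied; an integrated GPU that is not a CUDA device will not show up here:

import torch

# Run inside the webui's venv to see what PyTorch can use.
print("CUDA available:", torch.cuda.is_available())
print("Visible CUDA devices:", torch.cuda.device_count())
for i in range(torch.cuda.device_count()):
    print(f"  cuda:{i} ->", torch.cuda.get_device_name(i))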
Try this: in modules/initialize.py, line 154, join the model-loading thread so nothing runs before the model has finished loading. Change it
from:
Thread(target=load_model).start()
to:
m_thread = Thread(target=load_model)
m_thread.start()
m_thread.join()