stable-diffusion-webui
[Bug]: RuntimeError with Refiner Enabled and Batch Count > 1 in img2img (the refiner works when Batch Count = 1)
Checklist
- [x] The issue exists after disabling all extensions
- [x] The issue exists on a clean installation of webui
- [ ] The issue is caused by an extension, but I believe it is caused by a bug in the webui
- [x] The issue exists in the current version of the webui
- [x] The issue has not been reported before recently
- [ ] The issue has been reported before but has not been fixed yet
What happened?
When the "Refiner" is enabled in the img2img tab with a "Batch Count" greater than 1, generation crashes with a RuntimeError indicating a device mismatch between CPU and CUDA. I did a complete git clone and fresh setup to double-check that the issue also occurs on a clean installation.
Steps to reproduce the problem
- Launch the webui.
- Navigate to the img2img tab.
- Enable the "Refiner".
- Set "Batch Count" to a value greater than 1.
- Attempt to process more than one image (a scripted reproduction via the API is sketched below).
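If it helps triage, below is a minimal scripted reproduction sketch against the local webui API (a sketch only: it assumes the webui was launched with `--api` on the default `127.0.0.1:7860`, and `input.png` plus the refiner filename are placeholders based on my setup). The essential part is `n_iter` ("Batch count") greater than 1 while the refiner fields are set:

```python
import base64
import requests

# Placeholder input image; any RGB/RGBA image reproduces it for me.
with open("input.png", "rb") as f:
    init_image = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "init_images": [init_image],
    "prompt": "positive prompt",
    "negative_prompt": "negative prompts",
    "sampler_name": "DPM++ 3M SDE Karras",
    "steps": 75,
    "denoising_strength": 0.65,
    "refiner_checkpoint": "sd_xl_refiner_1.0.safetensors",  # refiner enabled
    "refiner_switch_at": 0.9,
    "batch_size": 1,  # "Batch size" left at 1
    "n_iter": 2,      # "Batch count" > 1 is what triggers the crash
}

r = requests.post("http://127.0.0.1:7860/sdapi/v1/img2img", json=payload)
r.raise_for_status()  # fails with the RuntimeError below once n_iter > 1
```

With `n_iter` set to 1 the same payload completes normally; moving the extra images into `batch_size` instead also avoids the crash, matching the workaround mentioned below.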
What should have happened?
The application should process multiple images in a batch with the refiner enabled without encountering a device mismatch error.
What browsers do you use to access the UI?
Mozilla Firefox, Google Chrome
Sysinfo
Console logs
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)
---
Full console log / stack-trace
Using already loaded custom-sdxl.safetensors [4d161dc67e]: done in 4.1s (send model to cpu: 2.1s, send model to device: 2.1s)
2024-01-11 18:57:10,637 - ControlNet - INFO - unit_separate = False, style_align = False
2024-01-11 18:57:10,801 - ControlNet - INFO - Loading model from cache: diffusers_xl_depth_full [2f51180b]
2024-01-11 18:57:10,804 - ControlNet - INFO - Loading preprocessor: depth
2024-01-11 18:57:10,805 - ControlNet - INFO - preprocessor resolution = 1604
2024-01-11 18:57:10,865 - ControlNet - INFO - ControlNet Hooked - Time = 0.23399829864501953
0%| | 0/49 [00:00<?, ?it/s]
Restoring base VAE
Applying attention optimization: xformers... done.
VAE weights loaded.
*** Error completing request
*** Arguments: ('task(2i8qc2q0e4mpuk0)', 0, 'positive prompt', 'negative prompts', [], <PIL.Image.Image image mode=RGBA size=1600x2000 at 0x27D89FC8A90>, None, None, None, None, None, None, 75, 'DPM++ 3M SDE Karras', 4, 0, 1, 2, 1, 9, 1.5, 0.65, 0, 2000, 1600, 1, 0, 0, 32, 0, '', '', '', ['VAE: sdxl-model.vae.safetensors'], False, [], '', <gradio.routes.Request object at 0x0000027D89B88670>, 0, True, 'sd_xl_refiner_1.0.safetensors [7440042bbd]', 0.9, -1, False, -1, 0, 0, 0, <scripts.animatediff_ui.AnimateDiffProcess object at 0x0000027D89B8BFD0>, UiControlNetUnit(enabled=True, module='depth_midas', model='diffusers_xl_depth_full [2f51180b]', weight=0.55, image={'image': array([...], dtype=uint8), 'mask': array([...], dtype=uint8)}, resize_mode='Crop and Resize', low_vram=False, processor_res=512, threshold_a=-1, threshold_b=-1, guidance_start=0.08, guidance_end=0.6, pixel_perfect=True, control_mode='Balanced', inpaint_crop_input_image=True, hr_option='Both', save_detected_map=True, advanced_weighting=None), UiControlNetUnit(enabled=False, module='none', model='None', weight=1, image=None, resize_mode='Crop and Resize', low_vram=False, processor_res=-1, threshold_a=-1, threshold_b=-1, guidance_start=0, guidance_end=1, pixel_perfect=False, control_mode='Balanced', inpaint_crop_input_image=False, hr_option='Both', save_detected_map=True, advanced_weighting=None), UiControlNetUnit(enabled=False, module='none', model='None', weight=1, image=None, resize_mode='Crop and Resize', low_vram=False, processor_res=-1, threshold_a=-1, threshold_b=-1, guidance_start=0, guidance_end=1, pixel_perfect=False, control_mode='Balanced', inpaint_crop_input_image=False, hr_option='Both', save_detected_map=True, advanced_weighting=None), '* `CFG Scale` should be 2 or lower.', True, True, '', '', True, 50, True, 1, 0, False, 4, 0.5, 'Linear', 'None', '<p style="margin-bottom:0.75em">Recommended settings: Sampling Steps: 80-100, Sampler: Euler a, Denoising strength: 0.8</p>', 128, 8, ['left', 'right', 'up', 'down'], 1, 0.05, 128, 4, 0, ['left', 'right', 'up', 'down'], False, False, 'positive', 'comma', 0, False, False, 'start', '', '<p style="margin-bottom:0.75em">Will upscale the image by the selected scale factor; use width and height sliders to set tile size</p>', 64, 0, 2, 1, '', [], 0, '', [], 0, '', [], True, False, False, False, 0, False, None, None, False, None, None, False, None, None, False, 50, '<p style="margin-bottom:0.75em">Will upscale the image depending on the selected target size type</p>', 512, 0, 8, 32, 64, 0.35, 32, 0, True, 0, False, 8, 0, 0, 2048, 2048, 2) {}
Traceback (most recent call last):
File "C:\automatic1111-sd-webui\modules\call_queue.py", line 57, in f
res = list(func(*args, **kwargs))
File "C:\automatic1111-sd-webui\modules\call_queue.py", line 36, in f
res = func(*args, **kwargs)
File "C:\automatic1111-sd-webui\modules\img2img.py", line 238, in img2img
processed = process_images(p)
File "C:\automatic1111-sd-webui\modules\processing.py", line 734, in process_images
res = process_images_inner(p)
File "C:\automatic1111-sd-webui\extensions\sd-webui-controlnet\scripts\batch_hijack.py", line 42, in processing_process_images_hijack
return getattr(processing, '__controlnet_original_process_images_inner')(p, *args, **kwargs)
File "C:\automatic1111-sd-webui\modules\processing.py", line 868, in process_images_inner
samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
File "C:\automatic1111-sd-webui\extensions\sd-webui-controlnet\scripts\hook.py", line 435, in process_sample
return process.sample_before_CN_hack(*args, **kwargs)
File "C:\automatic1111-sd-webui\modules\processing.py", line 1527, in sample
samples = self.sampler.sample_img2img(self, self.init_latent, x, conditioning, unconditional_conditioning, image_conditioning=self.image_conditioning)
File "C:\automatic1111-sd-webui\modules\sd_samplers_kdiffusion.py", line 188, in sample_img2img
samples = self.launch_sampling(t_enc + 1, lambda: self.func(self.model_wrap_cfg, xi, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
File "C:\automatic1111-sd-webui\modules\sd_samplers_common.py", line 261, in launch_sampling
return func()
File "C:\automatic1111-sd-webui\modules\sd_samplers_kdiffusion.py", line 188, in <lambda>
samples = self.launch_sampling(t_enc + 1, lambda: self.func(self.model_wrap_cfg, xi, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
File "C:\automatic1111-sd-webui\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\automatic1111-sd-webui\repositories\k-diffusion\k_diffusion\sampling.py", line 668, in sample_dpmpp_3m_sde
denoised = model(x, sigmas[i] * s_in, **extra_args)
File "C:\automatic1111-sd-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\automatic1111-sd-webui\modules\sd_samplers_cfg_denoiser.py", line 188, in forward
x_out[a:b] = self.inner_model(x_in[a:b], sigma_in[a:b], cond=make_condition_dict(c_crossattn, image_cond_in[a:b]))
File "C:\automatic1111-sd-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\automatic1111-sd-webui\repositories\k-diffusion\k_diffusion\external.py", line 112, in forward
eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
File "C:\automatic1111-sd-webui\repositories\k-diffusion\k_diffusion\external.py", line 138, in get_eps
return self.inner_model.apply_model(*args, **kwargs)
File "C:\automatic1111-sd-webui\modules\sd_models_xl.py", line 37, in apply_model
return self.model(x, t, cond)
File "C:\automatic1111-sd-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\automatic1111-sd-webui\modules\sd_hijack_utils.py", line 17, in <lambda>
setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
File "C:\automatic1111-sd-webui\modules\sd_hijack_utils.py", line 28, in __call__
return self.__orig_func(*args, **kwargs)
File "C:\automatic1111-sd-webui\repositories\generative-models\sgm\modules\diffusionmodules\wrappers.py", line 28, in forward
return self.diffusion_model(
File "C:\automatic1111-sd-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\automatic1111-sd-webui\modules\sd_unet.py", line 91, in UNetModel_forward
return original_forward(self, x, timesteps, context, *args, **kwargs)
File "C:\automatic1111-sd-webui\repositories\generative-models\sgm\modules\diffusionmodules\openaimodel.py", line 984, in forward
emb = self.time_embed(t_emb)
File "C:\automatic1111-sd-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\automatic1111-sd-webui\venv\lib\site-packages\torch\nn\modules\container.py", line 217, in forward
input = module(input)
File "C:\automatic1111-sd-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\automatic1111-sd-webui\extensions-builtin\Lora\networks.py", line 486, in network_Linear_forward
return originals.Linear_forward(self, input)
File "C:\automatic1111-sd-webui\venv\lib\site-packages\torch\nn\modules\linear.py", line 114, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)
---
Additional information
- GPU: NVIDIA GeForce RTX 3090 with 24 GB of VRAM.
- The checkpoint model is 6.46 GB; the checkpoint plus the VAE and the stock refiner easily fit in 24 GB.
- The issue seems related to model management during batch processing, potentially within the reuse_model_from_already_loaded function or related model-handling logic (see the sketch after this list).
- GPU drivers are up to date.
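For what it's worth, the traceback bottoms out in `F.linear` inside the UNet's `time_embed`, which is the generic PyTorch error raised whenever a module's weights sit on a different device than its input. A minimal illustration of that failure mode (not webui code, just the error class):

```python
import torch
import torch.nn as nn

linear = nn.Linear(4, 4)               # weights stay on the CPU by default
x = torch.randn(1, 4, device="cuda")   # input lives on the GPU

# Raises: RuntimeError: Expected all tensors to be on the same device,
# but found at least two devices, cpu and cuda:0!
linear(x)
```

That is consistent with the "send model to cpu / send model to device" line at the top of the log: it looks as if part of the base model is left on the CPU after the refiner swap when the next batch iteration starts sampling.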
Any chance we can fix this? It's a particularly annoying bug that only happens in img2img and prevents simple 4x4 batch runs and the like. It can be worked around with "Batch size", but batch size naturally has a practical upper limit.
I am having the same problem. After hours of trial and error, I narrowed it down to the use of the refiner as well, and I'm getting the same errors as OP. I have a 24 GB VRAM card, if that's any help.