sd-webui-controlnet

[Bug]: RuntimeError when using clip_vision with t2iadapter_sketch

SoundGuy opened this issue 2 years ago · 1 comment

Is there an existing issue for this?

  • [X] I have searched the existing issues and checked the recent builds/commits of both this extension and the webui

What happened?

Using the clip_vision preprocessor with the t2iadapter_sketch model returns an error: RuntimeError: Tensors must have same number of dimensions: got 4 and 3 (a second attempt fails with a pixel_unshuffle error; see the console logs below).

Steps to reproduce the problem

  1. Go to txt2img
  2. Add a ControlNet unit with the clip_vision preprocessor and the t2iadapter_sketch model
  3. Generate

The generation fails with an error.

What should have happened?

It should generate an image.

Commit where the problem happens

webui: 3c922d983bf60ba187b5422b3690e6b7fb07777e
controlnet: 4c13542c (Sun Mar 12 11:14:08 2023)

What browsers do you use to access the UI ?

No response

Command Line Arguments

--xformers

Console logs

Loading model from cache: t2iadapter_style_sd14v1 [202e85cc]
Loading preprocessor: none
  0%|                                                                                                                                   | 0/20 [00:00<?, ?it/s]
Error completing request
Arguments: ('task(4cyxqkvw6ey6fmn)', 'Redhead, beautiful, woman, spy, agent, modern disney style', 'background, out of frame, duplicate, watermark, signature, text, ugly, morbid, mutated, deformed, blurry, bad anatomy, bad proportions, cloned face, disfigured, fused fingers, fused limbs, too many fingers, long neck, ', [], 20, 0, True, False, 1, 1, 7, 1385296670.0, -1.0, 0, 0, 0, False, 512, 512, False, 0.7, 2, 'Latent', 0, 0, 0, [], 0, False, True, False, 0, -1, False, False, 1024, 1024, True, 64, 64, 32, 1, 'None', 2, False, False, False, True, True, 0, 960, 64, False, '', 0, False, True, True, 'canny', 'control_canny-fp16 [e3fe7712]', 1, {'image': array([...], dtype=uint8), 'mask': array([...], dtype=uint8)}, False, 'Scale to Fit (Inner Fit)', False, False, 512, 100, 200, 0, 1, False, True, 'none', 't2iadapter_style_sd14v1 [202e85cc]', 1, {'image': array([...], dtype=uint8), 'mask': array([...], dtype=uint8)}, False, 'Envelope (Outer Fit)', False, True, 64, 64, 64, 0, 1, False, False, False, 'positive', 'comma', 0, False, False, '', 1, '', 0, '', 0, '', True, False, False, False, 0, '', 5, 24, 12.5, 1000, '', 'DDIM', 0, 64, 64, '', 64, 7.5, 0.42, 'DDIM', 64, 64, 1, 0, 92, True, True, True, False, False, False, 'midas_v21_small', None, None, 50, 0, 0, 512, 512, False, False, True, True, True, False, False, 1, False, False, 2.5, 4, 0, False, 0, 1, False, False, 'u2net', False, False, False, False) {}
Traceback (most recent call last):
  File "G:\stable-diffusion-webui\modules\call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "G:\stable-diffusion-webui\modules\call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "G:\stable-diffusion-webui\modules\txt2img.py", line 56, in txt2img
    processed = process_images(p)
  File "G:\stable-diffusion-webui\modules\processing.py", line 486, in process_images
    res = process_images_inner(p)
  File "G:\stable-diffusion-webui\modules\processing.py", line 635, in process_images_inner
    samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=seeds, subseeds=subseeds, subseed_strength=p.subseed_strength, prompts=prompts)
  File "G:\stable-diffusion-webui\modules\processing.py", line 835, in sample
    samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
  File "G:\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 351, in sample
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "G:\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 227, in launch_sampling
    return func()
  File "G:\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 351, in <lambda>
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "g:\stable-diffusion-webui\venv\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "G:\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\sampling.py", line 145, in sample_euler_ancestral
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "g:\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "G:\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 119, in forward
    x_out = self.inner_model(x_in, sigma_in, cond={"c_crossattn": [cond_in], "c_concat": [image_cond_in]})
  File "g:\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "G:\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\external.py", line 112, in forward
    eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
  File "G:\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\external.py", line 138, in get_eps
    return self.inner_model.apply_model(*args, **kwargs)
  File "G:\stable-diffusion-webui\modules\sd_hijack_utils.py", line 17, in <lambda>
    setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
  File "G:\stable-diffusion-webui\modules\sd_hijack_utils.py", line 28, in __call__
    return self.__orig_func(*args, **kwargs)
  File "G:\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 858, in apply_model
    x_recon = self.model(x_noisy, t, **cond)
  File "g:\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "G:\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 1329, in forward
    out = self.diffusion_model(x, t, context=cc)
  File "g:\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "G:\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\hook.py", line 233, in forward2
    return forward(*args, **kwargs)
  File "G:\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\hook.py", line 130, in forward
    control = param.control_model(x=x, hint=param.hint_cond, timesteps=timesteps, context=context)
  File "g:\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "G:\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\adapter.py", line 105, in forward
    self.control = self.control_model(hint_in)
  File "g:\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "G:\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\adapter.py", line 325, in forward
    x = torch.cat([x, style_embedding], dim=1)
RuntimeError: Tensors must have same number of dimensions: got 4 and 3
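
For context, this first failure can be reproduced in isolation: the style adapter concatenates a style embedding onto its input along dim=1, and torch.cat only accepts tensors with the same number of dimensions. A minimal sketch in plain PyTorch, with illustrative shapes rather than the extension's actual tensors:

import torch

# Minimal sketch: torch.cat(dim=1) requires all tensors to have the same
# number of dimensions. The shapes below are illustrative assumptions,
# not the adapter's real ones.
x = torch.randn(1, 4, 64, 64)             # 4-D image-like input
style_embedding = torch.randn(1, 8, 768)  # 3-D CLIP-style embedding

torch.cat([x, style_embedding], dim=1)
# RuntimeError: Tensors must have same number of dimensions: got 4 and 3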



Another attempt, this time with the clip_vision preprocessor selected:

Loading preprocessor: clip_vision
  0%|                                                                                                                                   | 0/20 [00:00<?, ?it/s]
Error completing request
Arguments: ('task(vdjhn63rxvca669)', 'Redhead, beautiful, woman, spy, agent, modern disney style', 'background, out of frame, duplicate, watermark, signature, text, ugly, morbid, mutated, deformed, blurry, bad anatomy, bad proportions, cloned face, disfigured, fused fingers, fused limbs, too many fingers, long neck,', [], 20, 0, True, False, 1, 1, 7, 1385296670.0, -1.0, 0, 0, 0, False, 512, 512, False, 0.7, 2, 'Latent', 0, 0, 0, [], 0, False, True, False, 0, -1, False, False, 1024, 1024, True, 64, 64, 32, 1, 'None', 2, False, False, False, True, True, 0, 960, 64, False, '', 0, False, True, True, 'clip_vision', 't2iadapter_sketch-fp16 [75b15924]', 1, {'image': array([...], dtype=uint8), 'mask': array([...], dtype=uint8)}, False, 'Scale to Fit (Inner Fit)', False, False, 512, 64, 64, 0, 1, False, False, 'none', 'None', 1, None, False, 'Scale to Fit (Inner Fit)', False, False, 64, 64, 64, 0, 1, False, False, False, 'positive', 'comma', 0, False, False, '', 1, '', 0, '', 0, '', True, False, False, False, 0, '', 5, 24, 12.5, 1000, '', 'DDIM', 0, 64, 64, '', 64, 7.5, 0.42, 'DDIM', 64, 64, 1, 0, 92, True, True, True, False, False, False, 'midas_v21_small', None, None, 50, 0, 0, 512, 512, False, False, True, True, True, False, False, 1, False, False, 2.5, 4, 0, False, 0, 1, False, False, 'u2net', False, False, False, False) {}
Traceback (most recent call last):
  File "G:\stable-diffusion-webui\modules\call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "G:\stable-diffusion-webui\modules\call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "G:\stable-diffusion-webui\modules\txt2img.py", line 56, in txt2img
    processed = process_images(p)
  File "G:\stable-diffusion-webui\modules\processing.py", line 486, in process_images
    res = process_images_inner(p)
  File "G:\stable-diffusion-webui\modules\processing.py", line 635, in process_images_inner
    samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=seeds, subseeds=subseeds, subseed_strength=p.subseed_strength, prompts=prompts)
  File "G:\stable-diffusion-webui\modules\processing.py", line 835, in sample
    samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
  File "G:\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 351, in sample
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "G:\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 227, in launch_sampling
    return func()
  File "G:\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 351, in <lambda>
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "g:\stable-diffusion-webui\venv\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "G:\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\sampling.py", line 145, in sample_euler_ancestral
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "g:\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "G:\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 119, in forward
    x_out = self.inner_model(x_in, sigma_in, cond={"c_crossattn": [cond_in], "c_concat": [image_cond_in]})
  File "g:\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "G:\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\external.py", line 112, in forward
    eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
  File "G:\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\external.py", line 138, in get_eps
    return self.inner_model.apply_model(*args, **kwargs)
  File "G:\stable-diffusion-webui\modules\sd_hijack_utils.py", line 17, in <lambda>
    setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
  File "G:\stable-diffusion-webui\modules\sd_hijack_utils.py", line 28, in __call__
    return self.__orig_func(*args, **kwargs)
  File "G:\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 858, in apply_model
    x_recon = self.model(x_noisy, t, **cond)
  File "g:\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "G:\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 1329, in forward
    out = self.diffusion_model(x, t, context=cc)
  File "g:\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "G:\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\hook.py", line 233, in forward2
    return forward(*args, **kwargs)
  File "G:\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\hook.py", line 176, in forward
    control = param.control_model(x=x_in, hint=param.hint_cond, timesteps=timesteps, context=context)
  File "g:\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "G:\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\adapter.py", line 105, in forward
    self.control = self.control_model(hint_in)
  File "g:\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "G:\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\adapter.py", line 257, in forward
    x = self.unshuffle(x)
  File "g:\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "g:\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\pixelshuffle.py", line 104, in forward
    return F.pixel_unshuffle(input, self.downscale_factor)
RuntimeError: pixel_unshuffle expects height to be divisible by downscale_factor, but input.size(-2)=1 is not divisible by 8
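
This second failure points at the root cause: the clip_vision preprocessor outputs a CLIP embedding rather than an image map, so the hint arriving at the sketch adapter has a height of 1, while the adapter begins with a PixelUnshuffle(8) that needs height and width divisible by 8. A minimal sketch, with the embedding shape assumed for illustration:

import torch
import torch.nn as nn

# The sketch adapter's first op; it needs H and W divisible by 8.
unshuffle = nn.PixelUnshuffle(downscale_factor=8)

image_hint = torch.randn(1, 3, 512, 512)    # image-shaped hint: 512 % 8 == 0
print(unshuffle(image_hint).shape)          # torch.Size([1, 192, 64, 64])

embedding_hint = torch.randn(1, 1, 1, 768)  # embedding-shaped hint (assumed), height 1
unshuffle(embedding_hint)
# RuntimeError: pixel_unshuffle expects height to be divisible by downscale_factor,
# but input.size(-2)=1 is not divisible by 8

This is consistent with the maintainer's reply below: clip_vision is meant to pair with the style adapter, while the sketch adapter expects an image-shaped hint from the none or sketch preprocessor.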

Additional information

No response

SoundGuy · Mar 12, 2023

Use the none or sketch preprocessor for the sketch model, not clip_vision.

Mikubill · Mar 12, 2023