diffusers
Tensor size mismatch for non-power-of-2 image sizes with SD3ControlNetModel
Describe the bug
There seems to be an issue with certain non-power-of-2 ControlNet guidance image sizes when using SD3ControlNetModel: the pipeline fails with a tensor size mismatch in the scheduler step.
Reproduction
import diffusers
import PIL.Image
import os
import torch
os.environ['HF_TOKEN'] = 'your token'
cn = diffusers.SD3ControlNetModel.from_pretrained('InstantX/SD3-Controlnet-Canny')
pipe = diffusers.StableDiffusion3ControlNetPipeline.from_pretrained(
    'stabilityai/stable-diffusion-3-medium-diffusers',
    controlnet=cn)
pipe.enable_sequential_cpu_offload()
# both dimensions are multiples of 8, but neither is a power of 2
output_size = (1376, 920)
not_pow_2 = PIL.Image.new('RGB', output_size)
args = {
    'guidance_scale': 8.0,
    'num_inference_steps': 30,
    'width': output_size[0],
    'height': output_size[1],
    'control_image': not_pow_2,
    'prompt': 'test prompt'
}
pipe(**args)
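A possible way to narrow this down is to rerun the reproduction with both dimensions rounded down to a multiple of 16 and see whether the error goes away. The sketch below is only that: `snap_down` is a hypothetical helper (not part of diffusers), and the choice of 16 is an assumption (a VAE downscale of 8 times a transformer patch size of 2), not something confirmed by the maintainers.

```python
def snap_down(value: int, multiple: int = 16) -> int:
    """Round value down to the nearest multiple.

    Hypothetical helper for illustration; not a diffusers API.
    The default of 16 is an assumption (8x VAE downscale * patch size 2).
    """
    if value < multiple:
        raise ValueError(f"{value} is smaller than the alignment {multiple}")
    return (value // multiple) * multiple


output_size = (1376, 920)
snapped_size = tuple(snap_down(v) for v in output_size)
print(snapped_size)  # (1376, 912)
```

Passing `snapped_size` as `width`/`height` (and for the control image) changes the output resolution, so this is a diagnostic step rather than a fix.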
Logs
REDACT\venv\Lib\site-packages\diffusers\models\attention_processor.py:1584: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:455.)
hidden_states = F.scaled_dot_product_attention(
0%| | 0/30 [00:49<?, ?it/s]
Traceback (most recent call last):
File "REDACT\test.py", line 37, in <module>
pipe(**args)
File "REDACT\venv\Lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "REDACT\venv\Lib\site-packages\diffusers\pipelines\controlnet_sd3\pipeline_stable_diffusion_3_controlnet.py", line 1020, in __call__
latents = self.scheduler.step(noise_pred, t, latents, return_dict=False)[0]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "REDACT\venv\Lib\site-packages\diffusers\schedulers\scheduling_flow_match_euler_discrete.py", line 268, in step
denoised = sample - model_output * sigma
~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
RuntimeError: The size of tensor a (115) must match the size of tensor b (114) at non-singleton dimension 2
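The 115 and 114 in the error can be reproduced with some integer arithmetic, which suggests the mismatch may come from the height not being divisible by 16 rather than from it not being a power of 2. This is a sketch under two assumptions that are not stated in the logs: that the VAE downsamples by a factor of 8 and that the transformer patchifies latents with a patch size of 2. The function and constant names are made up for illustration.

```python
VAE_SCALE_FACTOR = 8  # assumption: pixels -> latents downscale
PATCH_SIZE = 2        # assumption: transformer patch size


def latent_and_roundtrip(pixels: int) -> tuple[int, int]:
    """Return (latent size, size after patchify + unpatchify) for one dimension."""
    latent = pixels // VAE_SCALE_FACTOR
    patches = latent // PATCH_SIZE  # an odd latent size loses one row/column here
    return latent, patches * PATCH_SIZE


for name, pixels in (("width", 1376), ("height", 920)):
    latent, restored = latent_and_roundtrip(pixels)
    print(f"{name}: {pixels}px -> latent {latent} -> model output {restored}")
# width: 1376px -> latent 172 -> model output 172
# height: 920px -> latent 115 -> model output 114
```

Under these assumptions only the height disagrees, which is consistent with the error being reported at dimension 2 of a (batch, channels, height, width) tensor: `sample` has 115 rows while `model_output` has 114.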
System Info
Platform: Windows
Python: 3.12.3
diffusers: 0.29.1
transformers: 4.41.2
accelerate: 0.31.0
Who can help?
@sayakpaul @yiyixuxu @DN6