diffusers how to use the controlnet sdxl tile model in diffusers

Describe the bug

I want to use this model to make my slightly blurry photos clear. I followed the code here, but because the model mentioned above is for SDXL rather than SD 1.5, I changed the code, and now it raises an error.

Reproduction

import torch
from PIL import Image
from diffusers import ControlNetModel, DiffusionPipeline, StableDiffusionXLControlNetPipeline


def resize_for_condition_image(input_image: Image, resolution: int):
    input_image = input_image.convert("RGB")
    W, H = input_image.size
    k = float(resolution) / min(H, W)
    H *= k
    W *= k
    H = int(round(H / 64.0)) * 64
    W = int(round(W / 64.0)) * 64
    img = input_image.resize((W, H), resample=Image.LANCZOS)
    return img


controlnet = ControlNetModel.from_pretrained(
    '/mnt/asian-t2i/pretrained_models/TTPLanet_SDXL_Controlnet_Tile_Realistic_V1',
    torch_dtype=torch.float16,
    use_safetensors=True,
)

pipe = DiffusionPipeline.from_pretrained(
    "/mnt/asian-t2i/pretrained_models/RealVisXL_V3.0",
    custom_pipeline="stable_diffusion_controlnet_img2img",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to('cuda')

pipe.enable_xformers_memory_efficient_attention()

source_image = Image.open("/mnt/asian-t2i/data/luchuan/1024/0410-redbook-luchuan-6.jpg")

condition_image = resize_for_condition_image(source_image, 1024)

image = pipe(
    prompt="best quality",
    negative_prompt="blur, lowres, bad anatomy, bad hands, cropped, worst quality",
    image=condition_image,
    controlnet_conditioning_image=condition_image,
    width=condition_image.size[0],
    height=condition_image.size[1],
    strength=1.0,
    generator=torch.manual_seed(0),
    num_inference_steps=32,
).images[0]

image.save('output.png')
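
For reference, the size arithmetic in resize_for_condition_image can be checked without loading an image. The sketch below isolates only that arithmetic; the helper name condition_size is hypothetical and is not part of diffusers or of the script above.

```python
def condition_size(width: int, height: int, resolution: int) -> tuple[int, int]:
    """Mirror the size arithmetic of resize_for_condition_image (no image needed)."""
    # Scale so that the shorter side equals `resolution`.
    k = float(resolution) / min(height, width)
    # Snap each side to the nearest multiple of 64, as the original function does.
    new_w = int(round(width * k / 64.0)) * 64
    new_h = int(round(height * k / 64.0)) * 64
    return new_w, new_h


print(condition_size(1000, 750, 1024))  # (1344, 1024)
```

Because each side is rounded to a multiple of 64 independently, the aspect ratio of the result can differ slightly from that of the input.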

Logs

/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py:678: FutureWarning: 'cached_download' is the legacy way to download files from the HF hub, please consider upgrading to 'hf_hub_download'
  warnings.warn(
Loading pipeline components...: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:02<00:00,  2.00it/s]
You have disabled the safety checker for <class 'diffusers_modules.git.stable_diffusion_controlnet_img2img.StableDiffusionControlNetImg2ImgPipeline'> by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
  0%|                                                                                                                                                             | 0/32 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/mnt/asian-t2i/demo.py", line 31, in <module>
    image = pipe(
  File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/root/.cache/huggingface/modules/diffusers_modules/git/stable_diffusion_controlnet_img2img.py", line 839, in __call__
    down_block_res_samples, mid_block_res_sample = self.controlnet(
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/asian-t2i/diffusers/src/diffusers/models/controlnet.py", line 775, in forward
    if "text_embeds" not in added_cond_kwargs:
TypeError: argument of type 'NoneType' is not iterable

System Info

Name: diffusers Version: 0.27.0.dev0

Who can help?

@sayakpaul @yiyixuxu @DN6

Apr 11 '24 03:04 xinli2008

Please do me a favor and take a look if you have free time.

Apr 11 '24 03:04 xinli2008

Cc: @asomoza any tips?

Apr 11 '24 03:04 sayakpaul

Why are you not using https://github.com/huggingface/diffusers/blob/aa1f00fd0182baf22800e27ccd9a55016e1eb4b4/src/diffusers/pipelines/controlnet/pipeline_controlnet_sd_xl_img2img.py#L160 directly?

Apr 11 '24 03:04 sayakpaul

code:

import torch
from PIL import Image
from diffusers import ControlNetModel, DiffusionPipeline, StableDiffusionXLControlNetPipeline


def resize_for_condition_image(input_image: Image, resolution: int):
    input_image = input_image.convert("RGB")
    W, H = input_image.size
    k = float(resolution) / min(H, W)
    H *= k
    W *= k
    H = int(round(H / 64.0)) * 64
    W = int(round(W / 64.0)) * 64
    img = input_image.resize((W, H), resample=Image.LANCZOS)
    return img


controlnet = ControlNetModel.from_pretrained(
    '/mnt/asian-t2i/pretrained_models/TTPLanet_SDXL_Controlnet_Tile_Realistic_V1',
    torch_dtype=torch.float16,
    use_safetensors=True,
)

pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "/mnt/asian-t2i/pretrained_models/RealVisXL_V3.0",
    controlnet=controlnet,
    local_files_only=True,
    torch_dtype=torch.float16,
)
pipe.enable_model_cpu_offload()

source_image = Image.open("/mnt/asian-t2i/data/luchuan/链接1-裁剪-1024/0410-redbook-luchuan-6.jpg")
condition_image = resize_for_condition_image(source_image, 1024)

images = pipe(
    "best quality",
    negative_prompt="blur, lowres, bad anatomy, bad hands, cropped, worst quality",
    image=condition_image,
    controlnet_conditioning_image=condition_image,
).images

issues:

root@67def1b0e1c0:/mnt/asian-t2i# python3 demo.py
Loading pipeline components...: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:01<00:00, 3.91it/s]
Traceback (most recent call last):
  File "/mnt/asian-t2i/demo.py", line 33, in <module>
    images = pipe(
  File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/asian-t2i/diffusers/src/diffusers/pipelines/controlnet/pipeline_controlnet_sd_xl.py", line 1197, in __call__
    ) = self.encode_prompt(
  File "/mnt/asian-t2i/diffusers/src/diffusers/pipelines/controlnet/pipeline_controlnet_sd_xl.py", line 329, in encode_prompt
    text_inputs = tokenizer(
  File "/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2872, in __call__
    encodings = self._call_one(text=text, text_pair=text_pair, **all_kwargs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2958, in _call_one
    return self.batch_encode_plus(
  File "/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 3149, in batch_encode_plus
    return self._batch_encode_plus(
  File "/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 807, in _batch_encode_plus
    batch_outputs = self._batch_prepare_for_model(
  File "/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 879, in _batch_prepare_for_model
    batch_outputs = self.pad(
  File "/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 3356, in pad
    outputs = self._pad(
  File "/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 3725, in _pad
    encoded_inputs["attention_mask"] = encoded_inputs["attention_mask"] + [0] * difference
OverflowError: cannot fit 'int' into an index-sized integer

Apr 11 '24 03:04 xinli2008

We don't have access to the model checkpoints you're referring to here. In particular: pretrained_models/RealVisXL_V3.0.

Apr 11 '24 03:04 sayakpaul

It is just an SDXL model; you can get it here.

Apr 11 '24 03:04 xinli2008

Hi, since you're saying that any SDXL model can be used, I tried this one: https://huggingface.co/SG161222/RealVisXL_V4.0 with the StableDiffusionXLControlNetPipeline.

Didn't have any errors or problems:

[Images: original | generated]

From what I can see in your error log, the problem is with your local files. Also, this part of your first error log is kind of strange:

  File "/root/.cache/huggingface/modules/diffusers_modules/git/stable_diffusion_controlnet_img2img.py", line 839, in __call__
    down_block_res_samples, mid_block_res_sample = self.controlnet(

In the code you posted you were using the StableDiffusionXLControlNetPipeline but that error shows stable_diffusion_controlnet_img2img which doesn't make sense.

You need to double-check that you're using the correct pipeline and that the local files exist, since you're not using the Hub. The controlnet works. If you want, I can post the code, but it's the same as yours except that I have the correct paths.

Edit: I see now that you're using a custom pipeline. Don't use that; just use the normal pipeline.

Apr 11 '24 04:04 asomoza
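
The TypeError in the first log can be reproduced in isolation. Judging from the traceback, added_cond_kwargs was None when the SDXL ControlNet ran its membership check, which would be consistent with an SD 1.5 custom pipeline that presumably does not supply SDXL's extra conditioning. The sketch below only imitates that check; the function names and the ValueError message are illustrative assumptions, not diffusers code.

```python
def check_added_cond(added_cond_kwargs):
    """Imitate the membership test seen in the traceback (illustrative only)."""
    if "text_embeds" not in added_cond_kwargs:
        raise ValueError("SDXL ControlNet needs `text_embeds` in added_cond_kwargs")
    return True


def describe(added_cond_kwargs):
    """Return the name of the exception raised, or 'ok' if the check passes."""
    try:
        check_added_cond(added_cond_kwargs)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__
    return "ok"


print(describe(None))                       # membership test on None fails first
print(describe({}))                         # a dict without the key reaches the explicit check
print(describe({"text_embeds": object()}))  # the key is present, so the check passes
```

The first case never reaches the explicit check: `in` cannot be evaluated against None, so Python raises TypeError before any more helpful message can be produced.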

I'll post the code anyway so that other people who encounter this problem can use it.

Also the original repo doesn't have the right filenames, I intend to do a guide with this and a hires fix equivalent, so I'll ask the model owner if he can create the files with the correct filenames at a later date.

import torch

from diffusers import ControlNetModel, DPMSolverMultistepScheduler, StableDiffusionXLControlNetPipeline
from diffusers.utils import load_image


controlnet = ControlNetModel.from_pretrained(
    "OzzyGT/SDXL_Controlnet_Tile_Realistic", torch_dtype=torch.float16, variant="fp16"
)

pipeline = StableDiffusionXLControlNetPipeline.from_pretrained(
    "SG161222/RealVisXL_V4.0",
    torch_dtype=torch.float16,
    variant="fp16",
    controlnet=controlnet,
).to("cuda")

pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config, use_karras_sigmas=True)

control_image = load_image(
    "https://huggingface.co/datasets/OzzyGT/testing-resources/resolve/main/20240401162107_lq.png?download=true"
)


prompt = "high quality image of a car"
negative_prompt = "blurry, low quality"

image = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    guidance_scale=7.5,
    num_inference_steps=25,
    image=control_image,
    controlnet_conditioning_scale=1.0,
).images[0]

image.save("result.png")

Apr 11 '24 04:04 asomoza

Thank you, I will try that.

Apr 11 '24 04:04 xinli2008

What is your version of transformers? I think my error is caused by my transformers version.

Apr 11 '24 04:04 xinli2008

diffusers version: 0.28.0.dev0
Platform: Linux-6.8.4-arch1-1-x86_64-with-glibc2.39
Python version: 3.11.8
PyTorch version (GPU?): 2.2.2+cu121 (True)
Huggingface_hub version: 0.20.3
Transformers version: 4.39.2
Accelerate version: 0.28.0
xFormers version: not installed

Apr 11 '24 05:04 asomoza
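
To compare a local environment against the versions listed above, the installed package versions can be read with the standard library. This is a minimal sketch; the helper name installed_versions and the package list are assumptions chosen for illustration. The report above appears to be in the format printed by `diffusers-cli env`, which is probably the simpler route when diffusers is installed.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or 'not installed'."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = "not installed"
    return found


# Package names below are the ones discussed in this thread.
for name, ver in installed_versions(["diffusers", "transformers", "torch", "xformers"]).items():
    print(f"{name}: {ver}")
```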

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

May 11 '24 15:05 github-actions[bot]

Follow-up to this: I'm a bit confused about how exactly the tiling works after reading the diffusers controlnet pipeline code. There is no specification of the number of tiles or the tile size here, so what exactly is going on with this controlnet?

Jun 18 '24 22:06 AbhinavGopal

It is called controlnet tile, but it doesn't actually do anything related to tiles. It just adds details, changes them, or fixes blurry images, as you can see in my example.

This kind of controlnet is used a lot with upscaling and tiling because it's ideal for that. The number of tiles and the strategy for splitting the image are all handled in the code built on top of it.

Jun 19 '24 17:06 asomoza
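
As the comment above says, any tiling happens in user code around the pipeline rather than inside the controlnet. A minimal sketch of one possible splitting strategy follows. The helper tile_boxes, the tile size, and the overlap are hypothetical choices for illustration and are not taken from diffusers or from any script in this thread.

```python
def tile_boxes(width, height, tile=1024, overlap=128):
    """Return (left, top, right, bottom) crop boxes that cover the whole image."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be non-negative and smaller than tile")
    stride = tile - overlap

    def starts(size):
        # A single tile is enough when the image fits inside one tile.
        if size <= tile:
            return [0]
        positions = list(range(0, size - tile, stride))
        # The last tile sits flush with the far edge so nothing is left uncovered.
        positions.append(size - tile)
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


for box in tile_boxes(2048, 1024):
    print(box)
```

Each box could then be cropped with PIL's Image.crop, processed by the pipeline with the crop as the control image, and pasted back, with the overlapping regions blended to hide seams. The blending step is left out of this sketch.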

Has it been resolved?

Jun 20 '24 10:06 hjj-lmx

Since the original user didn't post any more questions, we can assume yes.

Jun 20 '24 15:06 asomoza

Closing this issue because of inactivity. Feel free to reopen.

Jun 29 '24 13:06 sayakpaul
