
how to use the controlnet sdxl tile model in diffusers

Open xinli2008 opened this issue 10 months ago • 12 comments

Describe the bug

I want to make my slightly blurry photos sharp, so I found this model. I followed the code here, but since the model mentioned above is SDXL rather than SD 1.5, I changed the code, and now it raises an error.

Reproduction

import torch
from PIL import Image
from diffusers import ControlNetModel, DiffusionPipeline, StableDiffusionXLControlNetPipeline


def resize_for_condition_image(input_image: Image, resolution: int):
    input_image = input_image.convert("RGB")
    W, H = input_image.size
    k = float(resolution) / min(H, W)
    H *= k
    W *= k
    H = int(round(H / 64.0)) * 64
    W = int(round(W / 64.0)) * 64
    img = input_image.resize((W, H), resample=Image.LANCZOS)
    return img


controlnet = ControlNetModel.from_pretrained(
    '/mnt/asian-t2i/pretrained_models/TTPLanet_SDXL_Controlnet_Tile_Realistic_V1',
    torch_dtype=torch.float16,
    use_safetensors=True,
)

pipe = DiffusionPipeline.from_pretrained(
    "/mnt/asian-t2i/pretrained_models/RealVisXL_V3.0",
    custom_pipeline="stable_diffusion_controlnet_img2img",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to('cuda')

pipe.enable_xformers_memory_efficient_attention()

source_image = Image.open("/mnt/asian-t2i/data/luchuan/1024/0410-redbook-luchuan-6.jpg")

condition_image = resize_for_condition_image(source_image, 1024)

image = pipe(
    prompt="best quality",
    negative_prompt="blur, lowres, bad anatomy, bad hands, cropped, worst quality",
    image=condition_image,
    controlnet_conditioning_image=condition_image,
    width=condition_image.size[0],
    height=condition_image.size[1],
    strength=1.0,
    generator=torch.manual_seed(0),
    num_inference_steps=32,
).images[0]

image.save('output.png')
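To see what resize_for_condition_image does to the dimensions, the arithmetic can be pulled out into a small function that needs no image at all. This is only a sketch: the name condition_size is hypothetical and the sizes in the example are made up for illustration.

```python
def condition_size(width, height, resolution):
    """Scale so the shorter side matches `resolution`, then snap both
    sides to the nearest multiple of 64 (same arithmetic as the
    resize_for_condition_image function in the reproduction)."""
    k = float(resolution) / min(height, width)
    new_h = int(round(height * k / 64.0)) * 64
    new_w = int(round(width * k / 64.0)) * 64
    return new_w, new_h


if __name__ == "__main__":
    # Illustrative sizes only; they do not come from the thread.
    for size in [(1500, 1000), (800, 600), (1024, 1024)]:
        print(size, "->", condition_size(*size, resolution=1024))
```

Because each side is rounded independently, the aspect ratio can shift slightly after snapping.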

Logs

/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py:678: FutureWarning: 'cached_download' is the legacy way to download files from the HF hub, please consider upgrading to 'hf_hub_download'
  warnings.warn(
Loading pipeline components...: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:02<00:00,  2.00it/s]
You have disabled the safety checker for <class 'diffusers_modules.git.stable_diffusion_controlnet_img2img.StableDiffusionControlNetImg2ImgPipeline'> by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
  0%|                                                                                                                                                             | 0/32 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/mnt/asian-t2i/demo.py", line 31, in <module>
    image = pipe(
  File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/root/.cache/huggingface/modules/diffusers_modules/git/stable_diffusion_controlnet_img2img.py", line 839, in __call__
    down_block_res_samples, mid_block_res_sample = self.controlnet(
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/asian-t2i/diffusers/src/diffusers/models/controlnet.py", line 775, in forward
    if "text_embeds" not in added_cond_kwargs:
TypeError: argument of type 'NoneType' is not iterable

System Info

Name: diffusers Version: 0.27.0.dev0

Who can help?

@sayakpaul @yiyixuxu @DN6

xinli2008 avatar Apr 11 '24 03:04 xinli2008

Please take a look if you have some free time.

xinli2008 avatar Apr 11 '24 03:04 xinli2008

Cc: @asomoza any tips?

sayakpaul avatar Apr 11 '24 03:04 sayakpaul

Why are you not using https://github.com/huggingface/diffusers/blob/aa1f00fd0182baf22800e27ccd9a55016e1eb4b4/src/diffusers/pipelines/controlnet/pipeline_controlnet_sd_xl_img2img.py#L160 directly?

sayakpaul avatar Apr 11 '24 03:04 sayakpaul

code:

import torch
from PIL import Image
from diffusers import ControlNetModel, DiffusionPipeline, StableDiffusionXLControlNetPipeline


def resize_for_condition_image(input_image: Image, resolution: int):
    input_image = input_image.convert("RGB")
    W, H = input_image.size
    k = float(resolution) / min(H, W)
    H *= k
    W *= k
    H = int(round(H / 64.0)) * 64
    W = int(round(W / 64.0)) * 64
    img = input_image.resize((W, H), resample=Image.LANCZOS)
    return img


controlnet = ControlNetModel.from_pretrained(
    '/mnt/asian-t2i/pretrained_models/TTPLanet_SDXL_Controlnet_Tile_Realistic_V1',
    torch_dtype=torch.float16,
    use_safetensors=True,
)

pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "/mnt/asian-t2i/pretrained_models/RealVisXL_V3.0",
    controlnet=controlnet,
    local_files_only=True,
    torch_dtype=torch.float16,
)
pipe.enable_model_cpu_offload()

source_image = Image.open("/mnt/asian-t2i/data/luchuan/链接1-裁剪-1024/0410-redbook-luchuan-6.jpg")
condition_image = resize_for_condition_image(source_image, 1024)

images = pipe(
    "best quality",
    negative_prompt="blur, lowres, bad anatomy, bad hands, cropped, worst quality",
    image=condition_image,
    controlnet_conditioning_image=condition_image,
).images

issues:

root@67def1b0e1c0:/mnt/asian-t2i# python3 demo.py
Loading pipeline components...: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:01<00:00, 3.91it/s]
Traceback (most recent call last):
  File "/mnt/asian-t2i/demo.py", line 33, in <module>
    images = pipe(
  File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/asian-t2i/diffusers/src/diffusers/pipelines/controlnet/pipeline_controlnet_sd_xl.py", line 1197, in __call__
    ) = self.encode_prompt(
  File "/mnt/asian-t2i/diffusers/src/diffusers/pipelines/controlnet/pipeline_controlnet_sd_xl.py", line 329, in encode_prompt
    text_inputs = tokenizer(
  File "/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2872, in __call__
    encodings = self._call_one(text=text, text_pair=text_pair, **all_kwargs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2958, in _call_one
    return self.batch_encode_plus(
  File "/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 3149, in batch_encode_plus
    return self._batch_encode_plus(
  File "/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 807, in _batch_encode_plus
    batch_outputs = self._batch_prepare_for_model(
  File "/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 879, in _batch_prepare_for_model
    batch_outputs = self.pad(
  File "/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 3356, in pad
    outputs = self._pad(
  File "/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 3725, in _pad
    encoded_inputs["attention_mask"] = encoded_inputs["attention_mask"] + [0] * difference
OverflowError: cannot fit 'int' into an index-sized integer

xinli2008 avatar Apr 11 '24 03:04 xinli2008

We don't have access to the model checkpoints you're referring to here. In particular: pretrained_models/RealVisXL_V3.0.

sayakpaul avatar Apr 11 '24 03:04 sayakpaul

It is just an SDXL model; you can find it here.

xinli2008 avatar Apr 11 '24 03:04 xinli2008

Hi, since you said any SDXL model would do, I used this one: https://huggingface.co/SG161222/RealVisXL_V4.0 with the StableDiffusionXLControlNetPipeline.

I didn't get any errors or problems:

[Images: original (20240401162107_lq) and generated (20240411000745)]

From what I can see in your error log, the problem is with your local files. This part of your first error log is also strange:

File "/root/.cache/huggingface/modules/diffusers_modules/git/stable_diffusion_controlnet_img2img.py", line 839, in __call__
    down_block_res_samples, mid_block_res_sample = self.controlnet(

In the code you posted you were using the StableDiffusionXLControlNetPipeline, but that error shows stable_diffusion_controlnet_img2img, which doesn't make sense.

You need to double-check that you're using the correct pipeline and that the local files exist, since you're not using the Hub. The ControlNet works; if you want, I can post the code, but it's the same as yours except that I have the correct paths.
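One way to run that check before loading anything is to look for the files that usually mark a diffusers model folder. The sketch below assumes the common layout in which a pipeline folder has model_index.json at its root and a standalone model folder such as a ControlNet has config.json. The helper missing_model_files is hypothetical and is not part of diffusers.

```python
from pathlib import Path


def missing_model_files(model_dir, marker_files):
    """Return the marker files that are absent from model_dir.

    Hypothetical helper for illustration; not a diffusers function.
    """
    root = Path(model_dir)
    if not root.is_dir():
        # The directory itself is missing, so every marker is missing too.
        return list(marker_files)
    return [name for name in marker_files if not (root / name).is_file()]


# Assumed layout: pipelines carry model_index.json, single models carry config.json.
PIPELINE_MARKERS = ("model_index.json",)
CONTROLNET_MARKERS = ("config.json",)

if __name__ == "__main__":
    # Paths copied from the thread; replace them with your own.
    base = "/mnt/asian-t2i/pretrained_models"
    print(missing_model_files(f"{base}/RealVisXL_V3.0", PIPELINE_MARKERS))
    print(missing_model_files(f"{base}/TTPLanet_SDXL_Controlnet_Tile_Realistic_V1", CONTROLNET_MARKERS))
```

An empty list means the expected marker is present. It does not prove that the weight files are complete or correctly named.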

Edit: I see now that you're using a custom pipeline. Don't use that; just use the normal pipeline.
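For anyone wondering why the custom pipeline produced that particular TypeError: the last frame of the first traceback is a membership test on added_cond_kwargs, and such a test fails in exactly this way when the value is None. The toy function below only mimics that one check. It is not the real ControlNetModel.forward, and the assumption that the SD 1.5 custom pipeline leaves added_cond_kwargs unset is inferred from the traceback rather than confirmed in the thread.

```python
def sdxl_style_forward(added_cond_kwargs=None):
    """Toy stand-in that mimics only the membership check from the traceback."""
    if "text_embeds" not in added_cond_kwargs:
        raise ValueError("text_embeds must be passed in added_cond_kwargs")
    return "ok"


def call_safely(added_cond_kwargs):
    """Run the toy forward and report the exception type instead of crashing."""
    try:
        return sdxl_style_forward(added_cond_kwargs)
    except (TypeError, ValueError) as err:
        return f"{type(err).__name__}: {err}"


if __name__ == "__main__":
    # SD 1.5-style caller: no extra conditioning, so the check runs against None.
    print(call_safely(None))
    # SDXL-style caller: extra conditioning is supplied (placeholder values here).
    print(call_safely({"text_embeds": [], "time_ids": []}))
```

The SDXL pipelines build this extra conditioning themselves, which is consistent with the advice to use the normal pipeline.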

asomoza avatar Apr 11 '24 04:04 asomoza

I'll post the code anyway so that anyone else who encounters this problem can use it.

Also, the original repo doesn't have the right filenames. I intend to write a guide covering this and a hires-fix equivalent, so I'll ask the model owner at a later date whether he can create the files with the correct filenames.

import torch

from diffusers import ControlNetModel, DPMSolverMultistepScheduler, StableDiffusionXLControlNetPipeline
from diffusers.utils import load_image


controlnet = ControlNetModel.from_pretrained(
    "OzzyGT/SDXL_Controlnet_Tile_Realistic", torch_dtype=torch.float16, variant="fp16"
)

pipeline = StableDiffusionXLControlNetPipeline.from_pretrained(
    "SG161222/RealVisXL_V4.0",
    torch_dtype=torch.float16,
    variant="fp16",
    controlnet=controlnet,
).to("cuda")

pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config, use_karras_sigmas=True)

control_image = load_image(
    "https://huggingface.co/datasets/OzzyGT/testing-resources/resolve/main/20240401162107_lq.png?download=true"
)


prompt = "high quality image of a car"
negative_prompt = "blurry, low quality"

image = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    guidance_scale=7.5,
    num_inference_steps=25,
    image=control_image,
    controlnet_conditioning_scale=1.0,
).images[0]

image.save("result.png")

asomoza avatar Apr 11 '24 04:04 asomoza

Thank you, I will try that.

xinli2008 avatar Apr 11 '24 04:04 xinli2008

What is your version of transformers? I think my error is caused by the transformers version.

xinli2008 avatar Apr 11 '24 04:04 xinli2008

  • diffusers version: 0.28.0.dev0
  • Platform: Linux-6.8.4-arch1-1-x86_64-with-glibc2.39
  • Python version: 3.11.8
  • PyTorch version (GPU?): 2.2.2+cu121 (True)
  • Huggingface_hub version: 0.20.3
  • Transformers version: 4.39.2
  • Accelerate version: 0.28.0
  • xFormers version: not installed
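To compare two environments line by line, a listing like the one above can be approximated with the standard library. This is a sketch only: installed_versions is a hypothetical helper, and the package list simply mirrors the entries shown above.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    names = ["diffusers", "transformers", "torch", "accelerate", "huggingface_hub", "xformers"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```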

asomoza avatar Apr 11 '24 05:04 asomoza

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar May 11 '24 15:05 github-actions[bot]

Follow-up to this: I'm a bit confused about how exactly the tiling works after reading the diffusers ControlNet pipeline code. Nothing there specifies the number of tiles or the tile size, so what exactly is going on with this ControlNet?

AbhinavGopal avatar Jun 18 '24 22:06 AbhinavGopal

It is called ControlNet tile, but it doesn't actually do anything related to tiles. It just adds details, changes them, or fixes blurry images, as you can see in my example.

This kind of ControlNet is used a lot with upscaling and tiling because it's ideal for that. The number of tiles and the tiling strategy are all handled in the code built on top of it.
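As a rough illustration of what "code on top of it" might look like, here is a minimal sketch that only computes overlapping tile boxes for an image. The function tile_boxes, the tile size and the overlap are all assumptions made for this example. It is not how any particular upscaler or diffusers pipeline does it.

```python
def tile_boxes(width, height, tile=1024, overlap=128):
    """Return (left, top, right, bottom) boxes that cover the image.

    Neighbouring tiles share at least `overlap` pixels, and the last tile
    in each direction sits flush with the image edge.
    """
    if overlap >= tile:
        raise ValueError("overlap must be smaller than tile")

    def starts(length):
        if length <= tile:
            return [0]  # a single tile already covers this axis
        step = tile - overlap
        positions = list(range(0, length - tile, step))
        positions.append(length - tile)
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


if __name__ == "__main__":
    for box in tile_boxes(2048, 1024):
        print(box)
```

Each box could then be cropped from the upscaled image, passed through the ControlNet pipeline as the control image, and blended back in the overlap regions. Those steps are left out here.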

asomoza avatar Jun 19 '24 17:06 asomoza

Has it been resolved?

hjj-lmx avatar Jun 20 '24 10:06 hjj-lmx

Since the original user didn't post any more questions, we can assume yes.

asomoza avatar Jun 20 '24 15:06 asomoza

Closing this issue because of inactivity. Feel free to reopen.

sayakpaul avatar Jun 29 '24 13:06 sayakpaul