diffusers
how to use the controlnet sdxl tile model in diffusers
Describe the bug
I want to make my slightly blurry photos clear, so I found this model. I followed the code here, but since the model mentioned above is SDXL, not 1.5, I changed the code, and now it throws an error.
Reproduction
import torch
from PIL import Image
from diffusers import ControlNetModel, DiffusionPipeline, StableDiffusionXLControlNetPipeline
def resize_for_condition_image(input_image: Image, resolution: int):
    input_image = input_image.convert("RGB")
    W, H = input_image.size
    k = float(resolution) / min(H, W)
    H *= k
    W *= k
    H = int(round(H / 64.0)) * 64
    W = int(round(W / 64.0)) * 64
    img = input_image.resize((W, H), resample=Image.LANCZOS)
    return img
controlnet = ControlNetModel.from_pretrained(
    '/mnt/asian-t2i/pretrained_models/TTPLanet_SDXL_Controlnet_Tile_Realistic_V1',
    torch_dtype=torch.float16,
    use_safetensors=True,
)
pipe = DiffusionPipeline.from_pretrained(
    "/mnt/asian-t2i/pretrained_models/RealVisXL_V3.0",
    custom_pipeline="stable_diffusion_controlnet_img2img",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to('cuda')
pipe.enable_xformers_memory_efficient_attention()
source_image = Image.open("/mnt/asian-t2i/data/luchuan/1024/0410-redbook-luchuan-6.jpg")
condition_image = resize_for_condition_image(source_image, 1024)
image = pipe(
    prompt="best quality",
    negative_prompt="blur, lowres, bad anatomy, bad hands, cropped, worst quality",
    image=condition_image,
    controlnet_conditioning_image=condition_image,
    width=condition_image.size[0],
    height=condition_image.size[1],
    strength=1.0,
    generator=torch.manual_seed(0),
    num_inference_steps=32,
).images[0]
image.save('output.png')
Logs
/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py:678: FutureWarning: 'cached_download' is the legacy way to download files from the HF hub, please consider upgrading to 'hf_hub_download'
warnings.warn(
Loading pipeline components...: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:02<00:00, 2.00it/s]
You have disabled the safety checker for <class 'diffusers_modules.git.stable_diffusion_controlnet_img2img.StableDiffusionControlNetImg2ImgPipeline'> by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
0%| | 0/32 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/mnt/asian-t2i/demo.py", line 31, in <module>
image = pipe(
File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/root/.cache/huggingface/modules/diffusers_modules/git/stable_diffusion_controlnet_img2img.py", line 839, in __call__
down_block_res_samples, mid_block_res_sample = self.controlnet(
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/mnt/asian-t2i/diffusers/src/diffusers/models/controlnet.py", line 775, in forward
if "text_embeds" not in added_cond_kwargs:
TypeError: argument of type 'NoneType' is not iterable
System Info
Name: diffusers Version: 0.27.0.dev0
Who can help?
@sayakpaul @yiyixuxu @DN6
Please help me out if you have some free time.
Cc: @asomoza any tips?
Why are you not using https://github.com/huggingface/diffusers/blob/aa1f00fd0182baf22800e27ccd9a55016e1eb4b4/src/diffusers/pipelines/controlnet/pipeline_controlnet_sd_xl_img2img.py#L160 directly?
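For context, the community stable_diffusion_controlnet_img2img pipeline you're loading targets SD 1.5 and never passes the extra SDXL conditioning (added_cond_kwargs) to the controlnet, which is most likely what triggers the TypeError above. A minimal, untested sketch of what the dedicated SDXL controlnet img2img pipeline could look like with a tile controlnet (the model paths below are placeholders, not the checkpoints from this issue):

import torch
from diffusers import ControlNetModel, StableDiffusionXLControlNetImg2ImgPipeline
from diffusers.utils import load_image

# Placeholder paths: swap in your own SDXL base model and SDXL tile controlnet.
controlnet = ControlNetModel.from_pretrained(
    "path/to/sdxl_tile_controlnet", torch_dtype=torch.float16
)
pipe = StableDiffusionXLControlNetImg2ImgPipeline.from_pretrained(
    "path/to/sdxl_base_model", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

source = load_image("path/to/blurry_photo.png")

# The SDXL controlnet img2img pipeline takes both an init image and a control image,
# and it builds the SDXL added_cond_kwargs internally.
image = pipe(
    prompt="best quality",
    negative_prompt="blur, lowres, bad anatomy, bad hands, cropped, worst quality",
    image=source,
    control_image=source,
    strength=1.0,
    num_inference_steps=32,
).images[0]
image.save("output.png")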
code:
import torch
from PIL import Image
from diffusers import ControlNetModel, DiffusionPipeline, StableDiffusionXLControlNetPipeline
def resize_for_condition_image(input_image: Image, resolution: int):
    input_image = input_image.convert("RGB")
    W, H = input_image.size
    k = float(resolution) / min(H, W)
    H *= k
    W *= k
    H = int(round(H / 64.0)) * 64
    W = int(round(W / 64.0)) * 64
    img = input_image.resize((W, H), resample=Image.LANCZOS)
    return img
controlnet = ControlNetModel.from_pretrained(
    '/mnt/asian-t2i/pretrained_models/TTPLanet_SDXL_Controlnet_Tile_Realistic_V1',
    torch_dtype=torch.float16,
    use_safetensors=True,
)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "/mnt/asian-t2i/pretrained_models/RealVisXL_V3.0",
    controlnet=controlnet,
    local_files_only=True,
    torch_dtype=torch.float16,
)
pipe.enable_model_cpu_offload()
source_image = Image.open("/mnt/asian-t2i/data/luchuan/链接1-裁剪-1024/0410-redbook-luchuan-6.jpg")
condition_image = resize_for_condition_image(source_image, 1024)
images = pipe(
    "best quality",
    negative_prompt="blur, lowres, bad anatomy, bad hands, cropped, worst quality",
    image=condition_image,
    controlnet_conditioning_image=condition_image,
).images
issues: root@67def1b0e1c0:/mnt/asian-t2i# python3 demo.py
Loading pipeline components...: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:01<00:00, 3.91it/s]
Traceback (most recent call last):
File "/mnt/asian-t2i/demo.py", line 33, in
We don't have access to the model checkpoints you're referring to here. In particular: pretrained_models/RealVisXL_V3.0.
It is just an SDXL model, you can use it with: here
Hi, since you're saying to use any SDXL model, I used this one: https://huggingface.co/SG161222/RealVisXL_V4.0 with the StableDiffusionXLControlNetPipeline.
Didn't have any errors or problems:
(image comparison: original | generated)
From what I can see in your error log, the problem is with your local files. Also, this is kind of strange in your first error log:
File "/root/.cache/huggingface/modules/diffusers_modules/git/stable_diffusion_controlnet_img2img.py", line 839, in __call__
    down_block_res_samples, mid_block_res_sample = self.controlnet(
In the code you posted you were using the StableDiffusionXLControlNetPipeline, but that error shows stable_diffusion_controlnet_img2img, which doesn't make sense.
You need to double check that you're using the correct pipeline and that the local files exist, since you're not using the hub. The controlnet works; if you want I can post the code, but it's the same as yours except that I have the correct paths.
Edit: I see now that you're using a custom pipeline. Don't use that, just use the normal pipeline.
I'll post the code anyway so that if more people encounter this problem they can use it.
Also, the original repo doesn't have the right filenames. I intend to write a guide with this and a hires fix equivalent, so I'll ask the model owner if he can create the files with the correct filenames at a later date.
import torch
from diffusers import ControlNetModel, DPMSolverMultistepScheduler, StableDiffusionXLControlNetPipeline
from diffusers.utils import load_image
controlnet = ControlNetModel.from_pretrained(
"OzzyGT/SDXL_Controlnet_Tile_Realistic", torch_dtype=torch.float16, variant="fp16"
)
pipeline = StableDiffusionXLControlNetPipeline.from_pretrained(
"SG161222/RealVisXL_V4.0",
torch_dtype=torch.float16,
variant="fp16",
controlnet=controlnet,
).to("cuda")
pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config, use_karras_sigmas=True)
control_image = load_image(
"https://huggingface.co/datasets/OzzyGT/testing-resources/resolve/main/20240401162107_lq.png?download=true"
)
prompt = "high quality image of a car"
negative_prompt = "blurry, low quality"
image = pipeline(
prompt=prompt,
negative_prompt=negative_prompt,
guidance_scale=7.5,
num_inference_steps=25,
image=control_image,
controlnet_conditioning_scale=1.0,
).images[0]
image.save("result.png")
Thank you, I will try that.
What is your version of transformers? I think my error comes from the transformers version.
- diffusers version: 0.28.0.dev0
- Platform: Linux-6.8.4-arch1-1-x86_64-with-glibc2.39
- Python version: 3.11.8
- PyTorch version (GPU?): 2.2.2+cu121 (True)
- Huggingface_hub version: 0.20.3
- Transformers version: 4.39.2
- Accelerate version: 0.28.0
- xFormers version: not installed
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Follow-up to this: I'm a bit confused about how exactly the tiling works after reading the diffusers controlnet pipeline code. There is no specification of the number of tiles or the tile size here, so what exactly is going on with this controlnet?
It is called controlnet tile, but it actually doesn't do anything related to tiles; it just adds details, changes them, or fixes blurry images, as you can see in my example.
This kind of controlnet is used a lot with upscaling and tiling because it's ideal for that; the number of tiles and the strategy for doing it are all handled in the code built on top of it.
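To make that concrete, here is a rough, illustrative sketch (not code from any particular guide) of one possible tiling strategy layered on top of the pipeline from my snippet above: the input is cut into overlapping tiles, each tile goes through the controlnet pipeline on its own, and the results are pasted back together. The function name, tile size, and overlap below are arbitrary choices for the example, not part of diffusers.

from PIL import Image

def refine_in_tiles(pipeline, image, prompt, tile_size=1024, overlap=64):
    # Illustrative only: split the image into overlapping tiles, refine each one with
    # the tile controlnet pipeline, and paste the results back. Real implementations
    # blend the overlapping regions instead of doing a hard paste.
    result = image.copy()
    w, h = image.size
    step = tile_size - overlap
    for top in range(0, h, step):
        for left in range(0, w, step):
            box = (left, top, min(left + tile_size, w), min(top + tile_size, h))
            tile = image.crop(box)
            # Work at a fixed SDXL-friendly resolution, then scale back to the tile size.
            work = tile.resize((tile_size, tile_size), Image.LANCZOS)
            refined = pipeline(
                prompt=prompt,
                image=work,
                controlnet_conditioning_scale=1.0,
                num_inference_steps=25,
            ).images[0]
            result.paste(refined.resize(tile.size, Image.LANCZOS), (left, top))
    return result

The point is just to show where the tiling logic lives: the controlnet only refines whatever crop you hand it, and everything about tile count, overlap, and stitching is decided by this outer loop.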
Has it been resolved?
Since the original user didn't post any more questions, we can assume yes.
Closing this issue because of inactivity. Feel free to reopen.