Image resize issue

Open DamienGaullier opened this issue 1 year ago • 6 comments

Hi, I have an issue with my output image: it does not have the same dimensions as the input image. Original image: 640x480. Output image: 576x512. Do you know why?

Here is the sample:

import numpy

from controlnet_aux.processor import MidasDetector
from PIL import Image

image_path = "d:/temp/cat.png"
model_path = "d:/model/dreamshaper-v8.0"

convertedImage = Image.open(image_path)
resolution = min(convertedImage.size)

processor = MidasDetector.from_pretrained(model_path)
controlnet_image = processor(
    input_image=convertedImage,
    a=numpy.pi * 2.0,
    bg_th=0.1,
    depth_and_normal=False,
    detect_resolution=resolution,
    image_resolution=resolution,
    output_type='pil'
)

controlnet_image.save("d:/temp/cat_result.png") 

[Attached images: cat, cat_result]

Thank you!

DamienGaullier (Nov 21 '23 13:11)

same issue

SlZeroth (Jan 10 '24 09:01)

Could you maybe open a PR to fix it? More than happy to review and merge.

patrickvonplaten (Jan 11 '24 13:01)

@patrickvonplaten Do we want to "fix" this?

I was checking the issue:
The change in size comes from the util method resize_image, which is taken directly from lllyasviel's ControlNet util resize_image. It happens when the resize is requested at a resolution that is not a multiple of 64.
From what I see, this is linked to the model convolution size (to be confirmed).

Generating images with other sizes does work, but I guess it is not optimal. In this case the better solution is to resize the image to 512 (64*8), for instance.
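For illustration, here is a minimal sketch of that rounding. It assumes the helper scales the short side to the requested resolution and then rounds each side to the nearest multiple of 64, and it assumes the detector applies this once for detect_resolution and once more for image_resolution. The name rounded_size is hypothetical, not the library function.

```python
def rounded_size(width, height, resolution, multiple=64):
    """Hypothetical stand-in for the rounding described above (not the library code)."""
    # Scale so that the shorter side matches the requested resolution.
    scale = resolution / min(width, height)
    # Round each scaled side to the nearest multiple (64, per the comment above).
    new_width = int(round(width * scale / multiple)) * multiple
    new_height = int(round(height * scale / multiple)) * multiple
    return new_width, new_height


# Values from the original report: a 640x480 input, resolution = min(size) = 480.
first = rounded_size(640, 480, 480)   # pass for detect_resolution (assumed)
second = rounded_size(*first, 480)    # pass for image_resolution (assumed)
print(first, second)  # (640, 512) (576, 512)
```

Under these assumptions the two passes reproduce the reported 576x512 output: 480 is 7.5 * 64, so the first pass rounds the height up to 512, and the second pass then shrinks the width from 640 to 576.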

We could:

  • Add some documentation, which I tried to do in https://github.com/patrickvonplaten/controlnet_aux/pull/89
  • Make the resolution parameters optional and skip resizing if None (I actually hesitated over this, cf. MR)
  • Raise an exception when the resolution is not a multiple of 64
  • Accept custom resolutions (we just need to remove the line mentioned above), though this would probably be better done in the ControlNet library as well
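As a rough sketch of how the second and third options could look together, assuming a None resolution means "skip resizing" and anything else must be a multiple of 64. validate_resolution is a hypothetical helper for illustration only, not code from the MR or from the library.

```python
from typing import Optional


def validate_resolution(resolution: Optional[int], multiple: int = 64) -> Optional[int]:
    """Hypothetical check: None means 'do not resize', otherwise enforce the multiple."""
    if resolution is None:
        # Optional parameter: the caller would skip the resize step entirely.
        return None
    if resolution % multiple != 0:
        # Fail loudly instead of silently rounding to another size.
        raise ValueError(
            f"resolution must be a multiple of {multiple}, got {resolution}"
        )
    return resolution
```

With this sketch, validate_resolution(512) passes and validate_resolution(480) raises a ValueError. It only checks the requested resolution; the longer side of the image could still be rounded afterwards.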

Which one would you like to go for?

lerignoux (Jan 15 '24 13:01)

I'm actually happy to accept custom resolutions. Diffusers UNets can work with any multiple of 8.

patrickvonplaten (Jan 15 '24 14:01)

Hey

OK, I tested a bit (allowing a None resolution to bypass resizing).
It seems it only works if the image sizes are multiples of 32.
I tried to go lower but always ended up with a RuntimeError.
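A quick arithmetic check on the 640x480 input from the original report shows which of the constraints mentioned in this thread it already satisfies (fits is a hypothetical helper, for illustration only):

```python
def fits(width, height, multiple):
    """Return True when both sides are exact multiples of `multiple`."""
    return width % multiple == 0 and height % multiple == 0


# 640x480 is the input size from the original report.
for multiple in (64, 32, 8):
    print(multiple, fits(640, 480, multiple))
# 64 False
# 32 True
# 8 True
```

So with a multiple-of-32 constraint the original image could keep its size, whereas the multiple-of-64 rounding has to change the 480 side.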

I would then suggest fixing it like this: cf. MR

@patrickvonplaten tell me if this sounds OK to you.

I see there are no unit tests in the project. Do you think it is worth adding one for this?

lerignoux (Jan 16 '24 06:01)

Hello @patrickvonplaten, do you have any issue with the suggestion above?

Tell us if you see anything you would like to add.

Regards

lerignoux (Jan 23 '24 07:01)