Q-Align

Why do we need expand2square in scorers?

Open Hoi2022 opened this issue 1 year ago • 4 comments

Hello, thanks for your great work.

However, I find that the preprocessing inside the scorers looks odd.

The input image is first expanded to a square by the expand2square function, which adds mean-color margins around it.

I visualized a preprocessed image, which looks like this:

[image: preprocessed]

while the original image is:

[image: original]

My teammate also reports that images with a larger aspect ratio get lower aesthetic scores, so we are wondering whether this preprocessing is correct.
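To put a number on how much of the model input is margin, here is a minimal sketch of the geometry only. It assumes the image is centered on a square canvas whose side equals the longer edge; `square_padding` is a hypothetical helper written for this illustration, not a function from the repository.

```python
def square_padding(width, height):
    """Geometry of centering a width x height image on a square canvas.

    Assumption: the canvas side equals the longer edge and the image is
    centered, with the remaining area filled by a constant margin color.
    Returns (side, paste_offset, padded_fraction).
    """
    side = max(width, height)
    # Top-left corner where the original image would be pasted.
    offset = ((side - width) // 2, (side - height) // 2)
    # Share of the square canvas that is margin rather than image content.
    padded_fraction = 1.0 - (width * height) / (side * side)
    return side, offset, padded_fraction


if __name__ == "__main__":
    for w, h in [(512, 512), (640, 480), (1920, 1080), (2100, 900)]:
        side, offset, frac = square_padding(w, h)
        print(f"{w}x{h}: canvas {side}x{side}, paste at {offset}, margin {frac:.1%}")
```

Under these assumptions the margin share grows with the aspect ratio (none for a square image, a quarter of the canvas for 4:3, more for wider images), which may or may not be related to the lower scores reported above.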

Hoi2022 avatar Dec 30 '24 06:12 Hoi2022

In my investigation, I found that removing expand2square improves the capability of the whole model. You can use only the preprocessing from CLIPImageProcessor, and that gives a real improvement.

CarlCloudWang avatar Jan 15 '25 06:01 CarlCloudWang

> In my investigation, I found that removing expand2square improves the capability of the whole model. You can use only the preprocessing from CLIPImageProcessor, and that gives a real improvement.

Hi, without expand2square, do you resize directly to 224? I wonder whether a direct resize would distort the original image.

elvindp avatar Jan 16 '25 08:01 elvindp

> In my investigation, I found that removing expand2square improves the capability of the whole model. You can use only the preprocessing from CLIPImageProcessor, and that gives a real improvement.

> Hi, without expand2square, do you resize directly to 224? I wonder whether a direct resize would distort the original image.

CLIPImageProcessor includes a CenterCrop and a resize, rather than directly resizing the image to a square shape.
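For comparison, a rough sketch of the resize-then-center-crop geometry. It assumes the shorter edge is scaled to the target size (224 here, as mentioned above) and a centered square is then cropped; `resize_then_center_crop` is a hypothetical helper, and the exact rounding inside CLIPImageProcessor may differ.

```python
def resize_then_center_crop(width, height, size=224):
    """Geometry of scaling the shorter edge to `size`, then center-cropping.

    Assumption: aspect ratio is preserved during the resize, so nothing is
    stretched; content outside the centered size x size window is discarded.
    Returns (resized_size, crop_box, kept_fraction).
    """
    scale = size / min(width, height)
    new_w = max(size, round(width * scale))
    new_h = max(size, round(height * scale))
    # Crop box as (left, top, right, bottom) in the resized image.
    left = (new_w - size) // 2
    top = (new_h - size) // 2
    crop_box = (left, top, left + size, top + size)
    # Share of the resized image that survives the crop.
    kept_fraction = (size * size) / (new_w * new_h)
    return (new_w, new_h), crop_box, kept_fraction


if __name__ == "__main__":
    for w, h in [(512, 512), (640, 480), (1920, 1080)]:
        resized, box, kept = resize_then_center_crop(w, h)
        print(f"{w}x{h}: resized to {resized}, crop {box}, kept {kept:.1%}")
```

So under these assumptions the image is not distorted, but the outer parts of the longer edge are cut off, which is the trade-off against padding.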

Hoi2022 avatar Jan 21 '25 02:01 Hoi2022

In my view, the padded area may affect the assessed image quality in some way.

FengqiZhang0329 avatar Feb 22 '25 07:02 FengqiZhang0329