
Acceptable Size of Images for CLIP

Open Calmepro777 opened this issue 2 years ago • 8 comments

I wonder whether CLIP accepts images smaller than 224 * 224 (e.g. 32 * 32). I noticed that a matrix dimension mismatch error occurs if I feed 32 * 32 images to CLIP without modifying the model.

Calmepro777 avatar May 30 '22 01:05 Calmepro777

Did you find a solution? I am reshaping all images to 224X224, but that seems a bit fishy, especially with the varying aspect ratio.

tillaczel avatar May 31 '22 13:05 tillaczel

Just exploring. Still in exploring mode.

ntushaar92 avatar May 31 '22 16:05 ntushaar92

I found my mistake: I forgot to preprocess the image.
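For the aspect-ratio worry, here is a minimal sketch of the geometry behind a "resize the shorter side, then center-crop" step, which is the approach the `preprocess` transform returned by `clip.load` takes. The helper name `resize_and_center_crop_box` and the rounding choices are assumptions for illustration, not the library's code.

```python
def resize_and_center_crop_box(width, height, target=224):
    """Return (resized_size, crop_box) for a shorter-side resize plus center crop.

    Hypothetical helper: it only computes the geometry, it does not touch pixels.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")

    scale = target / min(width, height)

    # The shorter side becomes exactly `target`; the longer side is scaled
    # by the same factor, so the aspect ratio is preserved.
    new_w = target if width <= height else max(target, round(width * scale))
    new_h = target if height <= width else max(target, round(height * scale))

    # Center crop: trim the longer side (roughly) symmetrically.
    left = (new_w - target) // 2
    top = (new_h - target) // 2
    return (new_w, new_h), (left, top, left + target, top + target)


if __name__ == "__main__":
    for size in [(32, 32), (640, 480), (480, 640)]:
        print(size, "->", resize_and_center_crop_box(*size))
```

With Pillow, the two results could be applied as `img.resize(resized_size, Image.BICUBIC).crop(crop_box)`. A square 32x32 image is simply upsampled with nothing cropped, while non-square images lose the margins of their longer side instead of being squashed.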

tillaczel avatar May 31 '22 22:05 tillaczel

> Did you find a solution? I am reshaping all images to 224X224, but that seems a bit fishy, especially with the varying aspect ratio.

That is how I handle smaller images as well. I expected that, with a slight modification to the model and no retraining, CLIP could take in smaller images.

Calmepro777 avatar Jun 01 '22 01:06 Calmepro777

I haven't tried it, but probably you would have to pass a different argument to CLIPFeatureExtractor.

https://huggingface.co/docs/transformers/v4.19.2/en/model_doc/clip#transformers.CLIPFeatureExtractor

alexfilothodoros avatar Jun 01 '22 13:06 alexfilothodoros

How do I change the image resolution from 224 to 32? I am using the CIFAR-10 dataset, whose images are 32 by 32.

gqingc avatar Feb 27 '23 04:02 gqingc

You should upsample the image to (224, 224). The ResNet variants downsample by 32x, and the ViT variants also need a fixed-size input to patchify. So smaller images will cause shape mismatches in either architecture.
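To make the shape mismatch concrete, here is a rough sketch of the token arithmetic. It assumes the released checkpoints' settings (224-pixel input, patch sizes 32, 16 and 14 for ViT-B/32, ViT-B/16 and ViT-L/14, and a total stride of 32 for the ResNet variants) and that one extra token is added to the spatial grid. The function names are hypothetical.

```python
def vit_token_count(image_size, patch_size):
    """Tokens seen by the transformer: one per patch plus one class token."""
    grid = image_size // patch_size  # floor division, like a strided patch conv
    return grid * grid + 1


def resnet_pool_token_count(image_size, total_stride=32):
    """Tokens seen by the attention pooling layer: final feature map plus one."""
    grid = image_size // total_stride
    return grid * grid + 1


if __name__ == "__main__":
    pretrained = 224
    for name, patch in [("ViT-B/32", 32), ("ViT-B/16", 16), ("ViT-L/14", 14)]:
        expected = vit_token_count(pretrained, patch)
        got = vit_token_count(32, patch)
        print(f"{name}: positional embedding has {expected} rows, "
              f"a 32x32 input produces {got} tokens")

    print("ResNet:", resnet_pool_token_count(pretrained), "vs",
          resnet_pool_token_count(32))
```

For ViT-B/32 the learned positional embedding has 50 rows, but a 32x32 image yields only 2 tokens, so the addition fails unless the image is upsampled or the positional embedding is interpolated (the latter changes the model and may hurt accuracy without fine-tuning).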

Charles-Lu avatar Apr 13 '23 19:04 Charles-Lu

Same problem: my server is too weak to handle 224x224 images.

trinh-hoang-hiep avatar Apr 11 '24 02:04 trinh-hoang-hiep