
Performance of Declip-88M checkpoint

Open Hcyang-NULL opened this issue 3 years ago • 4 comments

Hi, I want to reproduce the zero-shot result of DeClip-88M with ResNet50 on ImageNet-1K (62.5 in the table). However, the evaluation result I get is 7.264, which is far too low, while the result for ViT-B32 is correct. I also noticed a problem when loading the ResNet50 checkpoint:

size mismatch for module.logit_scale: copying a param with shape torch.Size([]) from checkpoint, the shape in current model is torch.Size([1]).

I didn't change any code of the model.

Another question: why does the run.sh of declip-88m-resnet50 use clip_solver while the other run.sh files use declip_solver? I ran the evaluation for DeClip-88M-ResNet50 with declip_solver by replacing the yaml file. The following figure shows the results reproduced on my own compute resources: [image]

Do you have any ideas? Thanks!

Hcyang-NULL avatar Jul 24 '22 14:07 Hcyang-NULL

Hi @Hcyang-NULL , were you able to figure out the issue? cc: @zlccccc

AadSah avatar Sep 12 '22 18:09 AadSah

size mismatch for module.logit_scale: copying a param with shape torch.Size([]) from checkpoint, the shape in current model is torch.Size([1]).

This problem occurs because the saved models come from different torch versions. You can forcibly convert logit_scale to torch.Size([]) or torch.Size([1]) when loading the model; this will not affect the accuracy.
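A minimal sketch of that conversion, assuming the checkpoint and the model both expose a plain state dict of tensors. The helper name `align_param_shapes` is hypothetical and is not part of the DeCLIP code; it reshapes any parameter whose element count matches the model's but whose shape differs, which covers the `torch.Size([])` versus `torch.Size([1])` case for `module.logit_scale`.

```python
from collections import OrderedDict


def align_param_shapes(checkpoint_state, model_state):
    """Return a copy of checkpoint_state in which tensors that hold the same
    number of elements as the model's tensor, but in a different shape,
    are reshaped to the model's shape. Other entries are passed through."""
    fixed = OrderedDict()
    for name, tensor in checkpoint_state.items():
        target = model_state.get(name)
        if (
            target is not None
            and tuple(tensor.shape) != tuple(target.shape)
            and tensor.numel() == target.numel()
        ):
            # e.g. torch.Size([]) -> torch.Size([1]) for module.logit_scale
            tensor = tensor.reshape(target.shape)
        fixed[name] = tensor
    return fixed


# Hypothetical usage (the path and the "model" key are assumptions):
#   import torch
#   checkpoint = torch.load("path/to/checkpoint.pth.tar", map_location="cpu")
#   state = checkpoint.get("model", checkpoint)
#   model.load_state_dict(align_param_shapes(state, model.state_dict()))
```

The reshape only changes how the single scalar is stored, not its value, so it should leave the model's outputs unchanged.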

zlccccc avatar Sep 13 '22 03:09 zlccccc

Thanks for your reply!

I tried this method before (forcibly reshaping logit_scale). It doesn't work; the performance is still 7.264. But the ViT result is indeed correct, so maybe the ResNet50 checkpoint is inconsistent with the code version? (Just my guess.)
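One way to test that guess is to compare the parameter names in the checkpoint with those the model expects. The sketch below is only an illustration: `diff_state_dict_keys` is a hypothetical helper, not something from the DeCLIP repository, and it assumes both sides are available as plain name-to-tensor mappings.

```python
def diff_state_dict_keys(checkpoint_state, model_state):
    """Report parameter names present on one side but not the other."""
    ckpt_keys = set(checkpoint_state)
    model_keys = set(model_state)
    return {
        # Expected by the model but absent from the checkpoint:
        # these would keep their random initial values.
        "missing_in_checkpoint": sorted(model_keys - ckpt_keys),
        # Stored in the checkpoint but unknown to the model:
        # these weights would never be used.
        "unexpected_in_checkpoint": sorted(ckpt_keys - model_keys),
    }
```

If either list is non-empty for the ResNet50 checkpoint, part of the network may be running with uninitialized weights, which could explain a very low accuracy. Whether the evaluation code silently skips such keys is an assumption that would need checking.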

Hcyang-NULL avatar Sep 13 '22 06:09 Hcyang-NULL

Excuse me, could you tell me where I can find the file named 'val_official.json'? @Hcyang-NULL

fuchun-wang avatar Nov 22 '22 08:11 fuchun-wang