ViT-B IN21K weights

Open umarkhalidAI opened this issue 3 years ago • 4 comments

Can you please share the converted weights of IN21K that can be used to finetune for action recognition?

Oct 13 '22 20:10 umarkhalidAI

Hi,

Please find the checkpoint at https://github.com/ShoufaChen/AdaptFormer/releases/download/v0.1/vit_base_patch16_224_in21k_to_video_tz.pth

Oct 15 '22 01:10 ShoufaChen

It gives the following error when I use the above weights:

  File "/home/taoyang/PycharmProjects/AdaptFormer-main/util/pos_embed.py", line 122, in interpolate_pos_embed_ori
    pos_tokens = pos_tokens.reshape(-1, orig_size, orig_size, embedding_size).permute(0, 3, 1, 2)
RuntimeError: shape '[-1, 14, 14, 768]' is invalid for input of size 151296
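The numbers in the error message can be checked by hand. Below is a minimal sketch of that arithmetic. It assumes the tensor being reshaped is a single positional embedding with batch dimension 1, which the traceback does not confirm. The variable names `numel`, `num_tokens`, `extra_tokens`, and `fits_grid` are illustrative and do not come from the repository.

```python
# Values taken from the error message above.
embedding_size = 768   # last dimension of the requested shape
orig_size = 14         # side length of the 14 x 14 patch grid
numel = 151296         # "input of size 151296"

# Number of embedding vectors actually present in the tensor.
num_tokens = numel // embedding_size

# Tokens left over after filling the 14 x 14 grid (196 patch tokens).
extra_tokens = num_tokens - orig_size * orig_size

# reshape(-1, 14, 14, 768) only works if the element count is a
# multiple of 14 * 14 * 768.
fits_grid = numel % (orig_size * orig_size * embedding_size) == 0

print(num_tokens, extra_tokens, fits_grid)
```

Under that assumption the tensor holds 197 tokens, one more than the 196 that fit the grid. This would be consistent with a class-token position still being included in `pos_tokens`, though the complete log would be needed to confirm it.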

Oct 15 '22 13:10 umarkhalidAI

Could you post the complete log?

Oct 16 '22 01:10 ShoufaChen

I think we have to delete the following keys:

    del checkpoint_model['cls_token']
    del checkpoint_model['pos_embed']
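A slightly more defensive version of the same idea is sketched below. The helper name `drop_keys` and the placeholder dictionary are hypothetical and used only for illustration. Whether dropping these keys is the intended fix has not been confirmed in this thread.

```python
def drop_keys(checkpoint_model, keys=("cls_token", "pos_embed")):
    """Remove the given keys from a state dict in place.

    Returns the list of keys that were actually present and removed,
    so a missing key does not raise KeyError.
    """
    removed = []
    for key in keys:
        if key in checkpoint_model:
            del checkpoint_model[key]
            removed.append(key)
    return removed


# Placeholder state dict standing in for the loaded checkpoint
# (real values would be tensors; the key "blocks.0.weight" is made up).
checkpoint_model = {"cls_token": 0, "pos_embed": 1, "blocks.0.weight": 2}
removed = drop_keys(checkpoint_model)
print(removed)                 # keys that were dropped
print(list(checkpoint_model))  # keys that remain
```

With a real PyTorch model, the filtered dictionary would then have to be loaded with `model.load_state_dict(checkpoint_model, strict=False)`, since the removed keys are now missing. Note that the model's `cls_token` and `pos_embed` then keep their initial values instead of the pretrained ones.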

Oct 16 '22 13:10 umarkhalidAI