AdaptFormer
ViT-B IN21K weights
Could you please share the converted IN21K weights that can be used to fine-tune for action recognition?
Hi,
Please find the checkpoint at https://github.com/ShoufaChen/AdaptFormer/releases/download/v0.1/vit_base_patch16_224_in21k_to_video_tz.pth
It gives the following error when I use the above weights:
  File "/home/taoyang/PycharmProjects/AdaptFormer-main/util/pos_embed.py", line 122, in interpolate_pos_embed_ori
    pos_tokens = pos_tokens.reshape(-1, orig_size, orig_size, embedding_size).permute(0, 3, 1, 2)
RuntimeError: shape '[-1, 14, 14, 768]' is invalid for input of size 151296
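A quick check of the numbers in that error message may help narrow things down. This is only a sketch of the arithmetic: the values 14, 768 and 151296 come from the traceback, while the reading that the one leftover token is a class token is an assumption, not something the log confirms.

```python
# Values taken from the RuntimeError above.
embedding_size = 768
orig_size = 14
numel = 151296

# How many token vectors of width 768 does the tensor hold?
num_tokens, remainder = divmod(numel, embedding_size)
print(num_tokens, remainder)  # 197 0

# A 14 x 14 grid accounts for 196 patch tokens.
patch_tokens = orig_size * orig_size
extra_tokens = num_tokens - patch_tokens
print(extra_tokens)  # 1

# reshape(-1, 14, 14, 768) needs the element count to be a multiple of 14*14*768.
reshape_ok = numel % (patch_tokens * embedding_size) == 0
print(reshape_ok)  # False
```

So the tensor being reshaped appears to hold 197 tokens rather than 196, which would be consistent with one extra token (possibly the class token) not being split off before the reshape.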
Could you post the complete log?
I think we have to delete the following keys:
del checkpoint_model['cls_token']
del checkpoint_model['pos_embed']
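A minimal sketch of that suggestion, assuming `checkpoint_model` is a plain dict mapping parameter names to tensors. The helper name `drop_keys` and the key `patch_embed.proj.weight` are hypothetical, and strings stand in for tensors so the snippet runs without PyTorch.

```python
def drop_keys(checkpoint_model, keys=("cls_token", "pos_embed")):
    """Remove the given keys from a checkpoint dict in place; return the ones removed."""
    removed = []
    for key in keys:
        # Guard against keys that are absent so the call never raises KeyError.
        if key in checkpoint_model:
            del checkpoint_model[key]
            removed.append(key)
    return removed


# Placeholder checkpoint: strings stand in for tensors (hypothetical contents).
checkpoint_model = {
    "cls_token": "tensor",
    "pos_embed": "tensor",
    "patch_embed.proj.weight": "tensor",
}

removed = drop_keys(checkpoint_model)
print(removed)                   # ['cls_token', 'pos_embed']
print(sorted(checkpoint_model))  # ['patch_embed.proj.weight']
```

If the filtered dict is then loaded with PyTorch's `model.load_state_dict(checkpoint_model, strict=False)`, the missing keys are tolerated and those parameters keep the model's own initialization. Note that this discards the pretrained position embeddings rather than interpolating them, so whether it is the right fix here is a judgment call.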