TimeSformer-pytorch

Imagenet Pretrained Weights

Open RaivoKoot opened this issue 3 years ago • 5 comments

Thanks for the work! In their paper, they say: "For all our experiments, we adopt the “Base” ViT model architecture (Dosovitskiy et al., 2020) pretrained on ImageNet."

I know you said that the official weights trained on Kinetics and similar datasets have not been released yet. However, I am not interested in those. What I need are the initial weights of the network that come purely from ViT ImageNet pretraining, because I need to train this implementation of yours starting from them. From what I can tell, you don't have ImageNet-pretrained weights for this implementation, do you?

RaivoKoot, Apr 01 '21 20:04

+1

jwohlwend, Apr 05 '21 02:04

+1

MohamedOsman1998, Apr 12 '21 23:04

I am not sure it is possible, but ViT weights are available in timm and other libraries.
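
As a rough illustration of what starting from timm's ViT weights could look like, here is a minimal sketch. It only copies tensors whose (optionally renamed) key and shape match between the two models. The helper names `transferable_keys` and `init_from_timm_vit` and the `rename` hook are hypothetical, and I have not checked how this repository names its parameters, so without a suitable `rename` function the overlap may well be empty. Temporal attention layers have no counterpart in an image ViT and would stay at their random initialization.

```python
from typing import Callable, Dict, List, Tuple

Shape = Tuple[int, ...]


def transferable_keys(
    src: Dict[str, Shape],
    dst: Dict[str, Shape],
    rename: Callable[[str], str] = lambda k: k,
) -> List[Tuple[str, str]]:
    """Return (source_key, target_key) pairs whose shapes are identical.

    `rename` maps a source parameter name to the name used by the target
    model. The default assumes both models use the same names.
    """
    pairs = []
    for src_key, shape in src.items():
        dst_key = rename(src_key)
        if dst.get(dst_key) == shape:
            pairs.append((src_key, dst_key))
    return pairs


def init_from_timm_vit(model, rename: Callable[[str], str] = lambda k: k) -> List[str]:
    """Copy matching ImageNet ViT-Base weights from timm into `model`.

    Hypothetical helper: `model` is any torch.nn.Module. Requires timm and
    downloads the checkpoint on first use.
    """
    import timm  # imported lazily so the pure helper above needs no extra packages

    vit = timm.create_model("vit_base_patch16_224", pretrained=True)
    src_sd = vit.state_dict()
    dst_sd = model.state_dict()

    pairs = transferable_keys(
        {k: tuple(v.shape) for k, v in src_sd.items()},
        {k: tuple(v.shape) for k, v in dst_sd.items()},
        rename,
    )
    # strict=False leaves every parameter without a match untouched.
    model.load_state_dict({d: src_sd[s] for s, d in pairs}, strict=False)
    return [d for _, d in pairs]
```

Calling `init_from_timm_vit(model)` returns the list of target parameter names that were overwritten, which makes it easy to see how much of the network actually received pretrained weights.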

tcapelle, Apr 21 '21 09:04

For now, I found this repository that includes ViT initialization (https://github.com/m-bain/video-transformers).

RaivoKoot, Apr 21 '21 10:04

You can even download a pretrained ViT from huggingface: https://huggingface.co/google/vit-large-patch16-224

tcapelle, Apr 21 '21 11:04