VideoMAEv2
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
Thank you for your great contributions! Where can I find the implementation of model inference on a new, unseen dataset? Looking forward to your reply! Thank you.
Hello, in your unlabeled pre-training phase you train on two datasets (SSv2 and Kinetics) with a reconstruction loss, and later fine-tune with a loss computed against their labels. Did you find that...
I wonder whether the preparation of a custom AVA-format dataset is the same as for VideoMAE, and whether the process of fine-tuning on an AVA-format custom dataset is the same as the process in...
Great work, and thanks for the code! I was just wondering how you see the chances that, with a proper masking strategy, one could do full next-frame prediction on an...
Hello, thank you very much for your significant contribution to the computer vision community! When I set my input resolution to 112×112 and do the pre-training with a ViT-Small backbone, the...
Great work!! Unfortunately, this link is broken: [vit_b_k710_dl_from_giant.pth](https://pjlab-gvm-data.oss-cn-shanghai.aliyuncs.com/internvideo/distill/vit_b_k710_dl_from_giant.pth). I really want to try your model and do some interesting work; I hope you can fix it. Thanks.
In the training stage, do the images have to be loaded into the range [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224,...
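The preprocessing asked about above can be sketched as follows. This is a minimal illustration, not the repo's actual pipeline; the std value is truncated in the question, so the third component used here is an assumption based on the commonly used ImageNet statistics.

```python
import numpy as np

# Mean from the question; std's last entry is assumed (the question truncates it).
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(frame_uint8: np.ndarray) -> np.ndarray:
    """Scale an HxWx3 uint8 frame to [0, 1], then normalize each channel."""
    x = frame_uint8.astype(np.float32) / 255.0  # load into [0, 1]
    return (x - MEAN) / STD                     # channel-wise normalization

# Usage: a dummy 224x224 RGB frame of mid-gray pixels
frame = np.full((224, 224, 3), 128, dtype=np.uint8)
out = preprocess(frame)
```

Frameworks such as torchvision express the same two steps as `ToTensor()` followed by `Normalize(mean, std)`.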
Thank you for sharing your work. I've tried to download the models stored on the server https://pjlab-gvm-data.oss-cn-shanghai.aliyuncs.com/internvideo/distill/ but it seems they are no longer available. Would it be possible to...
Knowledge distillation: "A good teacher is patient and consistent" (TensorFlow implementation: https://github.com/google-research/big_vision/tree/main/big_vision/configs/proj/distill). Do you have plans to open-source the distillation code?
Thanks for the code and work! Why is there no CLS token?