VideoMAEv2
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
Thank you for your great contributions! Where can I find the implementation of model inference on a new, unseen dataset? Looking forward to your reply! Thank you.
Hello, in your unlabeled pre-training phase you train on two datasets (SSv2 and Kinetics) with a reconstruction loss, and later fine-tune with a loss computed against their labels. Did you find that...
I wonder whether the preparation of a custom AVA-format dataset is the same as for VideoMAE, and whether the process of fine-tuning on an AVA-format custom dataset is the same as the process in...
Great work, and thanks for the code! I was just wondering how you see the chances that, with a proper masking strategy, one could do full next-frame prediction on an...
Hello, thank you very much for your significant contribution to the computer vision community! When I set my input resolution to 112×112 and do the pre-training with a ViT-Small backbone, the...
Great work!! Unfortunately, this link is broken: [vit_b_k710_dl_from_giant.pth](https://pjlab-gvm-data.oss-cn-shanghai.aliyuncs.com/internvideo/distill/vit_b_k710_dl_from_giant.pth). I really want to try your model and do some interesting work; I hope you can fix it. Thanks.
In the training stage, do the images have to be loaded into the range [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224,...
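The preprocessing asked about above can be sketched as follows. This is a minimal illustration, not the repo's actual pipeline; the std value is truncated in the question, so the third component used here is an assumption based on the commonly used ImageNet statistics.

```python
import numpy as np

# Mean from the question; std's last entry is assumed (the question truncates it).
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(frame_uint8: np.ndarray) -> np.ndarray:
    """Scale an HxWx3 uint8 frame to [0, 1], then normalize each channel."""
    x = frame_uint8.astype(np.float32) / 255.0  # load into [0, 1]
    return (x - MEAN) / STD                     # channel-wise normalization

# Usage: a dummy 224x224 RGB frame of mid-gray pixels
frame = np.full((224, 224, 3), 128, dtype=np.uint8)
out = preprocess(frame)
```

Frameworks such as torchvision express the same two steps as `ToTensor()` followed by `Normalize(mean, std)`.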
Thank you for sharing your work. I've tried to download the models stored on the server https://pjlab-gvm-data.oss-cn-shanghai.aliyuncs.com/internvideo/distill/ but it seems they are no longer available. Would it be possible to...
Knowledge distillation: "A good teacher is patient and consistent" (TensorFlow implementation: https://github.com/google-research/big_vision/tree/main/big_vision/configs/proj/distill). Do you have plans to open-source the distillation code?
Thanks for the code and work! Why is there no CLS token?