vision-transformer-pytorch Hello, how can this be used for video classification? For example, on 5-minute short videos

Hello, how can this be used for video classification? For example, on 5-minute short videos

Open dotsonliu opened this issue 4 years ago • 4 comments

Feb 19 '21 14:02 dotsonliu

Thanks for your question. We haven't applied the model to video classification. However, you could use the ViT as a base model to encode each frame of your video.
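
As a rough illustration of the per-frame idea (a minimal sketch, not something implemented in this repo): each frame is passed through an image encoder, the per-frame features are averaged over time, and a linear head classifies the result. The class name `FrameAveragingClassifier` and the tiny stand-in encoder are hypothetical; in practice the encoder would be a ViT that returns one feature vector per image, and mean pooling is only one simple way to combine frames.

```python
import torch
import torch.nn as nn


class FrameAveragingClassifier(nn.Module):
    """Encode each frame independently, average over time, then classify."""

    def __init__(self, frame_encoder: nn.Module, feature_dim: int, num_classes: int):
        super().__init__()
        # Assumption: frame_encoder maps (N, C, H, W) -> (N, feature_dim).
        self.frame_encoder = frame_encoder
        self.head = nn.Linear(feature_dim, num_classes)

    def forward(self, video: torch.Tensor) -> torch.Tensor:
        # video has shape (batch, frames, channels, height, width).
        b, t, c, h, w = video.shape
        frames = video.reshape(b * t, c, h, w)        # treat frames as a big image batch
        features = self.frame_encoder(frames)         # (b * t, feature_dim)
        pooled = features.reshape(b, t, -1).mean(dim=1)  # average over the time axis
        return self.head(pooled)                      # (batch, num_classes)


# Hypothetical stand-in encoder so the sketch runs without pretrained weights.
toy_encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64))
model = FrameAveragingClassifier(toy_encoder, feature_dim=64, num_classes=10)

video_batch = torch.randn(2, 8, 3, 32, 32)  # 2 clips, 8 frames each
logits = model(video_batch)
print(logits.shape)
```

Mean pooling ignores frame order, so this kind of baseline cannot capture motion; it only summarizes what appears in the frames.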

Feb 23 '21 09:02 christy-yuan-li

When there are hundreds of frames, how should I deal with them?

Feb 23 '21 10:02 dotsonliu

Thank you for your question. How to process videos efficiently is an interesting problem, but it is not the focus of this repo. We are happy to discuss this potential application with you, though perhaps in some other venue. I would suggest checking the related literature first. My previous response was mainly meant to convey that ViT can be used for processing images in general.
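
For the many-frames concern, one simple starting point (a sketch under stated assumptions, not a recommendation from this repo) is to encode only a fixed number of frames sampled evenly across the video. The helper name `sample_frame_indices` is hypothetical, and the 30 frames-per-second figure in the usage lines is an assumed value for illustration only.

```python
from typing import List


def sample_frame_indices(total_frames: int, num_samples: int) -> List[int]:
    """Pick num_samples frame indices spread evenly over the video.

    The video is split into num_samples equal segments and the middle
    frame of each segment is chosen.
    """
    if total_frames <= 0 or num_samples <= 0:
        raise ValueError("total_frames and num_samples must be positive")
    if num_samples >= total_frames:
        # Short video: keep every frame.
        return list(range(total_frames))
    segment = total_frames / num_samples
    return [int(segment * i + segment / 2) for i in range(num_samples)]


# Assumed example: a 5-minute video at 30 frames per second has 9000 frames.
indices = sample_frame_indices(total_frames=5 * 60 * 30, num_samples=16)
print(len(indices), indices[0], indices[-1])
```

Only the frames at these indices would then be decoded and passed to the image encoder, which keeps the cost fixed regardless of video length.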

Feb 23 '21 11:02 christy-yuan-li

Directly using ViT to process video may give poor results.

Mar 18 '21 07:03 runningJ
