Wenhao Wu
MVFNet
【AAAI'2021】MVFNet: Multi-View Fusion Network for Efficient Video Recognition
DSANet
【ACMMM'2021】DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning
Text4Vis
【AAAI'2023 & IJCV】Transferring Vision-Language Models for Visual Recognition: A Classifier Perspective
BIKE
【CVPR'2023】Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models
ATM
【ICCV'2023】What Can Simple Arithmetic Operations Do for Temporal Modeling?
Cap4Video
【CVPR'2023 Highlight & TPAMI】Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
GPT4Vis
GPT4Vis: What Can GPT-4 Do for Zero-shot Visual Recognition?
FreeVA
FreeVA: Offline MLLM as Training-Free Video Assistant