long-video-understanding topic
MovieChat
[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding
GVL
Official implementation of the paper "Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos"
LLMVA-GEBC
Winning solution to the Generic Event Boundary Captioning task in the LOVEU Challenge (CVPR 2023 workshop)
TESTA
[EMNLP 2023] TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding
LangRepo
Language Repository for Long Video Understanding
MiniGPT4-video
Official code for the Goldfish model for long video understanding and the MiniGPT4-video model for short video understanding
2024-ICLR-Norton
Multi-granularity Correspondence Learning from Long-term Noisy Videos [ICLR 2024, Oral]
MLVU
🔥🔥MLVU: Multi-task Long Video Understanding Benchmark
Video-RAG-master
✨✨[NeurIPS 2025] This is the official implementation of our paper "Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension"
LSDBench
A benchmark that focuses on the sampling dilemma in long-video tasks. Its tasks are designed to evaluate the sampling efficiency of long-video VLMs. (ICCV 2025)