video-language-pretraining topic

Video-LLaMA

2.7k stars, 242 forks, 15 watchers

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Shot2Story

92 stars, 6 forks

Shot2Story: a new multi-shot video understanding benchmark with comprehensive video summaries and detailed shot-level captions.

2024-ICLR-Norton

108 stars, 8 forks

Multi-granularity Correspondence Learning from Long-term Noisy Videos [ICLR 2024, Oral]

VideoLLaMB

35 stars, 0 forks

Official Repository of VideoLLaMB: Long Video Understanding with Recurrent Memory Bridges