video-text-retrieval topics

The Paper List of Large Multi-Modality Model (Perception, Generation, Unification), Parameter-Efficient Finetuning, Vision-Language Pretraining, Conventional Image-Text Matching for Preliminary Insigh...

Paranioar

awesome

awesome-list

cross-modal-retrieval

image-retrieval

ALPRO

184 Stars, 18 Forks

Align and Prompt: Video-and-Language Pre-training with Entity Prompts

salesforce

prompt-learning

representation-learning

video-language

video-question-answering

CondensedMovies

151 Stars, 26 Forks

Story-Based Retrieval with Contextual Embeddings. Largest freely available movie video dataset. [ACCV'20]

m-bain

dataset

precomputed-features

retrieval

source-videos

crossmodal-contrastive-learning

56 Stars, 11 Forks

CrossCLR: Cross-modal Contrastive Learning For Multi-modal Video Representations, ICCV 2021

amazon-science

computer-vision

contrastive-learning

multi-modality

natural-language-processing

MAC

23 Stars, 0 Forks

An end-to-end masked contrastive video-and-language pre-training framework

shufangxun

activitynet

clip

contrastive-learning

didemo

Cross-Modal-Adapter

51 Stars, 2 Forks

[arXiv] Cross-Modal Adapter for Text-Video Retrieval

LeapLabTHU

adapter

clip

deep-learning

machine-learning

Cap4Video

222 Stars, 16 Forks

[CVPR'2023 Highlight & TPAMI] Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?

whwu95

cross-modal-learning

video-language-understanding

video-text-retrieval

video-understanding

TESTA

49 Stars, 3 Forks

[EMNLP 2023] TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding

RenShuhuai-Andy

long-video-understanding

video-qa

video-text-retrieval

video-understanding