cross-modal-retrieval topic

List of cross-modal-retrieval repositories

clip-as-service

12.2k Stars, 2.1k Forks

🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP

xmodaler

1.0k Stars, 112 Forks

X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense r...

pvse

131 Stars, 24 Forks

Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval (CVPR 2019)

Awesome_Matching_Pretraining_Transfering

357 Stars, 47 Forks

A paper list covering large multi-modality models, parameter-efficient finetuning, vision-language pretraining, and conventional image-text matching, for preliminary insight.

SGRAF

200 Stars, 37 Forks

[AAAI2021] The code of “Similarity Reasoning and Filtration for Image-Text Matching”

vse_infty

149 Stars, 18 Forks

Code for "Learning the Best Pooling Strategy for Visual Semantic Embedding", CVPR 2021

muscall

98 Stars, 8 Forks

Official implementation of "Contrastive Audio-Language Learning for Music" (ISMIR 2022)

VLDeformer

26 Stars, 3 Forks

PyTorch implementation of the paper "VLDeformer: Vision Language Decomposed Transformer for Fast Cross-modal Retrieval", KBS 2022

TextReID

42 Stars, 5 Forks

[BMVC 2021] Text-Based Person Search with Limited Data

objects-that-sound

32 Stars, 4 Forks

An unofficial implementation of the paper "Objects that Sound" (ECCV 2018).