vision-language topic
VAST
Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset
VLN-BEVBert
[ICCV 2023} Official repo of "BEVBert: Multimodal Map Pre-training for Language-guided Navigation"
STALE
[ECCV 2022] Official Pytorch Implementation of the paper : " Zero-Shot Temporal Action Detection via Vision-Language Prompting "
MCM
PyTorch implementation of MCM (Delving into out-of-distribution detection with vision-language representations), NeurIPS 2022
CG-VLM
This is the official repo for Contrastive Vision-Language Alignment Makes Efficient Instruction Learner.
PODA
[ICCV 2023] Official implementation of "PØDA: Prompt-driven Zero-shot Domain Adaptation"
CoTConsistency
The released data for paper "Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models".
Clip2Protect
[CVPR 2023] Official repository of paper titled "CLIP2Protect: Protecting Facial Privacy using Text-Guided Makeup via Adversarial Latent Search".
RemoteCLIP
🛰️ Official repository of paper "RemoteCLIP: A Vision Language Foundation Model for Remote Sensing" (IEEE TGRS)
Awesome-RSITR
🎮 A Benchmark and Awesome Collection of Methods for Remote Sensing Image-Text Retrieval (RSITR)| Remote Sensing Cross-Model Retrieval (RSCMR) | Remote Sensing Vision-Lanuage Models (RSVLMs)