vision-and-language topic

List vision-and-language repositories

hateful_memes-hate_detectron

53
Stars
19
Forks

Detecting Hate Speech in Memes Using Multimodal Deep Learning Approaches: prize-winning solution to the Hateful Memes Challenge. https://arxiv.org/abs/2012.12975

VLCAP

28
Stars
5
Forks

[ICIP 2022] VLCap: Vision-Language with Contrastive Learning for Coherent Video Paragraph Captioning

open-fashion-clip

48
Stars
5
Forks

This is the official repository for the paper "OpenFashionCLIP: Vision-and-Language Contrastive Learning with Open-Source Fashion Data". ICIAP 2023

MAC

23
Stars
0
Forks

An end-to-end masked contrastive video-and-language pre-training framework

DRFT

18
Stars
0
Forks

End-to-end Multi-modal Video Temporal Grounding, NeurIPS 2021

A curated list of awesome vision and language resources for earth observation.

RS5M

168
Stars
6
Forks

RS5M: a large-scale vision language dataset for remote sensing

Cross-Modal-Adapter

51
Stars
2
Forks

[arXiv] Cross-Modal Adapter for Text-Video Retrieval

VidChapters

175
Stars
20
Forks

[NeurIPS 2023 D&B] VidChapters-7M: Video Chapters at Scale

EDA

96
Stars
4
Forks

[CVPR 2023] EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding