vision-and-language topic

List of vision-and-language repositories

pytorch_sscr

Stars: 23, Forks: 5

A PyTorch implementation of SSCR

HiREST

Stars: 90, Forks: 9

Hierarchical Video-Moment Retrieval and Step-Captioning (CVPR 2023)

FactualSceneGraph

Stars: 85, Forks: 12

The FACTUAL benchmark dataset and the pre-trained textual scene graph parser trained on FACTUAL.

x-lxmert

Stars: 50, Forks: 10

PyTorch code for EMNLP 2020 paper "X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers"

PartGlot

Stars: 30, Forks: 4

Official Implementation of PartGlot (CVPR 2022 Oral)

lang2seg

Stars: 30, Forks: 8

Referring Expression Object Segmentation with Caption-Aware Consistency, BMVC 2019

TSGV-Learning-List

Stars: 31, Forks: 3

Related work on Temporal Sentence Grounding in Videos / Natural Language Video Localization / Video Moment Retrieval

GroundVLP

Stars: 33, Forks: 2

GroundVLP: Harnessing Zero-shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection (AAAI 2024)