vision-and-language topic

List of vision-and-language repositories

cyclical-visual-captioning

42 stars, 3 forks

PyTorch code for: Learning to Generate Grounded Visual Captions without Localization Supervision

captioning-images

captioning-videos

vision-and-language

regretful-agent

123 stars, 23 forks

PyTorch code for CVPR 2019 paper: The Regretful Agent: Heuristic-Aided Navigation through Progress Estimation

vision-and-language

selfmonitoring-agent

117 stars, 18 forks

PyTorch code for ICLR 2019 paper: Self-Monitoring Navigation Agent via Auxiliary Progress Estimation

vision-and-language

clip_playground

138 stars, 11 forks

An ever-growing playground of notebooks showcasing CLIP's impressive zero-shot capabilities

vision-and-language

Matterport3DSimulator

455 stars, 129 forks

AI Research Platform for Reinforcement Learning from Real Panoramic Images.

matterport3d-dataset

matterport3d-simulator

natural-language-processing

reinforcement-learning

CBP

59 stars, 9 forks

Official TensorFlow implementation of the AAAI 2020 paper "Temporally Grounding Language Queries in Videos by Contextual Boundary-aware Prediction"

action-localization

video-grounding

video-moment-retrieval

ClipBERT

688 stars, 85 forks

[CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning on image-text and video-text tasks.

video-question-answering

video-retrieval

HERO

228 stars, 35 forks

Research code for EMNLP 2020 paper "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training"

HERO_Video_Feature_Extractor

91 stars, 14 forks

Video Feature Extraction Code for EMNLP 2020 paper "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training"

just-ask

114 stars, 15 forks

[ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos

multimodal-learning

question-generation

video-question-answering