vision-and-language topic

Repositories tagged vision-and-language

cyclical-visual-captioning

Stars: 42, Forks: 3

PyTorch code for: Learning to Generate Grounded Visual Captions without Localization Supervision

regretful-agent

Stars: 123, Forks: 23

PyTorch code for CVPR 2019 paper: The Regretful Agent: Heuristic-Aided Navigation through Progress Estimation

selfmonitoring-agent

Stars: 117, Forks: 18

PyTorch code for ICLR 2019 paper: Self-Monitoring Navigation Agent via Auxiliary Progress Estimation

clip_playground

Stars: 138, Forks: 11

An ever-growing playground of notebooks showcasing CLIP's impressive zero-shot capabilities

Matterport3DSimulator

Stars: 455, Forks: 129

AI Research Platform for Reinforcement Learning from Real Panoramic Images.

CBP

Stars: 59, Forks: 9

Official TensorFlow implementation of the AAAI 2020 paper "Temporally Grounding Language Queries in Videos by Contextual Boundary-aware Prediction"

ClipBERT

Stars: 688, Forks: 85

[CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning on image-text and video-text tasks.

HERO

Stars: 228, Forks: 35

Research code for EMNLP 2020 paper "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training"

HERO_Video_Feature_Extractor

Stars: 91, Forks: 14

Video feature extraction code for EMNLP 2020 paper "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training"

just-ask

Stars: 114, Forks: 15

[ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos