vision-and-language topic

List of vision-and-language repositories

vognet-pytorch

67
Stars
7
Forks

[CVPR20] Video Object Grounding using Semantic Roles in Language Description (https://arxiv.org/abs/2003.10606)

Pseudo-Q

139
Stars
9
Forks

[CVPR 2022] Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding

awesome-vision-language-navigation

286
Stars
17
Forks

A curated list for vision-and-language navigation. ACL 2022 paper "Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions"

VL_adapter

204
Stars
16
Forks

PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR2022)

vilio

88
Stars
29
Forks

🥶Vilio: State-of-the-art VL models in PyTorch & PaddlePaddle

LightningDOT

72
Stars
9
Forks

Source code and pre-trained/fine-tuned checkpoint for the NAACL 2021 paper LightningDOT

MIA

64
Stars
9
Forks

Code for "Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations" (NeurIPS 2019)

multimodal

71
Stars
7
Forks

A collection of multimodal datasets and visual features for VQA and captioning in PyTorch. Just run "pip install multimodal"

rva

65
Stars
14
Forks

Code for CVPR'19 "Recursive Visual Attention in Visual Dialog"