Vision and Language Group@ MIL
10 repositories owned by Vision and Language Group@ MIL
mcan-vqa (432 stars, 88 forks)
Deep Modular Co-Attention Networks for Visual Question Answering
openvqa (309 stars, 64 forks)
A lightweight, scalable, and general framework for visual question answering research
bottom-up-attention.pytorch (289 stars, 74 forks)
A PyTorch reimplementation of bottom-up-attention models
activitynet-qa (55 stars, 9 forks)
A VideoQA dataset based on the videos from ActivityNet
mt-captioning (24 stars, 7 forks)
A PyTorch implementation of the paper "Multimodal Transformer with Multiview Visual Representation for Image Captioning"
rosita (55 stars, 13 forks)
ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration
prophet (261 stars, 27 forks)
Implementation of the CVPR 2023 paper "Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering"