visual-grounding topic
D3Net
[ECCV 2022] D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding
ScanRefer
[ECCV 2020] ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language
cyclical-visual-captioning
PyTorch code for: Learning to Generate Grounded Visual Captions without Localization Supervision
TubeDETR
[CVPR 2022 Oral] TubeDETR: Spatio-Temporal Video Grounding with Transformers
SeqTR
SeqTR: A Simple yet Universal Network for Visual Grounding
awesome-grounding
awesome grounding: A curated list of research papers in visual grounding
PhraseCutDataset
Dataset API for "PhraseCut: Language-based Image Segmentation in the Wild"
vognet-pytorch
[CVPR 2020] Video Object Grounding using Semantic Roles in Language Description (https://arxiv.org/abs/2003.10606)
Pseudo-Q
[CVPR 2022] Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding
vRGV
Visual Relation Grounding in Videos (ECCV'20, Spotlight)