grounding topic
Grounding_LLMs_with_online_RL
We perform functional grounding of LLMs' knowledge in BabyAI-Text
hulc2
[ICRA2023] Grounding Language with Visual Affordances over Unstructured Data
lumos
Code and data for "Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs"
Video-LLaVA
PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models
DUET
[Paper][AAAI 2023] DUET: Cross-modal Semantic Grounding for Contrastive Zero-shot Learning
LAR-Look-Around-and-Refer
This is the official implementation for our paper, "LAR: Look Around and Refer".
Cradle
The Cradle framework is a first attempt at General Computer Control (GCC). Cradle supports agents to ace any computer task by enabling strong reasoning abilities, self-improvement, and skill curation,...
Groma
[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization
CLIP-VG
[TMM 2023] Self-paced Curriculum Adapting of CLIP for Visual Grounding.
SelfEQ
[CVPR 2024] Code for "Improved Visual Grounding through Self-Consistent Explanations".