vision-language-learning topic
vision-language-learning repositories
OPERA
265 stars, 24 forks
[CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
Ovis
1.4k stars, 83 forks, 1.4k watchers
A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.
Situation3D
17 stars, 1 fork
[CVPR 2024] Situational Awareness Matters in 3D Vision Language Reasoning
RLAIF-V
427 stars, 19 forks, 427 watchers
[CVPR'25 highlight] RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness
Modality-Integration-Rate
107 stars, 2 forks, 107 watchers
[ICCV 2025] The official code of the paper "Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate".