vision-language-models topic
GPA-LM
This repo is a live list of papers on game playing and large multimodal models - "A Survey on Game Playing Agents and Large Models: Methods, Applications, and Challenges".
MyVLM
Official Implementation for "MyVLM: Personalizing VLMs for User-Specific Queries"
DenseFusion
DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception
EVE
EVE: Encoder-Free Vision-Language Models
Awesome-LVLM-Hallucination
An up-to-date curated list of state-of-the-art research, papers, and resources on hallucination in large vision-language models
apiprompting
[ECCV 2024] API: Attention Prompting on Image for Large Vision-Language Models
CPL-ICML2024
[ICML 2024] Official code repo for the ICML 2024 paper "Candidate Pseudolabel Learning: Enhancing Vision-Language Models by Prompt Tuning with Unlabeled Data"