vision-language-models topic
GPA-LM
This repo is a live list of papers on game playing and large multimodal models: "A Survey on Game Playing Agents and Large Models: Methods, Applications, and Challenges".
MyVLM
Official Implementation for "MyVLM: Personalizing VLMs for User-Specific Queries" (ECCV 2024)
DenseFusion
DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception
EVE
EVE Series: Encoder-Free Vision-Language Models from BAAI
Awesome-LVLM-Hallucination
An up-to-date curated list of state-of-the-art research, papers, and resources on hallucination in large vision-language models
apiprompting
[ECCV 2024] API: Attention Prompting on Image for Large Vision-Language Models
CPL-ICML2024
[ICML 2024] Official code repo for the ICML 2024 paper "Candidate Pseudolabel Learning: Enhancing Vision-Language Models by Prompt Tuning with Unlabeled Data"
FAPrompt
Official PyTorch implementation of ICCV'25 paper "Fine-grained Abnormality Prompt Learning for Zero-shot Anomaly Detection".
JarvisArt
[NeurIPS 2025] JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent
4KAgent
[NeurIPS 2025] 4KAgent: Agentic Any Image to 4K Super-Resolution. An intelligent computer vision agent that can magically restore any image to perfect-4K!