vision-language-models topic

List of vision-language-models repositories

GPA-LM

158
Stars
7
Forks
158
Watchers

This repo is a live list of papers on game playing and large multimodal models: "A Survey on Game Playing Agents and Large Models: Methods, Applications, and Challenges".

MyVLM

181
Stars
11
Forks
181
Watchers

Official Implementation for "MyVLM: Personalizing VLMs for User-Specific Queries" (ECCV 2024)

DenseFusion

157
Stars
1
Forks
157
Watchers

DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception

EVE

361
Stars
12
Forks
361
Watchers

EVE Series: Encoder-Free Vision-Language Models from BAAI

Awesome-LVLM-Hallucination

237
Stars
9
Forks
237
Watchers

Up-to-date curated list of state-of-the-art research work, papers, and resources on hallucination in large vision-language models

apiprompting

106
Stars
6
Forks
106
Watchers

[ECCV 2024] API: Attention Prompting on Image for Large Vision-Language Models

CPL-ICML2024

31
Stars
3
Forks
31
Watchers

[ICML 2024] Official code repo for the ICML 2024 paper "Candidate Pseudolabel Learning: Enhancing Vision-Language Models by Prompt Tuning with Unlabeled Data"

FAPrompt

24
Stars
2
Forks
24
Watchers

Official PyTorch implementation of ICCV'25 paper "Fine-grained Abnormality Prompt Learning for Zero-shot Anomaly Detection".

JarvisArt

682
Stars
23
Forks
682
Watchers

[NeurIPS 2025] JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent

4KAgent

701
Stars
37
Forks
701
Watchers

[NeurIPS 2025] 4KAgent: Agentic Any Image to 4K Super-Resolution. An intelligent computer vision agent that can magically restore any image to perfect-4K!