large-vision-language-model topic
large-vision-language-model repositories
Awesome-Multimodal-Large-Language-Models
Stars: 9.7k, Forks: 631, Watchers: 208
✨✨ Latest Papers and Datasets on Multimodal Large Language Models, and Their Evaluation.
InternLM-XComposer
Stars: 1.8k, Forks: 118
InternLM-XComposer2 is a groundbreaking vision-language large model (VLLM) excelling in free-form text-image composition and comprehension.
MoE-LLaVA
Stars: 1.7k, Forks: 103, Watchers: 12
Mixture-of-Experts for Large Vision-Language Models
Video-LLaVA
Stars: 2.5k, Forks: 187
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
MMStar
Stars: 110, Forks: 1
This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models?"
thepipe
Stars: 786, Forks: 60
Feed PDFs, URLs, Slides, YouTube, and more into Vision-Language models with one line of code⚡