large-vision-language-model topic

List large-vision-language-model repositories

Awesome-Multimodal-Large-Language-Models

9.7k stars, 631 forks, 208 watchers

✨✨ Latest Papers and Datasets on Multimodal Large Language Models and Their Evaluation.

InternLM-XComposer

1.8k stars, 118 forks

InternLM-XComposer2 is a groundbreaking vision-language large model (VLLM) excelling in free-form text-image composition and comprehension.

MoE-LLaVA

1.7k stars, 103 forks, 12 watchers

Mixture-of-Experts for Large Vision-Language Models

Video-LLaVA

2.5k stars, 187 forks

Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

MMStar

110 stars, 1 fork

This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models?"

thepipe

786 stars, 60 forks

Feed PDFs, URLs, Slides, YouTube, and more into Vision-Language models with one line of code⚡