large-vision-language-model topic

List large-vision-language-model repositories

Awesome-Multimodal-Large-Language-Models

11.9k
Stars
765
Forks

✨✨ Latest Advances on Multimodal Large Language Models

InternLM-XComposer

2.5k
Stars
152
Forks

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

MoE-LLaVA

1.9k
Stars
121
Forks

Mixture-of-Experts for Large Vision-Language Models

Video-LLaVA

2.9k
Stars
207
Forks

[EMNLP 2024 🔥] Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

MMStar

199
Stars
5
Forks
199
Watchers

[NeurIPS 2024] Evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models?"

thepipe

1.1k
Stars
70
Forks

Extract clean markdown from PDFs, URLs, Word docs, slides, videos, and more, ready for any LLM. ⚡

CARES

40
Stars
2
Forks

[arXiv'24 & ICMLW'24] CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models

hawk

16
Stars
0
Forks

Hawk: Learning to Understand Open-World Video Anomalies

apiprompting

21
Stars
1
Forks

[ECCV 2024] API: Attention Prompting on Image for Large Vision-Language Models