llava topic

Repositories matching the llava topic:

LLaVA

17.1k stars · 1.8k forks · 135 watchers

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
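
To give a sense of what LLaVA-style models do, here is a minimal inference sketch. It uses the Hugging Face transformers port of LLaVA-1.5 (the llava-hf/llava-1.5-7b-hf checkpoint) rather than this repository's own CLI; the model ID and prompt template follow the Hugging Face model card, and the image URL is a placeholder.

    import requests
    from PIL import Image
    from transformers import AutoProcessor, LlavaForConditionalGeneration

    # Load the community HF port of LLaVA-1.5 (not the repo's native weights).
    model_id = "llava-hf/llava-1.5-7b-hf"
    processor = AutoProcessor.from_pretrained(model_id)
    model = LlavaForConditionalGeneration.from_pretrained(model_id)

    # Placeholder image URL; substitute any RGB image.
    image = Image.open(requests.get("https://example.com/cat.jpg", stream=True).raw)

    # LLaVA-1.5 chat template: the <image> token marks where visual features go.
    prompt = "USER: <image>\nWhat is shown in this image? ASSISTANT:"
    inputs = processor(text=prompt, images=image, return_tensors="pt")

    # Generate a short answer and decode it, skipping special tokens.
    output = model.generate(**inputs, max_new_tokens=64)
    print(processor.decode(output[0], skip_special_tokens=True))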

multimodal-maestro

963 stars · 68 forks

Effective prompting for Large Multimodal Models like GPT-4 Vision, LLaVA or CogVLM. 🔥

awesome-foundation-and-multimodal-models

520 stars · 39 forks

👁️ + 💬 + 🎧 = 🤖 Curated list of top foundation and multimodal models! [Paper + Code + Examples + Tutorials]

LRV-Instruction

225 stars · 14 forks

[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning

HallusionBench

185 stars · 2 forks

[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models

Video-ChatGPT

983 stars · 87 forks

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for...

VLMEvalKit

501 stars · 59 forks

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting GPT-4V, Gemini, QwenVLPlus, 40+ Hugging Face models, and 20+ benchmarks

LLaVAR

240 stars · 11 forks

Code/Data for the paper: "LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding"

ViP-LLaVA

173 stars · 10 forks

[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts

llava-docker

68 stars · 12 forks

Docker image for LLaVA: Large Language and Vision Assistant