visual-instruction-tuning topic

List of visual-instruction-tuning repositories

Awesome-Multimodal-Large-Language-Models

11.9k
Stars
765
Forks
208
Watchers

✨✨ Latest Advances on Multimodal Large Language Models

Osprey

756
Stars
43
Forks

[CVPR2024] The code for "Osprey: Pixel Understanding with Visual Instruction Tuning"

polite-flamingo

63
Stars
3
Forks

🦩 Visual Instruction Tuning with Polite Flamingo - training multi-modal LLMs to be both clever and polite! (AAAI-24 Oral)

VideoTGB

22
Stars
1
Forks

[EMNLP 2024] A Video Chat Agent with Temporal Prior

DataOptim

74
Stars
3
Forks

A collection of visual instruction tuning datasets.

lmms-finetune

166
Stars
21
Forks

A minimal codebase for finetuning large multimodal models, supporting llava-1.5/1.6, llava-interleave, llava-next-video, llava-onevision, qwen-vl, qwen2-vl, phi3-v etc.