visual-instruction-tuning topic
visual-instruction-tuning repositories
Awesome-Multimodal-Large-Language-Models
11.9k Stars, 765 Forks, 208 Watchers
✨✨ Latest Advances on Multimodal Large Language Models
Osprey
756 Stars, 43 Forks
[CVPR2024] The code for "Osprey: Pixel Understanding with Visual Instruction Tuning"
polite-flamingo
63 Stars, 3 Forks
🦩 Visual Instruction Tuning with Polite Flamingo - training multi-modal LLMs to be both clever and polite! (AAAI-24 Oral)
VideoTGB
22 Stars, 1 Fork
[EMNLP 2024] A Video Chat Agent with Temporal Prior
DataOptim
74 Stars, 3 Forks
A collection of visual instruction tuning datasets.
lmms-finetune
166 Stars, 21 Forks
A minimal codebase for finetuning large multimodal models, supporting llava-1.5/1.6, llava-interleave, llava-next-video, llava-onevision, qwen-vl, qwen2-vl, phi3-v, etc.