large-multimodal-models topic

List large-multimodal-models repositories

awesome-vision-time-series

Stars: 54, Forks: 2, Watchers: 54

This is an official repository for "Harnessing Vision Models for Time Series Analysis: A Survey".

LLaVA-STF

Stars: 29, Forks: 2, Watchers: 29

The official implementation of "Learning Compact Vision Tokens for Efficient Large Multimodal Models".

awesome-personalized-lmms

Stars: 47, Forks: 0, Watchers: 47

A curated list of Awesome Personalized Large Multimodal Models resources.

Stream-Omni

Stars: 358, Forks: 39, Watchers: 358

Stream-Omni is a GPT-4o-like language-vision-speech chatbot that simultaneously supports interaction across various modality combinations.

Modality-Integration-Rate

Stars: 107, Forks: 2, Watchers: 107

[ICCV 2025] The official code of the paper "Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate".

LLaVA-Mini

Stars: 546, Forks: 28, Watchers: 546

LLaVA-Mini is a unified large multimodal model (LMM) that efficiently supports the understanding of images, high-resolution images, and videos.

GUI-R1

Stars: 207, Forks: 16, Watchers: 207

Official implementation of "GUI-R1: A Generalist R1-Style Vision-Language Action Model For GUI Agents".