large-multimodal-models topic

Repositories tagged with the large-multimodal-models topic:

VisualWebBench

Stars: 41, Forks: 1

Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?"

TextCoT

Stars: 30, Forks: 3

The official repo for “TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding”.

MileBench

Stars: 23, Forks: 1

This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"

Open-LLaVA-NeXT

Stars: 247, Forks: 10

An open-source implementation for training LLaVA-NeXT.

IVM

Stars: 21, Forks: 2

[NeurIPS-2024] The official implementation of "Instruction-Guided Visual Masking"

OPERA

Stars: 265, Forks: 24

[CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation

ShareGPT4Video

Stars: 1.2k, Forks: 44

[NeurIPS 2024 D&B Track] An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions

ShareGPT4V

Stars: 124, Forks: 4

[ECCV 2024] ShareGPT4V: Improving Large Multi-modal Models with Better Captions

vhs_benchmark

Stars: 20, Forks: 1

🔥 Official Benchmark Toolkits for "Visual Haystacks: Answering Harder Questions About Sets of Images"