large-vision-language-models topic


Awesome_Matching_Pretraining_Transfering

397 stars · 47 forks

A paper list covering large multi-modality models, parameter-efficient finetuning, vision-language pretraining, and conventional image-text matching, for preliminary insight.

Awesome-Multimodal-Large-Language-Models

11.9k stars · 765 forks · 208 watchers

Latest Advances on Multimodal Large Language Models

HallusionBench

228 stars · 5 forks

[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models

ReForm-Eval

32 stars · 4 forks

A benchmark for evaluating the capabilities of large vision-language models (LVLMs)

talk2bev

93 stars · 9 forks

Talk2BEV: Language-Enhanced Bird's Eye View Maps (Accepted to ICRA'24)

CHOCOLATE

23 stars · 0 forks

Code and data for the ACL 2024 Findings paper "Do LVLMs Understand Charts? Analyzing and Correcting Factual Errors in Chart Captioning"

MMStar

144 stars · 5 forks

[NeurIPS 2024] This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models"

LLaVA-Align

70 stars · 2 forks

The official repo for Debiasing Large Visual Language Models, including a post-hoc debiasing method and a Visual Debias Decoding strategy.

DoRA

591 stars · 36 forks

[ICML2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed Low-Rank Adaptation

VLGuard

37 stars · 0 forks

[ICML 2024] Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models