large-vision-language-models topic
Awesome_Matching_Pretraining_Transfering
A paper list covering large multi-modality models, parameter-efficient finetuning, vision-language pretraining, and conventional image-text matching for preliminary insight.
Awesome-Multimodal-Large-Language-Models
:sparkles::sparkles: Latest Advances on Multimodal Large Language Models
HallusionBench
[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
ReForm-Eval
A benchmark for evaluating the capabilities of large vision-language models (LVLMs)
talk2bev
Talk2BEV: Language-Enhanced Bird's Eye View Maps (Accepted to ICRA'24)
CHOCOLATE
Code and data for the ACL 2024 Findings paper "Do LVLMs Understand Charts? Analyzing and Correcting Factual Errors in Chart Captioning"
MMStar
[NeurIPS 2024] This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models?"
LLaVA-Align
This is the official repo for Debiasing Large Visual Language Models, including a post-hoc debiasing method and a Visual Debias Decoding strategy; a hedged sketch of the general decoding idea follows this entry.
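The description names the methods but does not spell them out. Below is a minimal, hypothetical sketch of the general flavor of debiased (contrastive) decoding for an LVLM: contrast logits conditioned on the real image against logits conditioned on an uninformative visual input so that language-prior tokens are down-weighted. The exact formulation in the repo may differ, and `model`, `debias_decode_step`, and all parameter names here are assumptions for illustration only.

```python
import torch

def debias_decode_step(model, input_ids, image, blank_image, alpha=1.0):
    """One greedy decoding step that contrasts image-conditioned logits with
    logits from an uninformative visual input. `model` is assumed to return
    next-token logits given (input_ids, image); names are hypothetical."""
    with torch.no_grad():
        logits_img = model(input_ids, image)          # conditioned on the real image
        logits_blank = model(input_ids, blank_image)  # conditioned on an "empty" visual input
    # Contrastive combination: amplify evidence that actually depends on the image,
    # subtract what the model would predict from its language prior alone.
    debiased = (1 + alpha) * logits_img - alpha * logits_blank
    return debiased.argmax(dim=-1)                    # next token id
```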
DoRA
[ICML 2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed Low-Rank Adaptation
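Since the title states the technique, here is a minimal PyTorch sketch of weight-decomposed low-rank adaptation as described in the DoRA paper: the frozen pretrained weight is split into a learnable magnitude and a direction, and only the direction receives a LoRA-style low-rank update. This is an illustrative module, not the official implementation; the class and attribute names are hypothetical.

```python
import torch
import torch.nn as nn

class DoRALinear(nn.Module):
    """Sketch of DoRA applied to a linear layer: magnitude-direction
    decomposition of the frozen weight, low-rank update on the direction."""

    def __init__(self, base: nn.Linear, rank: int = 8):
        super().__init__()
        out_f, in_f = base.weight.shape
        self.weight = base.weight                 # frozen pretrained weight W0
        self.weight.requires_grad_(False)
        self.bias = base.bias
        # Low-rank direction update: delta W = B @ A (B starts at zero)
        self.lora_A = nn.Parameter(torch.randn(rank, in_f) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_f, rank))
        # Learnable magnitude, initialized to the column-wise norm of W0
        self.magnitude = nn.Parameter(self.weight.norm(p=2, dim=0, keepdim=True))

    def forward(self, x):
        delta = self.lora_B @ self.lora_A                   # low-rank update
        directional = self.weight + delta                   # updated direction
        norm = directional.norm(p=2, dim=0, keepdim=True)   # column-wise norm
        w = self.magnitude * directional / norm             # re-scaled weight
        return nn.functional.linear(x, w, self.bias)
```

Only `lora_A`, `lora_B`, and `magnitude` are trainable, which keeps the parameter count close to LoRA while letting the magnitude and direction of each weight column adapt separately.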
VLGuard
[ICML 2024] Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models.