vlms topic

List vlms repositories

HallusionBench

228 stars, 5 forks

[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models

ViTamin

163 stars, 5 forks

[CVPR 2024] Official implementation of "ViTamin: Designing Scalable Vision Models in the Vision-language Era"

CAL

44 stars, 2 forks

[NeurIPS'24] Official PyTorch Implementation of Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment

AWT

70 stars, 1 fork

[NeurIPS 2024] AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation