BLINK_Benchmark
This repo contains evaluation code for the paper "BLINK: Multimodal Large Language Models Can See but Not Perceive". https://arxiv.org/abs/2404.12390 [ECCV 2024]
Hi, great work! I was wondering whether it would be possible to release the raw images without markers, along with the coordinates of the markers and labels?
Very helpful research, great work. I wanted to express my appreciation for the excellent work your team has done in contributing significantly to the evaluation of visual language models. Your paper...

Hi, Thanks for the nice work! How do you feed 3 images to LLaVA for the visual similarity task? Thanks, Sara