PixWizard
How to evaluate on the Emu Edit benchmark
Hi,
I noticed that the Emu Edit benchmark has some known issues: some image-caption pairs seem incorrect (e.g., 'a train station in city'), and some samples have identical source and target captions. So I was wondering how you calculate the CLIP direction (clip_dir) metric in these cases. How did you process the benchmark dataset?
Looking forward to your reply.
Hi! I have the same question. Could the authors provide more details about the evaluation on the Emu Edit test set?
Hi, sorry for not responding promptly. We did not process the captions; we used the original benchmark without any modification.
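For anyone computing the metric themselves: the CLIP direction score is commonly defined as the cosine similarity between the change in image embeddings (edited minus source) and the change in text embeddings (target minus source caption). A minimal sketch with placeholder embeddings, assuming you already have CLIP features (the function name and the zero-direction fallback are illustrative, not the authors' code); note how identical source/target captions yield a zero text direction, which this sketch maps to 0.0 rather than dividing by zero:

```python
import numpy as np

def clip_dir(img_src, img_out, txt_src, txt_tgt, eps=1e-8):
    """Cosine similarity between the image-embedding change and the
    text-embedding change (the usual CLIP direction definition)."""
    d_img = img_out - img_src
    d_txt = txt_tgt - txt_src
    n_img = np.linalg.norm(d_img)
    n_txt = np.linalg.norm(d_txt)
    # Identical source/target captions give a zero text direction;
    # return 0.0 instead of dividing by zero (a choice, not the paper's).
    if n_img < eps or n_txt < eps:
        return 0.0
    return float(d_img @ d_txt / (n_img * n_txt))

# Toy 2-D embeddings standing in for real CLIP features:
src_i = np.array([1.0, 0.0]); out_i = np.array([1.0, 1.0])
src_t = np.array([0.0, 0.0]); tgt_t = np.array([0.0, 2.0])
print(clip_dir(src_i, out_i, src_t, tgt_t))  # 1.0: changes fully aligned
```

In a real evaluation the four embeddings would come from a CLIP image/text encoder; how identical-caption samples are handled (skipped, zeroed, or left to divide by zero) is exactly the ambiguity raised above.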