LLaVA

[Question] TextVQA’s OCR

Open: bruceisme opened this issue 1 year ago • 1 comment

Question

In ./playground/data/eval/textvqa/llava_textvqa_val_v051_ocr.jsonl, the "text" field of each record contains the reference OCR tokens. Where were these OCR tokens obtained from?

bruceisme, Jan 08 '24 03:01

Hi! I also have the same question.

pipilurj, Feb 02 '24 10:02

It comes from the TextVQA dataset.

Rosetta OCR tokens [v0.2]

haotian-liu, Feb 03 '24 05:02
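
For anyone who wants to look at the field discussed in this thread, here is a minimal sketch for printing the "text" value of the first few records. It assumes the file is standard JSON Lines (one JSON object per line) and that each record has a "text" key, as described in the question. The helper name `read_text_fields` and the `limit` parameter are hypothetical and not part of the LLaVA codebase.

```python
import json
from pathlib import Path


def read_text_fields(jsonl_path, limit=3):
    """Return the "text" value of the first `limit` records in a JSONL file.

    Hypothetical helper for inspection only; not part of LLaVA.
    """
    texts = []
    with Path(jsonl_path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            # Assumption: each record has a "text" key; fall back to "" if not.
            texts.append(record.get("text", ""))
            if len(texts) >= limit:
                break
    return texts


if __name__ == "__main__":
    # Path taken from the question above; adjust to your local checkout.
    path = "./playground/data/eval/textvqa/llava_textvqa_val_v051_ocr.jsonl"
    if Path(path).exists():
        for text in read_text_fields(path):
            print(text)
            print("-" * 40)
```

Printing a few records this way makes it easy to compare the OCR tokens in the file with the ones shipped in the TextVQA download.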