LAVIS zero-shot accuracy of instructblip on ScienceQA

Thank you for your outstanding work.

I noticed that the paper mentions "For ScienceQA, we only evaluate the set with image context." Does this mean that the hint or context in the dataset is not used?
What is the prompt template used by instructblip_flant5xl on ScienceQA?
I use "Question: {}\n{}" as the prompt, and the zero-shot accuracy of instruct_flant5xl I obtained on ScienceQA is 64.6. Is there anything I need to pay attention to?

May 24 '23 16:05 hongliang-wei

Thanks for your interest in our work!

1. It means we only use the questions that have an image context (the IMG set). We do include the textual context if it exists.
2. and 3. The prompt is "Context: {} Question: {} Options: {}. Answer:", where the options are separated by (a), (b), (c), (d). In addition, we use answer ranking for inference, as mentioned in Section 2.5.
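
For anyone trying to reproduce this, here is a minimal sketch of how the template and answer ranking described above might be put together. It is not the LAVIS implementation. The names build_scienceqa_prompt, rank_answers, and score_fn are hypothetical; joining the options with single spaces, passing an empty string when there is no context, and ranking the raw option texts are all assumptions. score_fn stands in for whatever returns the model's log-likelihood of a candidate given the image and prompt.

```python
from typing import Callable, Sequence


def build_scienceqa_prompt(question: str, options: Sequence[str], context: str = "") -> str:
    """Fill the template "Context: {} Question: {} Options: {}. Answer:".

    Options are labeled (a), (b), (c), ... and joined with spaces (an assumption).
    """
    labeled = " ".join(
        f"({chr(ord('a') + i)}) {opt}" for i, opt in enumerate(options)
    )
    return f"Context: {context} Question: {question} Options: {labeled}. Answer:"


def rank_answers(
    prompt: str,
    candidates: Sequence[str],
    score_fn: Callable[[str, str], float],
) -> str:
    """Return the candidate with the highest score under score_fn.

    score_fn(prompt, candidate) is a hypothetical placeholder for the model's
    log-likelihood of `candidate` as the answer to `prompt`.
    """
    return max(candidates, key=lambda cand: score_fn(prompt, cand))
```

With a real model, score_fn would wrap a forward pass that scores each candidate instead of free-form generation, so the prediction is always one of the given options.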

May 25 '23 07:05 wenliangdai

Thanks for the quick response.

In ScienceQA, some questions have options with images. Does InstructBlip utilize these images?

May 26 '23 02:05 hongliang-wei

What did you use as Context in ScienceQA? I used "hint" from ScienceQA, but I got only 47.8% accuracy.

Also, did you use predict_class in ScienceQATask's valid_step?

Sep 05 '23 11:09 engineerA314