Question about best prompt style for classification

Open vishaal27 opened this issue 1 year ago • 2 comments

Hey, I have a question about the best prompt format for evaluating Otter models (both the MPT7B and LLaMA7B variants).

Currently, I am evaluating Otter on image classification using the following prompt style:

PROMPT = '<image> Q: Describe the image. A: This is an image of a {}.'

In your experience, is it better to switch to the following prompt style for this kind of task?

PROMPT = '<image> User: Describe the image. GPT: This is an image of a {}.'

vishaal27 avatar Jul 26 '23 16:07 vishaal27
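
For readers following along: classification with a template like the two above is typically done by filling the `{}` slot with each candidate class name and choosing the class whose filled prompt the model scores highest. The sketch below illustrates only that selection step. `score_fn` is a hypothetical callable standing in for a model log-likelihood (it is not part of Otter's API), and `toy_score` is a placeholder so the snippet runs on its own.

```python
from typing import Callable, Dict, List

QA_TEMPLATE = "<image> Q: Describe the image. A: This is an image of a {}."
CHAT_TEMPLATE = "<image> User: Describe the image. GPT: This is an image of a {}."


def fill_template(template: str, class_names: List[str]) -> Dict[str, str]:
    """Return one filled prompt per candidate class name."""
    return {name: template.format(name) for name in class_names}


def classify(template: str, class_names: List[str],
             score_fn: Callable[[str], float]) -> str:
    """Pick the class whose filled prompt receives the highest score."""
    prompts = fill_template(template, class_names)
    return max(prompts, key=lambda name: score_fn(prompts[name]))


def toy_score(prompt: str) -> float:
    # Placeholder only (assumption): a real evaluation would return the
    # model's log-likelihood of the prompt text given the image.
    return 1.0 if "dog" in prompt else 0.0


prediction = classify(QA_TEMPLATE, ["cat", "dog", "car"], toy_score)
```

Swapping `QA_TEMPLATE` for `CHAT_TEMPLATE` in the call is enough to compare the two prompt styles under the same scoring function.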

In my initial experiments, it seems that the format

PROMPT = '<image> Q: Describe the image. A: This is an image of a {}.'

is slightly better for both the LLaMA and MPT models. This is a bit surprising, since it deviates from the template that was used during instruction tuning. Any thoughts on this, @Luodian?

vishaal27 avatar Jul 26 '23 17:07 vishaal27

In your case:

cur_instruction = "Describe the image. This is an image of a"

For MPT: PROMPT = f"<image>User: {cur_instruction} GPT:"

For Llama 2:

wrap_sys = "<<SYS>>\nYou are a helpful vision language assistant. You are able to understand the visual content that the user provides, and assist the user with a variety of tasks using natural language.\n<</SYS>>\n\n"
PROMPT = f"[INST]{wrap_sys}<image>{cur_instruction}[/INST]"

ZhangYuanhan-AI avatar Jul 27 '23 01:07 ZhangYuanhan-AI
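
The two formats suggested above can be wrapped in one helper so an evaluation script can switch between them. This is a minimal sketch rather than code from the Otter repository: the function name `build_prompt` and the `"mpt"` / `"llama2"` keys are assumptions made for illustration, while the template strings are copied from the reply.

```python
# System block copied from the reply above; used only for the Llama 2 format.
LLAMA2_SYSTEM = (
    "<<SYS>>\nYou are a helpful vision language assistant. "
    "You are able to understand the visual content that the user provides, "
    "and assist the user with a variety of tasks using natural language."
    "\n<</SYS>>\n\n"
)


def build_prompt(cur_instruction: str, model_family: str) -> str:
    """Format an instruction for the given model family (hypothetical keys)."""
    if model_family == "mpt":
        return f"<image>User: {cur_instruction} GPT:"
    if model_family == "llama2":
        return f"[INST]{LLAMA2_SYSTEM}<image>{cur_instruction}[/INST]"
    raise ValueError(f"unknown model family: {model_family!r}")


cur_instruction = "Describe the image. This is an image of a"
mpt_prompt = build_prompt(cur_instruction, "mpt")
llama2_prompt = build_prompt(cur_instruction, "llama2")
```

Keeping the system block in a module-level constant means the same text is reused for every sample, so only `cur_instruction` varies between prompts.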