Qwen2.5-VL Qwen2.5-vl, prompt for screenspot-pro evaluation

Open Ancolie18 opened this issue 2 weeks ago • 1 comments

When I tried to evaluate Qwen2.5-VL, I kept the prompt consistent with the one used for Qwen2-VL (this prompt is provided by ScreenSpot-Pro):

prompt_origin = 'Output the bounding box in the image corresponding to the instruction "{}" with grounding.'
full_prompt = prompt_origin.format(instruction)
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": image_path,
            },
            {"type": "text", "text": full_prompt},
        ],
    }
]

However, the model's output format is different, for example:

Do you have any recommended prompts for Qwen2.5-VL? How can I ensure a consistent output format for better evaluation?
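One possible way to make the evaluation less sensitive to format differences is to normalize the model output in a post-processing step. The following is only a minimal sketch, not something provided by Qwen or ScreenSpot-Pro: `parse_bbox` is a hypothetical helper, and the two formats it accepts (a `(x1,y1),(x2,y2)` pair and a `[x1, y1, x2, y2]` list, possibly embedded in JSON) are assumptions about what the model may emit.

```python
import re

# Hypothetical helper for illustration; not part of Qwen or ScreenSpot-Pro.
# Matches an integer or decimal number, optionally negative.
_NUM = r"-?\d+(?:\.\d+)?"

# Assumed output formats (adjust to what the model actually returns):
_BOX_PATTERNS = [
    # "(x1,y1),(x2,y2)", optionally wrapped in special box tokens
    re.compile(
        rf"\(\s*({_NUM})\s*,\s*({_NUM})\s*\)\s*,?\s*"
        rf"\(\s*({_NUM})\s*,\s*({_NUM})\s*\)"
    ),
    # "[x1, y1, x2, y2]", for example inside a JSON object
    re.compile(
        rf"\[\s*({_NUM})\s*,\s*({_NUM})\s*,\s*({_NUM})\s*,\s*({_NUM})\s*\]"
    ),
]


def parse_bbox(text):
    """Return the first box found as (x1, y1, x2, y2) floats, or None."""
    for pattern in _BOX_PATTERNS:
        match = pattern.search(text)
        if match:
            x1, y1, x2, y2 = (float(g) for g in match.groups())
            # Order the corners so that x1 <= x2 and y1 <= y2.
            return (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))
    return None


# Example inputs are made up for illustration, not real model outputs.
print(parse_bbox("<|box_start|>(100,200),(300,400)<|box_end|>"))
print(parse_bbox('[{"bbox_2d": [10, 20, 30, 40], "label": "button"}]'))
```

This only unifies the syntax. It does not convert between coordinate conventions (for example normalized versus pixel coordinates), so the scoring code still needs to know which scale each model uses.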

Thank you!!

Feb 11 '25 06:02 Ancolie18
