Qwen2.5-VL: prompt for ScreenSpot-Pro evaluation
When I tested Qwen2-VL on ScreenSpot-Pro, the model's output format stayed stable as `<|object_ref_start|>the selection<|object_ref_end|><|box_start|>(501,10),(995,987)<|box_end|><|im_end|>`.
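For reference, this is how I extract the box from that Qwen2-VL format on the evaluation side (a minimal sketch; the helper name is my own, and it assumes Qwen2-VL's 0-1000 normalized coordinate scale):

```python
import re

def parse_qwen2_vl_box(output: str):
    """Extract (x1, y1, x2, y2) from Qwen2-VL grounding output such as
    <|box_start|>(501,10),(995,987)<|box_end|>; coordinates are on
    Qwen2-VL's 0-1000 normalized scale."""
    m = re.search(r"<\|box_start\|>\((\d+),(\d+)\),\((\d+),(\d+)\)<\|box_end\|>", output)
    return tuple(int(v) for v in m.groups()) if m else None
```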
When I tried to evaluate Qwen2.5-VL, I kept the prompt consistent with Qwen2-VL (this prompt is provided by ScreenSpot-Pro):
```python
prompt_origin = 'Output the bounding box in the image corresponding to the instruction "{}" with grounding.'
full_prompt = prompt_origin.format(instruction)
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image_path},
            {"type": "text", "text": full_prompt},
        ],
    }
]
```
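For context, the inference code around that `messages` list follows the standard flow from the model card (a sketch assuming the 7B Instruct checkpoint and qwen-vl-utils):

```python
from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info

model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2.5-VL-7B-Instruct", torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained("Qwen/Qwen2.5-VL-7B-Instruct")

# Build the chat-formatted text and collect the image inputs.
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(
    text=[text], images=image_inputs, videos=video_inputs,
    padding=True, return_tensors="pt",
).to(model.device)

# Generate and strip the prompt tokens before decoding.
generated_ids = model.generate(**inputs, max_new_tokens=128)
trimmed = [out[len(inp):] for inp, out in zip(inputs.input_ids, generated_ids)]
output_text = processor.batch_decode(trimmed, skip_special_tokens=True)
```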
However, the output format of Qwen2.5-VL would vary, such as the following:
Do you have any recommended prompts for Qwen2.5-VL? How can I ensure output consistency for better evaluation?
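In case it helps, this is the kind of normalization I would otherwise need on the evaluation side to compare the two models (a sketch; the `bbox_2d` key is what I have seen in Qwen2.5-VL grounding examples, but it may not match every response):

```python
import json
import re

def normalize_box(output: str, img_w: int, img_h: int):
    """Best-effort conversion of a model response to pixel (x1, y1, x2, y2)."""
    # Qwen2-VL style: "(x1,y1),(x2,y2)" on a 0-1000 normalized scale.
    m = re.search(r"\((\d+),(\d+)\),\((\d+),(\d+)\)", output)
    if m:
        x1, y1, x2, y2 = (int(v) for v in m.groups())
        return (x1 / 1000 * img_w, y1 / 1000 * img_h,
                x2 / 1000 * img_w, y2 / 1000 * img_h)
    # Qwen2.5-VL style: a JSON object with absolute pixel coordinates.
    m = re.search(r"\{.*?\}", output, re.DOTALL)
    if m:
        try:
            obj = json.loads(m.group(0))
            box = obj.get("bbox_2d")  # assumed key, seen in Qwen2.5-VL grounding examples
            if box and len(box) == 4:
                return tuple(float(v) for v in box)
        except json.JSONDecodeError:
            pass
    return None
```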
Thank you!!