
UITARS prompt for visual grounding only

Open tcnguyen opened this issue 8 months ago • 1 comments

Currently the prompt needs a task description and an action history. Is it possible to use UITARS for visual grounding only? If so, what prompt did you use for visual grounding benchmarking? Thank you.

tcnguyen avatar Apr 22 '25 15:04 tcnguyen

Here is an example of a prompt for a visual grounding task.

"""You are a GUI agent. You are given a task and your action history, with screenshots. You need to perform the next action to complete the task. \n\n## Output Format\n\nAction: ...\n\n\n## Action Space\nclick(start_box='<|box_start|>(x1,y1)<|box_end|>')\n\n## User Instruction\n{instruction} """

JjjFangg avatar Apr 25 '25 03:04 JjjFangg

@JjjFangg Thank you!

tcnguyen avatar Apr 29 '25 14:04 tcnguyen