UI-TARS-1.5-7B Endless Loops on Web Interfaces

Open · JumpingRain opened this issue 8 months ago · 1 comment

We've observed that the UI-TARS-1.5-7B agent frequently gets stuck in endless retry loops when interacting with web interfaces. It repeats the same ineffective action without adapting or trying a different strategy.

Example of observed behavior:

Thought: I see a search button at the top of the page, it's the magnifying glass icon in the upper right corner. To find information about Apple Pencil, I need to click this search button to open the search box.
Action: click(start_box='(1099,64)')
Thought: I see a search button at the top of the page, it's the magnifying glass icon in the upper right corner. To find information about Apple Pencil, I need to click this search button to open the search box.
Action: click(start_box='(1099,64)')
Thought: I see a search button at the top of the page, it's the magnifying glass icon in the upper right corner. To find information about Apple Pencil, I need to click this search button to open the search box.
Action: click(start_box='(1099,64)')

[This exact Thought and Action pair repeats 15 times without change]
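
One possible client-side mitigation is to detect the repetition in the loop that drives the agent and stop or re-prompt once the same action has been emitted several times in a row. The following is a minimal sketch only: `extract_actions`, `RepeatGuard`, and the threshold of 3 are hypothetical names and values chosen for illustration, not part of UI-TARS, and the log format is assumed to match the `Action: ...` lines shown above.

```python
import re
from collections import deque

# Matches lines such as: Action: click(start_box='(1099,64)')
ACTION_RE = re.compile(r"^Action:\s*(.+)$", re.MULTILINE)


def extract_actions(log: str) -> list[str]:
    """Return the action strings found in a Thought/Action log, in order."""
    return [a.strip() for a in ACTION_RE.findall(log)]


class RepeatGuard:
    """Flags when the same action string is seen `max_repeats` times in a row."""

    def __init__(self, max_repeats: int = 3):
        self.max_repeats = max_repeats
        # Only the most recent `max_repeats` actions are kept.
        self._recent = deque(maxlen=max_repeats)

    def observe(self, action: str) -> bool:
        """Record an action; return True if the window is full and all identical."""
        self._recent.append(action)
        return (
            len(self._recent) == self.max_repeats
            and len(set(self._recent)) == 1
        )


if __name__ == "__main__":
    log = "\n".join(
        ["Thought: ...", "Action: click(start_box='(1099,64)')"] * 3
    )
    guard = RepeatGuard(max_repeats=3)
    for action in extract_actions(log):
        if guard.observe(action):
            # What to do here is up to the caller: abort, take a new
            # screenshot, or add a hint to the next prompt.
            print("Repeated action detected:", action)
```

A guard like this does not fix the underlying model behavior; it only keeps the run from spinning indefinitely and gives the caller a point at which to change the prompt or abort.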

We also need deployment guidelines for web operations, including:

  1. Recommended screen resolution settings
  2. Input prompt formatting and concatenation methods for web tasks
  3. Specific configuration steps for web deployment

Please provide relevant documentation or consider creating web-specific deployment guidelines.

JumpingRain · Apr 21 '25 09:04

We have provided documentation covering deployment and inference procedures. For questions 1 and 2, please refer to the following section of the UI-TARS repository: 👉 Quick Start Guide: Deploying and Using Our Model

For question 3 regarding configuration steps, you may refer to the UI-TARS-desktop deployment guide here: 👉 UI-TARS-Desktop Quick Start

JjjFangg · May 07 '25 14:05