OmniParser

json.loads() error when using the qwen model

Open itimetime opened this issue 1 day ago • 0 comments

When I use the qwen model, the returned vlm_response is not valid JSON, so the json.loads() call at line 147 of omnitool/gradio/agent/vlm_agent.py raises an error. The returned JSON is as follows:

{
    "Reasoning": "The current screen shows the desktop with various application icons. The task is to purchase a hard drive from Amazon, and the next step involves opening a web browser. Since Google Chrome is visible on the desktop, clicking it will allow us to proceed to the Amazon website.",
    "Next Action": "left_click",
    "Box ID": 6,
}

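There should not be a "," after the value of the last key. As a temporary workaround, I strip it before parsing:

    # Drop the trailing comma when the string ends with ",\n}"
    if vlm_response_json[-3] == ",":
        vlm_response_json = vlm_response_json[:-3] + "\n}"

With that check in place, the response becomes:
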
{
    "Reasoning": "The current screen shows the desktop with various application icons. The task is to purchase a hard drive from Amazon, and the next step involves opening a web browser. Since Google Chrome is visible on the desktop, clicking it will allow us to proceed to the Amazon website.",
    "Next Action": "left_click",
    "Box ID": 6
}

This handles it for now. I hope the maintainers are aware of the issue as well.
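
The index-based check only works when the response ends in exactly ",\n}". A slightly more general version might look like the sketch below. It is only an illustration: `loads_lenient` is a hypothetical helper name, not something in OmniParser, and it assumes the only defect in the model output is a trailing comma before a closing `}` or `]`.

```python
import json
import re

# A comma followed only by whitespace and then a closing brace or bracket.
_TRAILING_COMMA = re.compile(r",(\s*[}\]])")


def loads_lenient(text):
    """Parse JSON, retrying once with trailing commas removed.

    Hypothetical helper for illustration; not part of OmniParser.
    """
    try:
        # Valid JSON is returned untouched.
        return json.loads(text)
    except json.JSONDecodeError:
        # Caveat: the regex is not aware of string boundaries, so in this
        # fallback path it would also rewrite a string value containing ",}".
        cleaned = _TRAILING_COMMA.sub(r"\1", text)
        return json.loads(cleaned)


raw = '{\n    "Next Action": "left_click",\n    "Box ID": 6,\n}'
print(loads_lenient(raw))
```

Because strict parsing is tried first, well-formed responses are never modified. If the cleaned text is still invalid, the second json.loads() call raises json.JSONDecodeError as before.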

itimetime, Feb 28 '25 06:02