OmniParser
json.loads() error when using the qwen model
When I use the qwen model, the returned vlm_response is not valid JSON, so the json.loads() call at omnitool/gradio/agent/vlm_agent.py line 147 raises an error. The response looks like this:
{
"Reasoning": "The current screen shows the desktop with various application icons. The task is to purchase a hard drive from Amazon, and the next step involves opening a web browser. Since Google Chrome is visible on the desktop, clicking it will allow us to proceed to the Amazon website.",
"Next Action": "left_click",
"Box ID": 6,
}
There should not be a "," after the value of the last key. As a temporary workaround, I strip it with the code below, which yields the valid JSON that follows:
# Only matches when the response ends with ",\n}" (comma, newline, closing brace).
if vlm_response_json[-3] == ",":
    vlm_response_json = vlm_response_json[:-3] + "\n}"
{
"Reasoning": "The current screen shows the desktop with various application icons. The task is to purchase a hard drive from Amazon, and the next step involves opening a web browser. Since Google Chrome is visible on the desktop, clicking it will allow us to proceed to the Amazon website.",
"Next Action": "left_click",
"Box ID": 6
}
That handles it for now; I hope the maintainers are aware of this as well.
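The index-based check only works when the response ends with exactly a comma, a newline, and a closing brace. A slightly more tolerant variant might look like the sketch below. It is only an illustration: `loads_lenient` and `_TRAILING_COMMA` are hypothetical names, not part of OmniParser, and the sample string is a shortened copy of the response shown earlier.

```python
import json
import re

# Hypothetical helper, not part of OmniParser: matches a comma that is
# followed only by optional whitespace and a closing brace or bracket.
_TRAILING_COMMA = re.compile(r",(\s*[}\]])")


def loads_lenient(text: str):
    """Parse JSON, retrying once with trailing commas stripped."""
    try:
        # Valid JSON is parsed unchanged, so well-formed responses are never edited.
        return json.loads(text)
    except json.JSONDecodeError:
        # Caveat: on this fallback path the regex would also rewrite a
        # ",}" or ",]" sequence that happens to sit inside a string value.
        return json.loads(_TRAILING_COMMA.sub(r"\1", text))


# Shortened version of the model response, with the trailing comma.
raw = '{\n"Next Action": "left_click",\n"Box ID": 6,\n}'
parsed = loads_lenient(raw)
print(parsed["Box ID"])
```

Trying the strict parse first keeps the behaviour identical for valid responses; the regex cleanup only runs when json.loads() has already failed.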