Cannot open Edge browser using ollama/llama3
UFO launched successfully and the AI model understood the user intent, but it failed to open the Edge browser. After 30 steps it had still failed, yet the evaluation results say the agent successfully opened the Edge browser.
Detailed Steps:
"Please enter your request to be completed: open the edge
Round 1, Step 1, HostAgent: Analyzing the user intent and decomposing the request ...
Observations: I observe that Edge browser is not visible in the screenshot, nor available in the list of applications. So I need to open the Edge application directly.
Thoughts: The user request can be solely complete on the Edge browser. However, the Edge browser is not visible in the screenshot, nor available in the list of applications. I need to first open the Edge browser to open it.
Running Bash Command: start edge
Plans: (1) Open the Edge browser.
Next Selected application: None"
Could you provide the response.log and request.log files from the logs/[task_name] directory in a pastebin? That would greatly help us pinpoint the issue.
Updated with the log: response.log
In the provided log, I saw "Bash": "start edge".
The model is trying to launch Edge with start edge, but the correct command is start msedge. This is most likely due to the limited capability of the model used. I tested with gpt-4o and gpt-4.1, and both work fine.
If you want to fix the issue for this model, I recommend adding a line to the host agent prompt that tells the model to use start msedge. The relevant prompt can be found in ufo/prompts/share/base/host_agent.yaml.
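If changing the prompt is not enough for a smaller model, another option might be to correct the command after the model emits it. The following is only a rough sketch of that idea, not part of UFO: the function name normalize_start_command and the APP_ALIASES table are hypothetical, and where such a hook would be called inside UFO is left open.

```python
import re

# Hypothetical alias table: names a model might guess -> real executable names.
# This is an illustration only; it is not taken from the UFO code base.
APP_ALIASES = {
    "edge": "msedge",
}

# Matches commands of the form "start <target>", ignoring case and extra spaces.
_START_RE = re.compile(r"^\s*start\s+(.+?)\s*$", re.IGNORECASE)


def normalize_start_command(command: str) -> str:
    """Rewrite `start <alias>` to `start <executable>`; leave anything else unchanged."""
    match = _START_RE.match(command)
    if not match:
        return command
    target = match.group(1).lower()
    executable = APP_ALIASES.get(target)
    if executable is None:
        return command
    return f"start {executable}"


if __name__ == "__main__":
    print(normalize_start_command("start edge"))
    print(normalize_start_command("start msedge"))
```

With this sketch, the command from the log (start edge) would be rewritten to start msedge, while commands that are already correct or that are not start commands pass through untouched.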