[Bug]: browser not working

Open x66ccff opened this issue 1 year ago • 1 comments

Is there an existing issue for the same bug?

  • [X] I have checked the troubleshooting document at https://docs.all-hands.dev/modules/usage/troubleshooting
  • [X] I have checked the existing issues.

Describe the bug

I asked the agent: Please goto('https://www.whitehouse.gov/about-the-white-house/presidents/'). However, the browser only shows about:blank, and the log keeps reporting:

AgentFinishAction(outputs={'content': 'Too many errors encountered. Task failed.'}, thought='', action='finish')

In the end, though, the agent retrieved the web page by using IPython instead.

Current OpenHands version

0.9.8 (Docker image ghcr.io/all-hands-ai/openhands:0.9.8; the full docker run command is listed under Installation and Configuration)

Installation and Configuration

docker run -it --pull=never \
  -e SANDBOX_RUNTIME_CONTAINER_IMAGE=kk-oh-env \
  -e SANDBOX_USER_ID=$(id -u) \
  -e WORKSPACE_MOUNT_PATH=$WORKSPACE_BASE \
  -e RUN_AS_OPENHANDS=False \
  -e MAX_ITERATIONS=1000 \
  -e LLM_NUM_RETRIES=20 \
  -e LLM_RETRY_MIN_WAIT=30 \
  -e LLM_RETRY_MAX_WAIT=700 \
  -e DEBUG=True \
  -v $WORKSPACE_BASE:/opt/workspace_base \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -p 3000:3000 \
  --add-host host.docker.internal:host-gateway \
  --name openhands-app-$(date +%Y%m%d%H%M%S) \
  ghcr.io/all-hands-ai/openhands:0.9.8

Model and Agent

qwen2.5:72b

Operating System

linux

Reproduction Steps

No response

Logs, Errors, Screenshots, and Additional Context

07:50:33 - openhands:INFO: agent_controller.py:253
USER_ACTION
**MessageAction** (source=EventSource.USER)
CONTENT: Please goto('https://www.whitehouse.gov/about-the-white-house/presidents/')
07:50:33 - openhands:DEBUG: agent_controller.py:273 - [Agent Controller 774e605f-7a98-4b99-b06b-492317a2b1d1] Setting agent(CodeActAgent) state from AgentState.FINISHED to AgentState.RUNNING
07:50:33 - openhands:DEBUG: stream.py:135 - Adding AgentStateChangedObservation id=22 from AGENT
07:50:33 - openhands:DEBUG: stream.py:135 - Adding NullObservation id=23 from USER
07:50:33 - openhands:INFO: agent_controller.py:229
OBSERVATION
AgentStateChangedObservation(content='', agent_state=<AgentState.RUNNING: 'running'>, observation='agent_state_changed')
07:50:33 - openhands:INFO: agent_controller.py:229
OBSERVATION
NullObservation(content='', observation='null')

==============
CodeActAgent LEVEL 0 LOCAL STEP 4 GLOBAL STEP 4

07:50:34 - openhands:DEBUG: logger.py:238 - Logging to /app/logs/llm/24-10-07_07-12/prompt_038.log
07:50:37 - openhands:DEBUG: logger.py:238 - Logging to /app/logs/llm/24-10-07_07-12/response_038.log
07:50:37 - openhands:INFO: llm.py:308 - Input tokens: 4427 | Output tokens: 39

07:50:37 - openhands:DEBUG: stream.py:135 - Adding AgentDelegateAction id=24 from AGENT
07:50:37 - openhands:INFO: agent_controller.py:448
ACTION
AgentDelegateAction(agent='BrowsingAgent', inputs={'task': 'Sure! Let me browse the provided URL.. I should start with: Tell me what is in "https://www.whitehouse.gov/about-the-white-house/presidents/"'}, thought='', action='delegate')
07:50:37 - openhands:WARNING: llm.py:93 - Could not get model info for ollama/kwen2.5:72b:
This model isn't mapped yet. model=ollama/kwen2.5, custom_llm_provider=ollama. Add it here - https://github.com/BerriAI/litellm/blob/main/model_prices_and_context_window.json.
07:50:37 - openhands:INFO: agent_controller.py:364 - [Agent Controller 774e605f-7a98-4b99-b06b-492317a2b1d1]: start delegate, creating agent BrowsingAgent using LLM LLM(model=ollama/kwen2.5:72b, base_url=http://192.168.1.103:11434/)
07:50:37 - openhands:DEBUG: agent_controller.py:273 - [Agent Controller 774e605f-7a98-4b99-b06b-492317a2b1d1-delegate] Setting agent(BrowsingAgent) state from AgentState.LOADING to AgentState.RUNNING
07:50:37 - openhands:DEBUG: stream.py:135 - Adding AgentStateChangedObservation id=25 from AGENT
07:50:37 - openhands:DEBUG: stream.py:135 - Adding NullObservation id=26 from AGENT
07:50:37 - openhands:INFO: agent_controller.py:229
OBSERVATION
AgentStateChangedObservation(content='', agent_state=<AgentState.RUNNING: 'running'>, observation='agent_state_changed')
07:50:37 - openhands:INFO: agent_controller.py:229
OBSERVATION
NullObservation(content='', observation='null')
07:50:37 - openhands:DEBUG: agent_controller.py:456 - [Agent Controller 774e605f-7a98-4b99-b06b-492317a2b1d1] Delegate not none, awaiting...


==============
BrowsingAgent LEVEL 1 LOCAL STEP 0 GLOBAL STEP 5

07:50:37 - openhands:DEBUG: logger.py:238 - Logging to /app/logs/llm/24-10-07_07-12/prompt_039.log
07:50:42 - openhands:DEBUG: logger.py:238 - Logging to /app/logs/llm/24-10-07_07-12/response_039.log
07:50:42 - openhands:WARNING: llm.py:374 - Cost calculation not supported for this model.
07:50:42 - openhands:INFO: llm.py:308 - Input tokens: 833 | Output tokens: 54

07:50:42 - openhands:DEBUG: response_parser.py:29 - To start browsing the provided URL, I will use the `goto` action to navigate to "https://www.whitehouse.gov/about-the-white-house/presidents/".

```python
goto('https://www.whitehouse.gov/about-the-white-house/presidents)```
07:50:42 - openhands:DEBUG: stream.py:135 - Adding BrowseInteractiveAction id=27 from AGENT
07:50:42 - openhands:INFO: agent_controller.py:448
ACTION
**BrowseInteractiveAction**
THOUGHT: To start browsing the provided URL, I will use the `goto` action to navigate to "https://www.whitehouse.gov/about-the-white-house/presidents/".
BROWSER_ACTIONS: python
goto('https://www.whitehouse.gov/about-the-white-house/presidents)
07:50:42 - openhands:DEBUG: agent_controller.py:458 - [Agent Controller 774e605f-7a98-4b99-b06b-492317a2b1d1] Delegate step done
07:50:42 - openhands:DEBUG: agent_controller.py:461 - [Agent Controller 774e605f-7a98-4b99-b06b-492317a2b1d1] Delegate state: AgentState.RUNNING
07:50:42 - openhands:DEBUG: runtime.py:307 - Getting container logs...
07:50:43 - openhands:DEBUG: runtime.py:307 - Getting container logs...
07:50:43 - openhands:INFO: runtime.py:316 -
-----------------------------------Container logs:-----------------------------------
    |07:50:42 - openhands:DEBUG: client.py:365 - Running action:
    |**BrowseInteractiveAction**
    |THOUGHT: To start browsing the provided URL, I will use the `goto` action to navigate to "https://www.whitehouse.gov/about-the-white-house/presidents/".
    |BROWSER_ACTIONS: python
    |goto('https://www.whitehouse.gov/about-the-white-house/presidents)
    |07:50:43 - openhands:DEBUG: client.py:367 - Action output:
    |**BrowserOutputObservation**
    |URL: about:blank
    |Error: True
    |Open pages: ['about:blank']
    |Active page index: 0
    |Last browser action: python
    |goto('https://www.whitehouse.gov/about-the-white-house/presidents)
    |Last browser action error: ValueError: Received an empty action.
    |Focused element bid: 2
    |axTree: {'nodes': [{'nodeId': '4', 'ignored': False, 'role': {'type': 'internalRole', 'value': 'RootWebArea'}, 'chromeRole': {'type': 'internalRole', 'value': 144}, 'name': {'type': 'computedString', 'value': '', 'sources': [{'type': 'relatedElement', 'attribute': 'aria-labelledby'}, {'type': 'attribute', 'attribute': 'aria-label'}, {'type': 'attribute', 'attribute': 'aria-label', 'superseded': True}, {'type': 'relatedElement', 'nativeSource': 'title'}, {'type': 'attribute', 'attribute': 'title', 'superseded': True}]}, 'properties': [{'name': 'focusable', 'value': {'type': 'booleanOrUndefined', 'value': True}}, {'name': 'focused', 'value': {'type': 'booleanOrUndefined', 'value': True}}], 'childIds': ['5'], 'backendDOMNodeId': 2, 'frameId': '78F66B30411C54487A1793BF48FE955B'}, {'nodeId': '5', 'ignored': True, 'ignoredReasons': [{'name': 'uninteresting', 'value': {'type': 'boolean', 'value': True}}], 'role': {'type': 'role', 'value': 'none'}, 'chromeRole': {'type': 'internalRole', 'value': 0}, 'parentId': '4', 'childIds': ['6'], 'backendDOMNodeId': 3}, {'nodeId': '6', 'ignored': False, 'role': {'type': 'role', 'value': 'generic'}, 'chromeRole': {'type': 'internalRole', 'value': 88}, 'name': {'type': 'computedString', 'value': '', 'sources': [{'type': 'relatedElement', 'attribute': 'aria-labelledby'}, {'type': 'attribute', 'attribute': 'aria-label'}, {'type': 'attribute', 'attribute': 'title'}]}, 'properties': [], 'parentId': '5', 'childIds': [], 'backendDOMNodeId': 5, 'browsergym_id': '2'}]}
    |CONTENT:
    |
    |
    |INFO:     172.17.0.1:36776 - "POST /execute_action HTTP/1.1" 200 OK
--------------------------------------------------------------------------------
07:50:43 - openhands:DEBUG: stream.py:135 - Adding BrowserOutputObservation id=28 from AGENT
07:50:43 - openhands:INFO: agent_controller.py:229
OBSERVATION
**BrowserOutputObservation**
URL: about:blank
Error: True
Open pages: ['about:blank']
Active page index: 0
Last browser action: python
goto('https://www.whitehouse.gov/about-the-white-house/presidents)
Last browser action error: ValueError: Received an empty action.
Focused element bid: 2
axTree: {'nodes': [{'nodeId': '4', 'ignored': False, 'role': {'type': 'internalRole', 'value': 'RootWebArea'}, 'chromeRole': {'type': 'internalRole', 'value': 144}, 'name': {'type': 'computedString', 'value': '', 'sources': [{'type': 'relatedElement', 'attribute': 'aria-labelledby'}, {'type': 'attribute', 'attribute': 'aria-label'}, {'type': 'attribute', 'attribute': 'aria-label', 'superseded': True}, {'type': 'relatedElement', 'nativeSource': 'title'}, {'type': 'attribute', 'attribute': 'title', 'superseded': True}]}, 'properties': [{'name': 'focusable', 'value': {'type': 'booleanOrUndefined', 'value': True}}, {'name': 'focused', 'value': {'type': 'booleanOrUndefined', 'value': True}}], 'childIds': ['5'], 'backendDOMNodeId': 2, 'frameId': '78F66B30411C54487A1793BF48FE955B'}, {'nodeId': '5', 'ignored': True, 'ignoredReasons': [{'name': 'uninteresting', 'value': {'type': 'boolean', 'value': True}}], 'role': {'type': 'role', 'value': 'none'}, 'chromeRole': {'type': 'internalRole', 'value': 0}, 'parentId': '4', 'childIds': ['6'], 'backendDOMNodeId': 3}, {'nodeId': '6', 'ignored': False, 'role': {'type': 'role', 'value': 'generic'}, 'chromeRole': {'type': 'internalRole', 'value': 88}, 'name': {'type': 'computedString', 'value': '', 'sources': [{'type': 'relatedElement', 'attribute': 'aria-labelledby'}, {'type': 'attribute', 'attribute': 'aria-label'}, {'type': 'attribute', 'attribute': 'title'}]}, 'properties': [], 'parentId': '5', 'childIds': [], 'backendDOMNodeId': 5, 'browsergym_id': '2'}]}
CONTENT:

....
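
One detail visible in the log above: the action string handed to the browser is goto('https://www.whitehouse.gov/about-the-white-house/presidents) with no closing quote (and a leading "python" line), which may be related to the "Received an empty action" error, although the log alone does not confirm that. Below is a minimal sketch of how such a string could be checked before it is sent. It assumes the action is meant to be a single Python-style call; `is_well_formed_action` is a hypothetical helper written for illustration and is not part of OpenHands or BrowserGym.

```python
import ast


def is_well_formed_action(action: str) -> bool:
    """Return True if `action` is a single Python-style call such as goto('...').

    Hypothetical helper for illustration; not part of OpenHands or BrowserGym.
    """
    try:
        # mode="eval" accepts exactly one expression, so a stray prefix line
        # (e.g. "python") or an unterminated string literal is rejected.
        tree = ast.parse(action.strip(), mode="eval")
    except SyntaxError:
        return False
    call = tree.body
    return isinstance(call, ast.Call) and isinstance(call.func, ast.Name)


# Action string as it appears in the log: the closing quote is missing.
logged = "goto('https://www.whitehouse.gov/about-the-white-house/presidents)"
# Action string matching the URL in the original request.
intended = "goto('https://www.whitehouse.gov/about-the-white-house/presidents/')"

print(is_well_formed_action(logged))    # False
print(is_well_formed_action(intended))  # True
```

A check like this only catches syntactically broken actions; it says nothing about whether the named action exists or whether the page can be loaded.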

x66ccff avatar Oct 07 '24 08:10 x66ccff

I think we have just solved this bug here: https://github.com/All-Hands-AI/OpenHands/pull/4226

The fix is too recent to be part of a release yet. If you wish, you can:

  • use the same docker command, but change the image on the last line to ghcr.io/all-hands-ai/openhands:main (it will be unstable at times!)
  • set up a development version, which makes it easier to run whichever version you like or to apply fixes directly: https://github.com/All-Hands-AI/OpenHands/blob/main/Development.md

Please note that the browsing agent is still experimental, and it's possible there are other issues too. As far as I know, we have some plans to revisit it and improve it.

enyst avatar Oct 07 '24 09:10 enyst

I'm going to close this now that we have released a new version. Please let us know if you still run into issues. There are definitely still many improvements needed on the browser.

mamoodi avatar Nov 01 '24 17:11 mamoodi

This seems oddly similar, so linking it just in case: https://github.com/All-Hands-AI/OpenHands/issues/7861

AnnoyingTechnology avatar Apr 15 '25 09:04 AnnoyingTechnology