[Bug]: sandbox timeout can use better handling and flexibility

Open barsuna opened this issue 9 months ago • 1 comments

Is there an existing issue for the same bug?

  • [X] I have checked the troubleshooting document at https://opendevin.github.io/OpenDevin/modules/usage/troubleshooting
  • [X] I have checked the existing issues.

Describe the bug

The current fixed 2-minute timeout leaves the agent uninformed about what is going on and often sends it off the rails. For example, a pip install of a package with large dependencies, or one that requires a build step, often does not complete within 2 minutes: pip install openai-whisper pulls in torch, which in turn pulls in CUDA packages, and so on.

I propose two improvements:

  • Allow changing the timeout in the UI and/or via an environment variable.
  • When a timeout occurs, the output of the process (both stdout and stderr) should be part of the observation. The output currently appears to be collected, but the agent never sees it; it is only logged.

This would let the agent react better to the problem, for example by distinguishing a pip install that simply needs more time from HTTPS/proxy issues where pip connections time out (visible on stderr).
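The second proposal could look roughly like the sketch below. It is not OpenDevin's code: CmdResult and run_with_timeout are hypothetical names, and the sketch only uses Python's standard subprocess module to show that output produced before the timeout can be recovered and handed back to the caller instead of only being logged.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Sequence


@dataclass
class CmdResult:
    """Hypothetical stand-in for the observation handed back to the agent."""
    command: str
    exit_code: int
    content: str
    timed_out: bool = False


def run_with_timeout(command: Sequence[str], timeout: float) -> CmdResult:
    """Run a command; on timeout, kill it and keep whatever it printed so far."""
    proc = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    printable = " ".join(command)
    try:
        out, err = proc.communicate(timeout=timeout)
        return CmdResult(printable, proc.returncode, out + err)
    except subprocess.TimeoutExpired:
        # Kill the process, then call communicate() again: output collected
        # before the timeout is not lost by the retry.
        proc.kill()
        out, err = proc.communicate()
        content = (
            f'Command: "{printable}" timed out after {timeout} seconds\n'
            f"--- stdout so far ---\n{out}"
            f"--- stderr so far ---\n{err}"
        )
        return CmdResult(printable, -1, content, timed_out=True)


if __name__ == "__main__":
    # A slow command that prints to both streams before the timeout hits.
    slow = [
        sys.executable, "-u", "-c",
        "import sys, time; print('downloading...'); "
        "sys.stderr.write('proxy warning\\n'); time.sleep(30)",
    ]
    print(run_with_timeout(slow, timeout=2).content)
```

With the partial stdout and stderr in the observation, the agent can tell a slow but progressing install from one that is stuck on connection errors.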

Current Version

ghcr.io/opendevin/opendevin:0.5.2

Installation and Configuration

docker run \
    --add-host host.docker.internal:host-gateway \
    -e LLM_API_KEY="ollama" \
    -e LLM_BASE_URL="http://host.docker.internal:11434" \
    -e WORKSPACE_MOUNT_PATH=$WORKSPACE_BASE \
    -e SANDBOX_TYPE=exec \
    -e SANDBOX_USER_ID=$(id -u) \
    -v $WORKSPACE_BASE:/opt/workspace_base \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -p 3000:3000 \
    ghcr.io/opendevin/opendevin:0.5.2

Model and Agent

Model: ollama/llama3-70b Agent: MonologueAgent

Reproduction Steps

Ask the agent to pip install openai-whisper

Logs, Errors, Screenshots, and Additional Context

09:59:23 - opendevin:INFO: agent_controller.py:201 - Setting agent state from AgentState.INIT to AgentState.INIT
09:59:23 - opendevin:INFO: agent_controller.py:201 - Setting agent state from AgentState.INIT to AgentState.RUNNING

============== STEP 0

09:59:23 - PLAN install an openai-whisper module using pip
09:59:34 - ACTION NullAction(action='null')
09:59:34 - OBSERVATION AgentErrorObservation(content='Invalid JSON, the response must be well-formed JSON as specified in the prompt.', observation='error')

============== STEP 1

09:59:45 - ACTION CmdRunAction COMMAND: pip install openai-whisper
10:01:45 - opendevin:ERROR: exec_box.py:122 - Command timed out, killing process...
10:01:48 - OBSERVATION CmdOutputObservation(content='Command: "pip install openai-whisper" timed out', command_id=-1, command='pip install openai-whisper', exit_code=-1, observation='run')

============== STEP 2

10:02:00 - ACTION AgentThinkAction(thought='I should try to figure out why the pip install command timed out', action='think')

barsuna avatar May 08 '24 09:05 barsuna

I noticed this problem recently. The same timeout is used for all operations so that the functionality could be implemented quickly and in a unified manner. We will consider using different timeouts for different operations, since operations such as init/install need more time to complete.

iFurySt avatar May 08 '24 12:05 iFurySt

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] avatar Jun 16 '24 01:06 github-actions[bot]

These are good ideas, and it's not stale.

One of these is done: it is possible to set the timeout via env or config.toml, as SANDBOX_TIMEOUT in env or sandbox_timeout in toml.
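As a rough illustration of the env-based setting, the sketch below resolves a timeout from SANDBOX_TIMEOUT and falls back to the 2-minute value described in this issue. How OpenDevin actually parses the variable is not shown in this thread, so resolve_timeout, the fallback rules, and the assumption that the value is in seconds are all hypothetical.

```python
import os
from typing import Mapping, Optional

# The fixed 2-minute timeout described in the issue, in seconds.
DEFAULT_TIMEOUT_SECONDS = 120


def resolve_timeout(env: Optional[Mapping[str, str]] = None) -> int:
    """Return the timeout from SANDBOX_TIMEOUT, or the default if unset or invalid.

    Hypothetical helper: assumes the value is a positive whole number of seconds.
    """
    env = os.environ if env is None else env
    raw = env.get("SANDBOX_TIMEOUT")
    if raw is None or not raw.strip():
        return DEFAULT_TIMEOUT_SECONDS
    try:
        value = int(raw)
    except ValueError:
        # Non-numeric values fall back to the default rather than crashing.
        return DEFAULT_TIMEOUT_SECONDS
    return value if value > 0 else DEFAULT_TIMEOUT_SECONDS


if __name__ == "__main__":
    print(resolve_timeout({"SANDBOX_TIMEOUT": "600"}))
    print(resolve_timeout({}))
```

In a docker run invocation like the one in the report, the variable would be passed with an additional -e SANDBOX_TIMEOUT=... flag.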

enyst avatar Jun 17 '24 18:06 enyst

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] avatar Jul 18 '24 01:07 github-actions[bot]

This issue was closed because it has been stalled for over 30 days with no activity.

github-actions[bot] avatar Jul 25 '24 01:07 github-actions[bot]