[Bug]: OBSERVATION sudo: unable to resolve host opendevin_sandbox: Temporary failure in name resolution
Is there an existing issue for the same bug?
- [X] I have checked the troubleshooting document at https://opendevin.github.io/OpenDevin/modules/usage/troubleshooting
- [X] I have checked the existing issues.
Describe the bug
What's unusual with your very cool product Devin is that it's difficult to be certain what's an "actual bug" and what's a "meta bug".
Devin's job is clearly to deal with bugs that arise in its own development cycle, but at present I'm not certain when to "come to its aid".
This does look to me like some sort of config error in Devin's setup, though.
Apologies in advance, and please do close if not.
Longer report below.
Current Version
0.4.0
Installation and Configuration
```sh
docker run \
    -e LLM_PROVIDER="ollama" \
    -e LLM_API_KEY="ollama" \
    -e LLM_BASE_URL="http://host.docker.internal:11434" \
    -e WORKSPACE_MOUNT_PATH=$WORKSPACE_BASE \
    -v $WORKSPACE_BASE:/opt/workspace_base \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -p 8889:8888 -p 3000:3000 \
    --add-host host.docker.internal=host-gateway \
    ghcr.io/opendevin/opendevin:0.4.0
```
Model and Agent
Ollama llama3:70b (running locally)
Agent: CmdRunAction
Reproduction Steps
Initialise with the model above and commence with the default agent ("Monologue Agent").
Enter the following "brief" into the chat:
"Use Whisper from OpenAI to develop an application which can take local file test.aac as input, and transcribe it to a text document"
Logs, Errors, Screenshots, and Additional Context
STEP 10
08:44:21 - PLAN Use Whisper from OpenAI to develop an application which can take local file test.aac as input, and transcribe it to a text document (please ignore below, unless it's relevant)
08:45:14 - ACTION
CmdRunAction(command='sudo apt-get update && sudo apt-get install -y ffmpeg', background=False, action='run')
08:47:10 - OBSERVATION
sudo: unable to resolve host opendevin_sandbox: Temporary failure in name resolution
Get:1 http://security.ubuntu.com/ubuntu jammy-security InRelease [110 kB]
Get:2 http://archive.ubuntu.com/ubuntu jammy InRelease [270 kB]
Get:3 http://archive.ubuntu.com/ubuntu jammy-updates InRelease [119 kB]
...
Get:169 http://archive.ubuntu.com/ubuntu jammy/universe amd64 pocketsphinx-en-us all 0.8.0+real5prealpha+1-14ubuntu1 [27.6 MB]
Fetched 139 MB in 1min 43s (1356 kB/s)
E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/a/alsa-lib/libasound2-data_1.2.6.1-1ubuntu1_all.deb 400 Bad Request [IP: 91.189.91.83 80]
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
I'm sure you're already considering this, but it would be helpful to have more verbose levels of logging/debug/diagnostics, where the source(s) of an error, i.e. the environment it originates from, are really clear. Besides the issue reported above, I can't, for example, reconcile the console output from the local docker environment with the chat reports (in the UI). Not a problem as such, just mentioning it generically.
Happy to raise this in https://github.com/OpenDevin/OpenDevin/discussions/categories/ideas if that's helpful.
Or I may even end up creating a PR.
Thanks SmartManoj :) That's a helpful article.
Were you saying you know that is THE solution here or just attaching it for reference?
Just a reference
Thanks for your help.
I think I also need to get more to the core of the issue if possible. Questions like: (a) which environment does it stem from (the host, or one of the docker environments)? (b) does the issue affect others? (c) is the config change that needs to be made a prerequisite installation step for everyone? (I think not.)
I don't see the same issue (tried 0.4.0 as well, with llama3-70b on local ollama and your prompt). For me opendevin gets into trouble at another step, but I agree we need better instrumentation at different levels, to be able to see prompts, responses, etc. It feels like llama3-70b might be capable of working with OD, but differences in reactions to prompts vs. gpt4/claude are possibly driving the agent off the rails, and it is difficult to debug.
For the record this is the issue that comes up for me with your prompt:
==============
STEP 16
12:28:33 - PLAN
Use Whisper from OpenAI to develop an application which can take local file test.aac as input, and transcribe it to a text document
12:28:37 - ACTION
NullAction(action='null')
12:28:37 - OBSERVATION
'action' key is not found in action={}
==============
STEP 17
12:28:37 - PLAN
Use Whisper from OpenAI to develop an application which can take local file test.aac as input, and transcribe it to a text document
12:28:44 - opendevin:ERROR: agent_controller.py:102 - Error in loop
Traceback (most recent call last):
File "/app/.venv/lib/python3.12/site-packages/json_repair/json_repair.py", line 360, in repair_json
parsed_json = json.loads(json_str)
^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/json/decoder.py", line 340, in decode
raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 79)
...
Thanks @barsuna, yes, I've seen that issue too, more than once. Also, for the record, here's one example:
==============
STEP 1
18:39:54 - PLAN
Use Whisper from OpenAI to develop an application which can take a .aac file as input, and transcribe it to a text document
18:40:13 - opendevin:ERROR: agent_controller.py:102 - Error in loop
Traceback (most recent call last):
File "/app/.venv/lib/python3.12/site-packages/json_repair/json_repair.py", line 360, in repair_json
parsed_json = json.loads(json_str)
^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/json/decoder.py", line 340, in decode
raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 7 column 1 (char 93)
"It feels like llama3-70b might be capable to work with OD, but probably differences in reactions to prompts vs gpt4/claude are possibly driving agent off-rails and it is difficult to debug."
Makes total sense. I have a multi-GPU setup locally, though, and I'm keen to use it.
Having said that, Groq's performance is breathtaking, but I run out of credit after the first few calls, and they don't seem to offer a paid service yet.
So I'm likely to be using llama3-70b (local) for the time being... Should I report bugs, or will they effectively be spam? (Who should I ask?)
(Maybe there's somewhere better for a conversation like this.)
@barsuna LLM logs are stored in the logs/llm folder.
You can use DEBUG=1 in your environment if necessary, and yes, check the logs/llm folder.
Ouch to that JSON error. The LLM returned some response that doesn't respect what was requested of it. However, opendevin should behave more gracefully here, and return the error to the LLM to hopefully self-correct next time, not fail with 'error in loop'. That was an oversight, will fix.
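For reference, that failure mode is trivial to reproduce outside OpenDevin; any text after the first JSON value triggers it (a minimal repro, values illustrative):

```sh
python3 -c 'import json; json.loads("{\"action\": \"run\"}\ntrailing text")'
# json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 18)
```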
It would be interesting, although not necessary for this, if you could get the response log file that led to that error and post it here. I'd love to see how it was broken.
Re: llama3 and differences in prompts. FWIW I tend to think the same; that must have some (unintended) effect. One thing that has sometimes been discussed in issues here, and that could help, is to have an agent specialized or optimized for local LLMs, or for some particular ones. If you wish to try such an experiment, it will be welcome, I'm sure.
Re: the original error. opendevin_sandbox is the ssh docker container in which the 'run' command is supposed to execute; you can find the id of the sandbox container by inspecting in docker. I'm not sure why that error happened, though. It may help to upgrade from 0.4.0, because there have been multiple recent fixes to the sandbox, including some related to networking.
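If it keeps happening, here is a possible workaround to try (a sketch, assuming the sandbox container simply lacks its own hostname in /etc/hosts, which is the usual cause of that sudo warning):

```sh
# find the sandbox container, then map its hostname to loopback inside it
docker ps --filter "name=sandbox"
docker exec <sandbox-container-id> sh -c 'echo "127.0.0.1 $(hostname)" >> /etc/hosts'
```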
This is a strange bug. Seems to me like it's an issue on the host system, but would love to hear what the resolution is if you find one
@gbenaa @barsuna Could you please do
poetry show json_repair
in opendevin's directory and post the result? It should be clear from release, but just double-checking, if you don't mind.
Sure @enyst, see below:
RuntimeError
The Poetry configuration is invalid:
- Additional properties are not allowed ('group' was unexpected)
at /usr/lib/python3/dist-packages/poetry/core/factory.py:43 in create_poetry
39│ message = ""
40│ for error in check_result["errors"]:
41│ message += " - {}\n".format(error)
42│
→ 43│ raise RuntimeError("The Poetry configuration is invalid:\n" + message)
44│
45│ # Load package
46│ name = local_config["name"]
47│ version = local_config["version"]
I'd be happy to paste any more diagnostics I can; I'm still trying to work out the best ways to access them, especially comprehensive ones. To set DEBUG=1, which @enyst described above: I'm taking it that just refers to an environment variable, so I've added it to my docker run command, as below (please kindly let me know if I've misunderstood):
docker run -e DEBUG=1 -e LLM_BASE_URL=... etc
Happy to report any findings from e.g. below (in the docker environment) whenever helpful:
/app/logs/opendevin_2024-05....log
Thanks @all for your suggestions; I'll do some more testing with these additional tips.
@enyst inside the containers (0.4.0) I see no poetry; this is the version of json_repair as shown by pip:
root@66b3782b756c:/app# pip list | grep json_repair
json_repair 0.14.0
For the JSON issue: please try the newer version, 0.5.2. It includes an updated json_repair library, and that matters: it has fixes for some weird issues, and I think opendevin has had related fixes since 0.4.0 as well. I took a look at this and I don't think it should happen anymore, but if it does, please do let us know.
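For reference, pulling the newer tag should be all that's needed, assuming the rest of the run command stays the same:

```sh
docker pull ghcr.io/opendevin/opendevin:0.5.2
# then rerun the docker run command from the installation section with the :0.5.2 tag
```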
@gbenaa You're right, with -e. I saw mixed reports on whether it is even needed, for some it is. In any case it should save more detailed logs now for you. Take a look in /app/logs/llm, in addition to the general opendevin log, it should have easily readable files with the prompts, and more importantly, the exact responses sent by the LLM.
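For example, from the host (the container name/id will differ in your setup):

```sh
docker exec -it <opendevin-container-id> ls -R /app/logs/llm/
```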
Not necessary, just somewhat related to your questions about getting information: if you want, you could also try the development setup. I mean, you can run opendevin on docker, as you do, or you can run it this way: https://github.com/OpenDevin/OpenDevin/blob/main/Development.md
Thanks @gbenaa, I don't see the JSON error in 0.5.2 anymore. The model does indeed at times produce empty output... something to debug on the LLM side, not an OpenDevin worry.
I did hit an issue in 0.5.2 that initially precluded me from testing all of the above: the agent would hit an init error:
10:04:10 - opendevin:ERROR: exec_box.py:59 - Error creating controller. Please check Docker is running and visit `https://opendevin.github.io/OpenDevin/modules/usage/troubleshooting` for more debugging information.
10:04:10 - opendevin:ERROR: agent.py:168 - Error creating controller: Error while fetching server API version: ('Connection aborted.', PermissionError(13, 'Permission denied'))
It seems to be due to the fact that in the newer version SANDBOX_USER_ID is required and the insides of the OD container run under that user. In entrypoint.sh the 'enduser' user is created and assigned to the docker group. The problem in my case was that the docker group (whose id was correctly identified) didn't exist inside the container.
I ended up changing entrypoint.sh like this to make it work:
echo "Docker socket group id: $DOCKER_SOCKET_GID"
echo "Creating group"
groupadd -g $DOCKER_SOCKET_GID -U enduser docker
#usermod -aG $DOCKER_SOCKET_GID enduser
Perhaps something to update in entrypoint.sh and/or the troubleshooting page.
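For anyone verifying the same fix, a quick check inside the container (standard shadow/coreutils commands assumed):

```sh
stat -c '%g' /var/run/docker.sock   # gid that owns the mounted socket
getent group docker                 # group should now exist with the same gid
id enduser                          # and enduser should be a member of it
```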
Now my testing and experiments are paused again: the combined power draw from the GPUs is tripping the power supply's overcurrent protection... need to attend to power issues now :)
For others trying to run llama3 or other models locally: I've made a small wrapper API around the llama.cpp server that converts ollama API calls to llama.cpp API calls (only /generate really seems to be needed anyway). It allows observing prompts and responses in real time... impressive to watch the inner monologue grow...
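To illustrate the translation (assuming the default ports; the endpoints and fields are the two servers' public APIs):

```sh
# ollama-style call that OpenDevin issues:
curl http://localhost:11434/api/generate \
  -d '{"model": "llama3:70b", "prompt": "Hello", "stream": false}'

# roughly equivalent llama.cpp server call the wrapper forwards to:
curl http://localhost:8080/completion \
  -d '{"prompt": "Hello", "n_predict": 256}'
```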
Thanks for all the updates @barsuna @enyst.
Using 0.5.2* now as you suggested, but I'm still getting an error (see below); it seems pretty much the same as before. FYI, here is the json_repair version within the container (via pip):
root@ddd3fb2a706d:/app# pip list | grep json_repair
json_repair 0.16.3
Error detail as follows (not sure if you want the whole log from /app/logs/opendevin_ .. .log?):
==============
STEP 10
15:00:50 - opendevin:INFO: prompt.py:185
INFO
HINT:
Look at your last thought in the history above. What does it suggest? Don't think anymore--take action.
15:01:24 - opendevin:ERROR: agent_controller.py:151 - Error in loop
Traceback (most recent call last):
File "/app/opendevin/controller/agent_controller.py", line 146, in _run
finished = await self.step(i)
^^^^^^^^^^^^^^^^^^
File "/app/opendevin/controller/agent_controller.py", line 269, in step
action = self.agent.step(self.state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/agenthub/planner_agent/agent.py", line 46, in step
action = parse_response(action_resp)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/agenthub/planner_agent/prompt.py", line 208, in parse_response
action_dict = json.loads(response)
^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/json/decoder.py", line 340, in decode
raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 6 column 2 (char 79)
15:01:24 - opendevin:INFO: agent_controller.py:201 - Setting agent state from AgentState.RUNNING to AgentState.ERROR
*I double-checked I am using 0.5.2
@gbenaa can you please check in ./logs/llm/, there should be directories with the time of the run, and in them, the prompts and responses exchanged?
Edit: aaah, planner agent. The planner agent is not using our attempts to fix JSON. There is a PR for that. Can you try the CodeAct agent meanwhile?
@enyst @barsuna Happy to test this whenever's good
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.
@gbenaa just wanted to follow up. Do you still see your issue with the 0.6.2 version?
Going to close this issue. Please try 0.6.2 or main and open any issues you run into.