[Bug][Evaluation]: Docker image build failure - 404 Client Error

Open sfc-gh-goliaro opened this issue 8 months ago • 6 comments

Is there an existing issue for the same bug?

  • [x] I have checked the existing issues.

Describe the bug and reproduction steps

I am getting the following error when running the SWE-bench evaluation with multiple workers. I have not experienced the issue when using a single worker, and the likelihood of encountering the problem seems to be proportional to the number of workers.

Instance django__django-12273 - 2025-04-07 02:47:03,801 - ERROR - Image build failed:
Command '['docker', 'buildx', 'build', '--progress=plain', '--build-arg=OPENHANDS_RUNTIME_VERSION=0.31.0', '--build-arg=OPENHANDS_RUNTIME_BUILD_TIME=2025-04-07T02:45:44.735989', '--tag=ghcr.io/all-hands-ai/runtime:oh_v0.31.0_brmh4jmv7zvvadp6_3e03rar99wpnawnu', '--load', '--platform=linux/amd64', '/tmp/tmpt310o2qy']' returned non-zero exit status 1.
Instance django__django-12273 - 2025-04-07 02:47:03,802 - ERROR - Command output:

Instance django__django-12273 - 2025-04-07 02:47:03,813 - ERROR - ----------
Error in instance [django__django-12273]: Command '['docker', 'buildx', 'build', '--progress=plain', '--build-arg=OPENHANDS_RUNTIME_VERSION=0.31.0', '--build-arg=OPENHANDS_RUNTIME_BUILD_TIME=2025-04-07T02:45:44.735989', '--tag=ghcr.io/all-hands-ai/runtime:oh_v0.31.0_brmh4jmv7zvvadp6_3e03rar99wpnawnu', '--load', '--platform=linux/amd64', '/tmp/tmpt310o2qy']' returned non-zero exit status 1.. Stacktrace:
Traceback (most recent call last):
  File "/home/cortex/.cache/pypoetry/virtualenvs/openhands-ai-UR1gHExr-py3.12/lib/python3.12/site-packages/docker/api/client.py", line 275, in _raise_for_status
    response.raise_for_status()
  File "/home/cortex/.cache/pypoetry/virtualenvs/openhands-ai-UR1gHExr-py3.12/lib/python3.12/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http+docker://localhost/v1.48/containers/openhands-runtime-239226d4-e9cd-4b30-9650-b5fe5e39f0f6-5e7d05b8802f2c6f/json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/data-fast/OpenHands/openhands/runtime/impl/docker/docker_runtime.py", line 136, in connect
    await call_sync_from_async(self._attach_to_container)
  File "/data-fast/OpenHands/openhands/utils/async_utils.py", line 18, in call_sync_from_async
    result = await coro
             ^^^^^^^^^^
  File "/data-fast/miniforge3/envs/openhands/lib/python3.12/concurrent/futures/thread.py", line 59, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data-fast/OpenHands/openhands/utils/async_utils.py", line 17, in <lambda>
    coro = loop.run_in_executor(None, lambda: fn(*args, **kwargs))
                                              ^^^^^^^^^^^^^^^^^^^
  File "/data-fast/OpenHands/openhands/runtime/impl/docker/docker_runtime.py", line 338, in _attach_to_container
    self.container = self.docker_client.containers.get(self.container_name)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cortex/.cache/pypoetry/virtualenvs/openhands-ai-UR1gHExr-py3.12/lib/python3.12/site-packages/docker/models/containers.py", line 954, in get
    resp = self.client.api.inspect_container(container_id)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cortex/.cache/pypoetry/virtualenvs/openhands-ai-UR1gHExr-py3.12/lib/python3.12/site-packages/docker/utils/decorators.py", line 19, in wrapped
    return f(self, resource_id, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cortex/.cache/pypoetry/virtualenvs/openhands-ai-UR1gHExr-py3.12/lib/python3.12/site-packages/docker/api/container.py", line 793, in inspect_container
    return self._result(
           ^^^^^^^^^^^^^
  File "/home/cortex/.cache/pypoetry/virtualenvs/openhands-ai-UR1gHExr-py3.12/lib/python3.12/site-packages/docker/api/client.py", line 281, in _result
    self._raise_for_status(response)
  File "/home/cortex/.cache/pypoetry/virtualenvs/openhands-ai-UR1gHExr-py3.12/lib/python3.12/site-packages/docker/api/client.py", line 277, in _raise_for_status
    raise create_api_error_from_http_exception(e) from e
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cortex/.cache/pypoetry/virtualenvs/openhands-ai-UR1gHExr-py3.12/lib/python3.12/site-packages/docker/errors.py", line 39, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation) from e
docker.errors.NotFound: 404 Client Error for http+docker://localhost/v1.48/containers/openhands-runtime-239226d4-e9cd-4b30-9650-b5fe5e39f0f6-5e7d05b8802f2c6f/json: Not Found ("No such container: openhands-runtime-239226d4-e9cd-4b30-9650-b5fe5e39f0f6-5e7d05b8802f2c6f")

OpenHands Installation

Development workflow

OpenHands Version

main branch (0519e9e3c289e1a93c087c1afcb86db0ca98e7a6)

Operating System

None

Logs, Errors, Screenshots, and Additional Context

The following issues seem to be related: #7568 #6822 #6758

sfc-gh-goliaro avatar Apr 07 '25 03:04 sfc-gh-goliaro

Yeah there's another issue on this. Running parallel docker containers for the eval causes this. The other issue said something about timing.
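If the failure really is timing-related, one way to test that theory is to let only one worker run the build step at a time. The sketch below is not OpenHands code: `LOCK_PATH`, `exclusive_build_lock`, and `build_serialized` are hypothetical names, and it assumes a Unix host where `fcntl.flock` is available.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager

# Hypothetical lock path; any path visible to every worker on the host works.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "runtime-build.lock")


@contextmanager
def exclusive_build_lock(path: str = LOCK_PATH):
    """Block until the caller holds an exclusive advisory lock on `path`."""
    # Every call opens its own file description, so flock() excludes other
    # threads in this process as well as other worker processes.
    with open(path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def build_serialized(build_fn):
    """Run `build_fn` (a stand-in for the image build) one caller at a time."""
    with exclusive_build_lock():
        return build_fn()
```

If the error disappears with the build serialized this way, that points to contention between concurrent builds rather than a problem with any single image.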

mamoodi avatar Apr 15 '25 13:04 mamoodi

I am running into the same error when my worker is set to 1 on commit0.

./evaluation/benchmarks/commit0/scripts/run_infer.sh lite llm.eval HEAD CodeActAgent 16 100 1 wentingzhao/commit0_combined test

21:54:54 - openhands:ERROR: shared.py:378 - ----------
Error in instance [tinydb]: Command '['docker', 'buildx', 'build', '--progress=plain', '--build-arg=OPENHANDS_RUNTIME_VERSION=0.32.0', '--build-arg=OPENHANDS_RUNTIME_BUILD_TIME=2025-04-16T21:54:39.570054', '--tag=ghcr.io/all-hands-ai/runtime:oh_v0.32.0_sql80z77yw7k6ywz_9nugot1jlt200f7k', '--load', '/tmp/tmpcn46dcpi']' returned non-zero exit status 1.. Stacktrace:
Traceback (most recent call last):
  File "/home/ec2-user/.cache/pypoetry/virtualenvs/openhands-ai-6fwN8oGD-py3.12/lib/python3.12/site-packages/docker/api/client.py", line 275, in _raise_for_status
    response.raise_for_status()
  File "/home/ec2-user/.cache/pypoetry/virtualenvs/openhands-ai-6fwN8oGD-py3.12/lib/python3.12/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http+docker://localhost/v1.45/containers/openhands-runtime-ebce230b-803d-457b-a320-288bd448686f-dcd69f3a3fe0ed96/json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/ec2-user/OpenHands/openhands/runtime/impl/docker/docker_runtime.py", line 142, in connect
    await call_sync_from_async(self._attach_to_container)
  File "/home/ec2-user/OpenHands/openhands/utils/async_utils.py", line 18, in call_sync_from_async
    result = await coro
             ^^^^^^^^^^
  File "/home/ec2-user/miniconda3/envs/oh/lib/python3.12/concurrent/futures/thread.py", line 59, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ec2-user/OpenHands/openhands/utils/async_utils.py", line 17, in <lambda>
    coro = loop.run_in_executor(None, lambda: fn(*args, **kwargs))
                                              ^^^^^^^^^^^^^^^^^^^
  File "/home/ec2-user/OpenHands/openhands/runtime/impl/docker/docker_runtime.py", line 347, in _attach_to_container
    self.container = self.docker_client.containers.get(self.container_name)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ec2-user/.cache/pypoetry/virtualenvs/openhands-ai-6fwN8oGD-py3.12/lib/python3.12/site-packages/docker/models/containers.py", line 954, in get
    resp = self.client.api.inspect_container(container_id)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ec2-user/.cache/pypoetry/virtualenvs/openhands-ai-6fwN8oGD-py3.12/lib/python3.12/site-packages/docker/utils/decorators.py", line 19, in wrapped
    return f(self, resource_id, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ec2-user/.cache/pypoetry/virtualenvs/openhands-ai-6fwN8oGD-py3.12/lib/python3.12/site-packages/docker/api/container.py", line 793, in inspect_container
    return self._result(
           ^^^^^^^^^^^^^
  File "/home/ec2-user/.cache/pypoetry/virtualenvs/openhands-ai-6fwN8oGD-py3.12/lib/python3.12/site-packages/docker/api/client.py", line 281, in _result
    self._raise_for_status(response)
  File "/home/ec2-user/.cache/pypoetry/virtualenvs/openhands-ai-6fwN8oGD-py3.12/lib/python3.12/site-packages/docker/api/client.py", line 277, in _raise_for_status
    raise create_api_error_from_http_exception(e) from e
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ec2-user/.cache/pypoetry/virtualenvs/openhands-ai-6fwN8oGD-py3.12/lib/python3.12/site-packages/docker/errors.py", line 39, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation) from e
docker.errors.NotFound: 404 Client Error for http+docker://localhost/v1.45/containers/openhands-runtime-ebce230b-803d-457b-a320-288bd448686f-dcd69f3a3fe0ed96/json: Not Found ("No such container: openhands-runtime-ebce230b-803d-457b-a320-288bd448686f-dcd69f3a3fe0ed96")

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ec2-user/OpenHands/evaluation/utils/shared.py", line 325, in _process_instance_wrapper
    result = process_instance_func(instance, metadata, use_mp, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ec2-user/OpenHands/evaluation/benchmarks/commit0/run_infer.py", line 398, in process_instance
    call_async_from_sync(runtime.connect)
  File "/home/ec2-user/OpenHands/openhands/utils/async_utils.py", line 54, in call_async_from_sync
    result = future.result()
             ^^^^^^^^^^^^^^^
  File "/home/ec2-user/miniconda3/envs/oh/lib/python3.12/concurrent/futures/_base.py", line 456, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/home/ec2-user/miniconda3/envs/oh/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/home/ec2-user/miniconda3/envs/oh/lib/python3.12/concurrent/futures/thread.py", line 59, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ec2-user/OpenHands/openhands/utils/async_utils.py", line 44, in run
    return asyncio.run(arun())
           ^^^^^^^^^^^^^^^^^^^
  File "/home/ec2-user/miniconda3/envs/oh/lib/python3.12/asyncio/runners.py", line 195, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/home/ec2-user/miniconda3/envs/oh/lib/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ec2-user/miniconda3/envs/oh/lib/python3.12/asyncio/base_events.py", line 691, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/home/ec2-user/OpenHands/openhands/utils/async_utils.py", line 37, in arun
    result = await coro
             ^^^^^^^^^^
  File "/home/ec2-user/OpenHands/openhands/runtime/impl/docker/docker_runtime.py", line 156, in connect
    self.runtime_container_image = build_runtime_image(
                                   ^^^^^^^^^^^^^^^^^^^^
  File "/home/ec2-user/OpenHands/openhands/runtime/utils/runtime_build.py", line 137, in build_runtime_image
    result = build_runtime_image_in_folder(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ec2-user/OpenHands/openhands/runtime/utils/runtime_build.py", line 232, in build_runtime_image_in_folder
    _build_sandbox_image(
  File "/home/ec2-user/OpenHands/openhands/runtime/utils/runtime_build.py", line 361, in _build_sandbox_image
    image_name = runtime_builder.build(
                 ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ec2-user/OpenHands/openhands/runtime/builder/docker.py", line 180, in build
    raise subprocess.CalledProcessError(
subprocess.CalledProcessError: Command '['docker', 'buildx', 'build', '--progress=plain', '--build-arg=OPENHANDS_RUNTIME_VERSION=0.32.0', '--build-arg=OPENHANDS_RUNTIME_BUILD_TIME=2025-04-16T21:54:39.570054', '--tag=ghcr.io/all-hands-ai/runtime:oh_v0.32.0_sql80z77yw7k6ywz_9nugot1jlt200f7k', '--load', '/tmp/tmpcn46dcpi']' returned non-zero exit status 1.
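
Reading the traceback above, the 404 looks like the expected "nothing to attach to yet" path: `connect` first tries `_attach_to_container`, and the build only starts while that `NotFound` is being handled. The failure that actually ends the run is the `docker buildx build` call. A minimal sketch of that pattern, using stand-in classes rather than the real `docker` or OpenHands types (all names here are hypothetical):

```python
class ContainerNotFound(Exception):
    """Stand-in for docker.errors.NotFound."""


class ImageBuildFailed(Exception):
    """Stand-in for the subprocess.CalledProcessError raised by the build."""


def connect(attach, build):
    """Try to attach to an existing container; build if there is none."""
    try:
        return attach()
    except ContainerNotFound:
        # An error raised from here carries the NotFound as its __context__,
        # which Python prints as "During handling of the above exception,
        # another exception occurred".
        return build()


def attach():
    raise ContainerNotFound("No such container")


def build():
    raise ImageBuildFailed("build returned non-zero exit status 1")


try:
    connect(attach, build)
except ImageBuildFailed as exc:
    failure = exc  # keep a reference; `exc` is cleared when the block ends

print(type(failure).__name__, "<-", type(failure.__context__).__name__)
```

Under this reading, the 404 at the top of the log is noise, and the build output is where the root cause would show up.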

oootttyyy avatar Apr 16 '25 22:04 oootttyyy

I am also running into the same error when my worker is set to 1:

================ DOCKER BUILD STARTED ================
16:24:08 - openhands:ERROR: docker.py:188 - Image build failed: Command '['docker', 'buildx', 'build', '--progress=plain', '--build-arg=OPENHANDS_RUNTIME_VERSION=0.32.0', '--build-arg=OPENHANDS_RUNTIME_BUILD_TIME=2025-04-17T16:23:37.889824', '--tag=ghcr.io/all-hands-ai/runtime:oh_v0.32.0_hwpawchk4pmzi7mr_xo9s2wluz3mhey2g', '--load', '--platform=linux/amd64', '/tmp/tmpdljs4ffz']' returned non-zero exit status 1.
16:24:08 - openhands:ERROR: docker.py:189 - Command output:

16:24:08 - openhands:ERROR: shared.py:356 - Command '['docker', 'buildx', 'build', '--progress=plain', '--build-arg=OPENHANDS_RUNTIME_VERSION=0.32.0', '--build-arg=OPENHANDS_RUNTIME_BUILD_TIME=2025-04-17T16:23:37.889824', '--tag=ghcr.io/all-hands-ai/runtime:oh_v0.32.0_hwpawchk4pmzi7mr_xo9s2wluz3mhey2g', '--load', '--platform=linux/amd64', '/tmp/tmpdljs4ffz']' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/root/miniconda3/envs/openhands/lib/python3.12/site-packages/docker/api/client.py", line 275, in _raise_for_status
    response.raise_for_status()
  File "/root/miniconda3/envs/openhands/lib/python3.12/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http+docker://localhost/v1.45/containers/openhands-runtime-dded28c6-d4a6-44d1-80b4-a6d3d2fa276d-78eb2fda18af55f1/json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/openHands/OpenHands/openhands/runtime/impl/docker/docker_runtime.py", line 136, in connect
    await call_sync_from_async(self._attach_to_container)
  File "/root/openHands/OpenHands/openhands/utils/async_utils.py", line 18, in call_sync_from_async
    result = await coro
             ^^^^^^^^^^
  File "/root/miniconda3/envs/openhands/lib/python3.12/concurrent/futures/thread.py", line 59, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/openHands/OpenHands/openhands/utils/async_utils.py", line 17, in <lambda>
    coro = loop.run_in_executor(None, lambda: fn(*args, **kwargs))
                                              ^^^^^^^^^^^^^^^^^^^
  File "/root/openHands/OpenHands/openhands/runtime/impl/docker/docker_runtime.py", line 341, in _attach_to_container
    self.container = self.docker_client.containers.get(self.container_name)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/openhands/lib/python3.12/site-packages/docker/models/containers.py", line 954, in get
    resp = self.client.api.inspect_container(container_id)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/openhands/lib/python3.12/site-packages/docker/utils/decorators.py", line 19, in wrapped
    return f(self, resource_id, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/openhands/lib/python3.12/site-packages/docker/api/container.py", line 793, in inspect_container
    return self._result(
           ^^^^^^^^^^^^^
  File "/root/miniconda3/envs/openhands/lib/python3.12/site-packages/docker/api/client.py", line 281, in _result
    self._raise_for_status(response)
  File "/root/miniconda3/envs/openhands/lib/python3.12/site-packages/docker/api/client.py", line 277, in _raise_for_status
    raise create_api_error_from_http_exception(e) from e
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/openhands/lib/python3.12/site-packages/docker/errors.py", line 39, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation) from e
docker.errors.NotFound: 404 Client Error for http+docker://localhost/v1.45/containers/openhands-runtime-dded28c6-d4a6-44d1-80b4-a6d3d2fa276d-78eb2fda18af55f1/json: Not Found ("No such container: openhands-runtime-dded28c6-d4a6-44d1-80b4-a6d3d2fa276d-78eb2fda18af55f1")

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/openHands/OpenHands/evaluation/utils/shared.py", line 325, in _process_instance_wrapper
    result = process_instance_func(instance, metadata, use_mp, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/openHands/OpenHands/./evaluation/benchmarks/swe_bench/run_infer.py", line 578, in process_instance
    call_async_from_sync(runtime.connect)
  File "/root/openHands/OpenHands/openhands/utils/async_utils.py", line 50, in call_async_from_sync
    result = future.result()
             ^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/openhands/lib/python3.12/concurrent/futures/_base.py", line 456, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/openhands/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/root/miniconda3/envs/openhands/lib/python3.12/concurrent/futures/thread.py", line 59, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/openHands/OpenHands/openhands/utils/async_utils.py", line 44, in run
    return asyncio.run(arun())
           ^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/openhands/lib/python3.12/asyncio/runners.py", line 195, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/openhands/lib/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/openhands/lib/python3.12/asyncio/base_events.py", line 691, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/root/openHands/OpenHands/openhands/utils/async_utils.py", line 37, in arun
    result = await coro
             ^^^^^^^^^^
  File "/root/openHands/OpenHands/openhands/runtime/impl/docker/docker_runtime.py", line 150, in connect
    self.runtime_container_image = build_runtime_image(
                                   ^^^^^^^^^^^^^^^^^^^^
  File "/root/openHands/OpenHands/openhands/runtime/utils/runtime_build.py", line 137, in build_runtime_image
    result = build_runtime_image_in_folder(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/openHands/OpenHands/openhands/runtime/utils/runtime_build.py", line 232, in build_runtime_image_in_folder
    _build_sandbox_image(
  File "/root/openHands/OpenHands/openhands/runtime/utils/runtime_build.py", line 361, in _build_sandbox_image
    image_name = runtime_builder.build(
                 ^^^^^^^^^^^^^^^^^^^^^^
  File "/root/openHands/OpenHands/openhands/runtime/builder/docker.py", line 180, in build
    raise subprocess.CalledProcessError(
subprocess.CalledProcessError: Command '['docker', 'buildx', 'build', '--progress=plain', '--build-arg=OPENHANDS_RUNTIME_VERSION=0.32.0', '--build-arg=OPENHANDS_RUNTIME_BUILD_TIME=2025-04-17T16:23:37.889824', '--tag=ghcr.io/all-hands-ai/runtime:oh_v0.32.0_hwpawchk4pmzi7mr_xo9s2wluz3mhey2g', '--load', '--platform=linux/amd64', '/tmp/tmpdljs4ffz']' returned non-zero exit status 1.

DCJsenior avatar Apr 17 '25 08:04 DCJsenior

@xingyaoww if you have a few minutes, any idea what's happening here for these users? Seems like different problems.

mamoodi avatar Apr 18 '25 14:04 mamoodi

> I have not experienced the issue when using a single worker, and the likelihood of encountering the problem seems to be proportional to the number of workers.

Yes - this is a known issue with local Docker when you try to start multiple containers all at once.
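
For the concurrent-start case, a common mitigation is to retry the failing step with exponential backoff and jitter so workers stop colliding. This is a generic sketch, not something OpenHands is known to do: `retry_with_backoff` and its parameters are hypothetical, and it will not help if the build fails deterministically.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fn: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fn` up to `attempts` times, sleeping longer after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error unchanged
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads out workers that all failed at the same moment.
            sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")
```

The `sleep` parameter exists only so the delay can be swapped out in tests.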

For other cases, I'd suggest launching the same command with DEBUG=1 and sharing the full traces. I suspect it might be some platform incompatibility (e.g., running this command on a Mac).
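
In the logs posted above, the "Command output:" line is followed by nothing, so the builder's own error text never made it into the report. When reproducing by hand, a small wrapper like the one below keeps stdout and stderr together and attaches them to the exception. It is a sketch only; `run_and_capture` is a hypothetical helper, not part of OpenHands.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run `cmd`; on failure, raise with the combined stdout/stderr attached."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into stdout so nothing is lost
        text=True,
    )
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(
            proc.returncode, cmd, output=proc.stdout
        )
    return proc.stdout


if __name__ == "__main__":
    # Stand-in command; substitute the failing build command from the log.
    failing = [sys.executable, "-c", "import sys; sys.exit('build error text')"]
    try:
        run_and_capture(failing)
    except subprocess.CalledProcessError as err:
        print("exit status:", err.returncode)
        print("output:", err.output)
```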

xingyaoww avatar Apr 19 '25 16:04 xingyaoww

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] avatar May 20 '25 02:05 github-actions[bot]

This issue was closed because it has been stalled for over 30 days with no activity.

github-actions[bot] avatar May 28 '25 02:05 github-actions[bot]