[Bug]: Container Startup Failure When Running Multiple OpenHands Tasks Concurrently

Open Hambaobao opened this issue 10 months ago • 2 comments

Is there an existing issue for the same bug?

  • [x] I have checked the existing issues.

Describe the bug and reproduction steps

Description

When running multiple OpenHands tasks concurrently, some tasks fail with a container startup error. Running tasks individually does not produce this issue.

Reproduce

with ProcessPoolExecutor(max_workers=args.max_workers) as executor:
    # use subprocess.run() to run openhands
    # cmd: python -m openhands.core.main some arguments
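A fuller version of this reproduction might look like the sketch below. It is not the reporter's actual script: the helper names `run_task` and `run_all` are hypothetical, and a harmless stand-in command replaces `python -m openhands.core.main`, whose arguments are not given in the report.

```python
import subprocess
import sys
from concurrent.futures import ProcessPoolExecutor


def run_task(cmd):
    """Run one command in a child process and return (returncode, stderr)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stderr


def run_all(commands, max_workers):
    """Run every command concurrently, at most max_workers at a time."""
    with ProcessPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(run_task, commands))


if __name__ == "__main__":
    # Stand-in command (assumption): the report launched
    # `python -m openhands.core.main` with arguments not shown in the issue.
    commands = [[sys.executable, "-c", "print('task done')"] for _ in range(4)]
    for returncode, stderr in run_all(commands, max_workers=4):
        print(returncode, stderr.strip())
```

Capturing stderr per task makes it easier to tell which of the concurrent runs hit the container startup error.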

OpenHands Installation

Development workflow

OpenHands Version

0.23.0

Operating System

Linux

Logs, Errors, Screenshots, and Additional Context

13:14:53 - openhands:INFO: docker_runtime.py:139 - [runtime mcfletch__--__pyopengl-2156d576f195d9f0] Starting runtime with image: build-env:base-env-v0.1.0
13:15:53 - openhands:ERROR: docker_runtime.py:297 - [runtime mcfletch__--__pyopengl-2156d576f195d9f0] Error: Instance openhands-runtime-mcfletch__--__pyopengl-2156d576f195d9f0 FAILED to start container!

13:15:53 - openhands:ERROR: docker_runtime.py:301 - [runtime mcfletch__--__pyopengl-2156d576f195d9f0] UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)
ERROR:root:  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/root/bofeng.zl/workspace/build-env/submodules/OpenHands-0.23.0/openhands/core/main.py", line 276, in <module>
    asyncio.run(
  File "/root/miniconda3/envs/build-env/lib/python3.12/asyncio/runners.py", line 195, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/build-env/lib/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/build-env/lib/python3.12/asyncio/base_events.py", line 691, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/root/bofeng.zl/workspace/build-env/submodules/OpenHands-0.23.0/openhands/core/main.py", line 99, in run_controller
    await runtime.connect()
  File "/root/bofeng.zl/workspace/build-env/submodules/OpenHands-0.23.0/openhands/runtime/impl/docker/docker_runtime.py", line 142, in connect
    await call_sync_from_async(self._init_container)
  File "/root/bofeng.zl/workspace/build-env/submodules/OpenHands-0.23.0/openhands/utils/async_utils.py", line 18, in call_sync_from_async
    result = await coro
             ^^^^^^^^^^
  File "/root/miniconda3/envs/build-env/lib/python3.12/concurrent/futures/thread.py", line 59, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/bofeng.zl/workspace/build-env/submodules/OpenHands-0.23.0/openhands/utils/async_utils.py", line 17, in <lambda>
    coro = loop.run_in_executor(None, lambda: fn(*args, **kwargs))
                                              ^^^^^^^^^^^^^^^^^^^
  File "/root/bofeng.zl/workspace/build-env/submodules/OpenHands-0.23.0/openhands/runtime/impl/docker/docker_runtime.py", line 303, in _init_container
    raise e
  File "/root/bofeng.zl/workspace/build-env/submodules/OpenHands-0.23.0/openhands/runtime/impl/docker/docker_runtime.py", line 261, in _init_container
    self.container = self.docker_client.containers.run(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/build-env/lib/python3.12/site-packages/docker/models/containers.py", line 876, in run
    container = self.create(image=image, command=command,
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/build-env/lib/python3.12/site-packages/docker/models/containers.py", line 935, in create
    resp = self.client.api.create_container(**create_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/build-env/lib/python3.12/site-packages/docker/api/container.py", line 440, in create_container
    return self.create_container_from_config(config, name, platform)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/build-env/lib/python3.12/site-packages/docker/api/container.py", line 456, in create_container_from_config
    res = self._post_json(u, data=config, params=params)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/build-env/lib/python3.12/site-packages/docker/api/client.py", line 303, in _post_json
    return self._post(url, data=json.dumps(data2), **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/build-env/lib/python3.12/site-packages/docker/utils/decorators.py", line 44, in inner
    return f(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/build-env/lib/python3.12/site-packages/docker/api/client.py", line 242, in _post
    return self.post(url, **self._set_request_timeout(kwargs))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/build-env/lib/python3.12/site-packages/requests/sessions.py", line 637, in post
    return self.request("POST", url, data=data, json=json, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/build-env/lib/python3.12/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/build-env/lib/python3.12/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/build-env/lib/python3.12/site-packages/requests/adapters.py", line 713, in send
    raise ReadTimeout(e, request=request)

ERROR:root:<class 'requests.exceptions.ReadTimeout'>: UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)
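The traceback ends in a 60-second read timeout on the Docker socket while `containers.run` is creating the container, which is consistent with the daemon responding slowly when many containers are created at once. One possible mitigation, not confirmed as a fix in this thread, is to retry the failing call with exponential backoff. The sketch below is generic: `retry_on_timeout` and its parameters are hypothetical names and are not part of OpenHands or the Docker SDK.

```python
import time


def retry_on_timeout(fn, attempts=3, base_delay=1.0,
                     exceptions=(TimeoutError,), sleep=time.sleep):
    """Call fn(); on a listed exception, wait base_delay * 2**n and retry."""
    for attempt in range(attempts):
        try:
            return fn()
        except exceptions:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
```

To apply this to the error above, one would pass `exceptions=(requests.exceptions.ReadTimeout,)`, since that class does not derive from the built-in `TimeoutError`. A retried create may collide with a container that the daemon did eventually create under the same name, so some cleanup by name could be needed first. Separately, the Docker SDK for Python accepts a `timeout` argument on `docker.from_env`, and 60 seconds is its default. Whether OpenHands exposes that setting is not stated here.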

Hambaobao, Feb 28 '25 05:02

Does this happen in the latest 0.27 version?

mamoodi, Feb 28 '25 14:02

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot], Mar 31 '25 02:03

This issue was closed because it has been stalled for over 30 days with no activity.

github-actions[bot], Apr 08 '25 02:04

Yes this still happens with 0.48.

wyc1997, Jul 15 '25 06:07