Fix tmux zombie processes when using Docker Runtime
I've been noticing that OpenHands generates a lot of zombie processes in its container. It looks like this got reported here back in August 2025: https://github.com/OpenHands/OpenHands/issues/10189
And then the OP submitted a PR for the issue here (also in August 2025): https://github.com/OpenHands/OpenHands/pull/10190
The PR was never reviewed or merged and was eventually closed. I strongly encourage the team to look into this issue, because the zombies pile up until the agent gets stuck. So far, the only way I've found to recover is to completely restart the runtime.
Update: I've been running #10190 locally and it's not a perfect fix, either. There are cases in which a command with a lot of parts to it completes successfully, but OpenHands doesn't recognize that the command is done. I'm also still seeing a lot of zombie processes over time.
It looks like the solution to this issue might be as simple as changing these lines in openhands.runtime.impl.docker.docker_runtime.DockerRuntime.init_container:
self.container = self.docker_client.containers.run(
    self.runtime_container_image,
    command=command,
    # Override the default 'bash' entrypoint because the command is a binary.
    entrypoint=[],
    network_mode=network_mode,
    ports=port_mapping,
    working_dir='/openhands/code/',  # do not change this!
    name=self.container_name,
    detach=True,
    environment=environment,
    volumes=volumes,  # type: ignore
    mounts=overlay_mounts,  # type: ignore
    device_requests=device_requests,
    **(self.config.sandbox.docker_runtime_kwargs or {}),
)
To:
self.container = self.docker_client.containers.run(
    self.runtime_container_image,
    # Use Docker's tini init process to ensure proper signal handling and reaping of
    # zombie child processes.
    init=True,
    command=command,
    # Override the default 'bash' entrypoint because the command is a binary.
    entrypoint=[],
    network_mode=network_mode,
    ports=port_mapping,
    working_dir='/openhands/code/',  # do not change this!
    name=self.container_name,
    detach=True,
    environment=environment,
    volumes=volumes,  # type: ignore
    mounts=overlay_mounts,  # type: ignore
    device_requests=device_requests,
    **(self.config.sandbox.docker_runtime_kwargs or {}),
)
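One caveat with passing init=True explicitly: docker_runtime_kwargs is unpacked into the same call, so if a user's config already sets init there, Python raises a TypeError for the duplicated keyword argument. A minimal sketch of one way around that is to merge the default into the kwargs instead. Note that `build_run_kwargs` is a hypothetical helper for illustration, not something that exists in OpenHands.

```python
from typing import Any, Optional


def build_run_kwargs(user_kwargs: Optional[dict[str, Any]] = None) -> dict[str, Any]:
    """Merge a default of init=True with user-supplied Docker kwargs.

    Hypothetical helper: user-supplied values win, so an explicit
    init=False in docker_runtime_kwargs still disables the init process.
    """
    merged: dict[str, Any] = {"init": True}
    merged.update(user_kwargs or {})
    return merged
```

The call would then end with `**build_run_kwargs(self.config.sandbox.docker_runtime_kwargs)` instead of a separate init=True line. By the same logic, setting init in docker_runtime_kwargs might work as a workaround on an unpatched install, though that is untested here.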
Why does this work? The problem is that micromamba was becoming PID 1 in the runtime container, but micromamba doesn't handle signals or child reaping the way an init process should. Specifically, PID 1 is expected to reap zombie processes, including orphaned descendants that get reparented to it when their own parent exits.
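For anyone unfamiliar with what "reaping" means here, a small Linux-only sketch: a child that has exited stays in the process table in state Z until its parent calls wait on it. The function names below are made up for this illustration.

```python
import subprocess
import sys
import time
from typing import Optional


def proc_state(pid: int) -> Optional[str]:
    """Return the one-letter state from /proc/<pid>/stat, or None if the process is gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            stat = f.read()
    except OSError:
        return None
    # Format is "pid (comm) state ..."; comm may contain spaces or parentheses.
    return stat.rsplit(")", 1)[1].split()[0]


def zombie_lifecycle() -> tuple:
    """Start a short-lived child and report its state before and after reaping."""
    child = subprocess.Popen([sys.executable, "-c", "pass"])
    deadline = time.monotonic() + 10
    # Poll /proc rather than child.poll(), since poll() would itself reap the child.
    while proc_state(child.pid) != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
    before = proc_state(child.pid)
    child.wait()  # calls waitpid(), which reaps the zombie
    after = proc_state(child.pid)
    return before, after
```

If the parent never waits and then exits, the zombie is handed to PID 1, which is why it matters that PID 1 actually reaps.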
Right now I'm running with this setting on and it seems to be working! Before, PID 1 in the container showed as:
root 1 0.0 0.0 104916 10584 ? Ssl 04:00 0:00 /openhands/micromamba/bin/micromamba run -n openhands poetry run python -u -m openhands.runtime.action_execution_server 3004...
Now it shows as:
root 1 0.0 0.0 992 696 ? Ss 05:48 0:00 /sbin/docker-init -- /openhands/micromamba/bin/micromamba run -n openhands poetry run python -u -m openhands.runtime.action...
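To check whether zombies are still accumulating, rather than eyeballing ps, something like the following can be run inside the container. It is a sketch that assumes a Linux /proc filesystem, and `list_zombies` is a hypothetical name.

```python
import os


def list_zombies() -> list:
    """Return (pid, ppid, comm) for every process currently in state Z."""
    zombies = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/stat") as f:
                stat = f.read()
        except OSError:
            continue  # process exited (or is unreadable) while scanning
        # Format is "pid (comm) state ppid ..."; comm may contain parentheses.
        head, tail = stat.rsplit(")", 1)
        comm = head.split("(", 1)[1]
        state, ppid = tail.split()[:2]
        if state == "Z":
            zombies.append((int(entry), int(ppid), comm))
    return zombies


if __name__ == "__main__":
    for pid, ppid, comm in list_zombies():
        print(f"zombie pid={pid} ppid={ppid} comm={comm}")
```

The ppid column is the useful part: an init process can only reap its own children, including orphans reparented to it. A zombie whose parent is some other live process will stay until that parent waits on it or exits.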
i tried using tini to create an init process within the runtime to allow proper reaping, but the 1.x refactor was close to the horizon and the team was focused on that at the time. correspondingly it didn’t seem worthwhile to attempt such contortions until things stabilized a bit… i suspect that’s the problem though… micromamba isn’t able to properly reap zombies (or wasn’t last i checked)
@wolfspyre That makes sense. For what it's worth, init=True is working really well for me!