WAIT_FOR_DOCKER is not exiting after timeout
Checks
- [X] I've already read https://github.com/actions-runner-controller/actions-runner-controller/blob/master/TROUBLESHOOTING.md and I'm sure my issue is not covered in the troubleshooting guide.
Controller Version
NA
Helm Chart Version
NA
CertManager Version
NA
Deployment Method
Helm
cert-manager installation
yes
Checks
- [X] This isn't a question or user support case (for Q&A and community support, go to Discussions. It might also be a good idea to contract with any of the contributors and maintainers if your business is so critical that you need priority support)
- [X] I've read releasenotes before submitting this issue and I'm sure it's not due to any recently-introduced backward-incompatible changes
- [X] My actions-runner-controller version (v0.x.y) does support the feature
- [X] I've already upgraded ARC (including the CRDs, see charts/actions-runner-controller/docs/UPGRADING.md for details) to the latest and it didn't fix the issue
Resource Definitions
NA
To Reproduce
see description
Describe the bug
Sometimes the Docker container (20.10.17-dind-alpine3.16@sha256:e25a101eb5ee4bc8772e862e908a33a133feb067a6d0d4a19cb7753d64596889) does not come up. The runner waits 2 minutes, but then continues and picks up a job even though the Docker container is still not available. We then see this
Cannot connect to the Docker daemon at tcp://localhost:2376. Is the docker daemon running?
in the GitHub workflow logs.

Could it be that an exit is missing in the runner's entrypoint script?
if [[ "${DISABLE_WAIT_FOR_DOCKER}" != "true" ]] && [[ "${DOCKER_ENABLED}" == "true" ]]; then
  log.debug 'Docker enabled runner detected and Docker daemon wait is enabled'
  log.debug 'Waiting until Docker is available or the timeout is reached'
  timeout 120s bash -c 'until docker ps ;do sleep 1; done'
https://github.com/actions-runner-controller/actions-runner-controller/blob/11cb9b78829f8640ceb3bcb677e5d608dc3299ea/runner/entrypoint.sh
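To illustrate the suspected failure mode, here is a minimal sketch, assuming GNU coreutils timeout and a script that does not run under set -e. The command false stands in for a docker ps that never succeeds, and the limit is shortened from 120s to 2s. Both substitutions are assumptions made for this demo and are not taken from entrypoint.sh.

```shell
#!/usr/bin/env bash
# `false` is a stand-in (assumption) for `docker ps` failing on every attempt.
# Same shape as the entrypoint line, with a 2s limit instead of 120s.
timeout 2s bash -c 'until false; do sleep 1; done'
status=$?

# GNU timeout reports an expired time limit through exit status 124,
# but nothing here acts on that status, so execution simply continues.
echo "timeout exit status: ${status}"
echo "script continues past the wait"
```

Because the non-zero status is never checked, the lines after the wait still run, which would match the runner picking up a job without a Docker daemon.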
Describe the expected behavior
The runner container should not pick up a job when Docker has not started. Ideally, K8s would kill that pod.
Controller Logs
NA
Runner Pod Logs
Runner pod log: https://gist.github.com/erichorwath/26be5fb65eb98b42a6b3eb868a27c3e0
Workflow log: https://gist.github.com/erichorwath/6a3fd5a976dc75f34e8e40e853a6b4cf
Additional Context
No response
@erichorwath Thanks for reporting! Good catch... Sounds like you're correct. Would you mind modifying it to timeout 120s bash -c 'until docker ps ;do sleep 1; done' || exit 1 and confirming whether it works?
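For anyone who wants to try the proposed change before building a custom image, the following is an untested sketch of the guard in isolation. The helper name wait_for is hypothetical and does not exist in entrypoint.sh. The command false again stands in for a Docker daemon that never comes up, and the guarded wait runs in a subshell only so that the demo can print the resulting exit status.

```shell
#!/usr/bin/env bash
# wait_for is a hypothetical helper, not part of entrypoint.sh.
# $1: time limit understood by timeout(1)
# $2: command that is polled once per second until it succeeds
wait_for() {
  local limit="$1" probe="$2"
  timeout "${limit}" bash -c "until ${probe}; do sleep 1; done"
}

# Run the guarded wait in a subshell so this demo can inspect the result.
# `false` stands in for a Docker daemon that never becomes available.
(
  wait_for 2s 'false' || exit 1
  echo 'picking up a job'   # not reached when the wait times out
)
rc=$?
echo "entrypoint would exit with: ${rc}"
```

In the real entrypoint the call would presumably be wait_for 120s 'docker ps' with no subshell, so that a timeout ends the container process instead of letting the runner register.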
Hey @mumoshu, @erichorwath, was there any confirmation that this update to entrypoint.sh was a good solution?
I have not tested it yet. Would you mind giving it a shot if you are affected by the said issue? Thanks!
We are also facing the same issue.
@GopikaV24 Thanks for reporting! Would you mind trying the proposed fix by building a custom runner image, and submitting a PR if it works?
Sure..