actions-runner-controller
actions runner pods error
Checks
- [X] I've already read https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners-with-actions-runner-controller/troubleshooting-actions-runner-controller-errors and I'm sure my issue is not covered in the troubleshooting guide.
- [X] I am using charts that are officially provided
Controller Version
actions-runner-controller-0.22.0
Deployment Method
Helm
Checks
- [X] This isn't a question or user support case (For Q&A and community support, go to Discussions).
- [X] I've read the Changelog before submitting this issue and I'm sure it's not due to any recently-introduced backward-incompatible changes
To Reproduce
Install the controller.
Configure a RunnerDeployment with 10 replicas.
Configure a HorizontalRunnerAutoscaler that scales between 10 and 50 replicas.
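For reference, the setup described in these steps would look roughly like the manifests below. This is a minimal sketch, not the exact manifests from the affected cluster: the runner name, namespace, organization, label, and resource values are taken from the controller logs in this issue, while the autoscaler name and the `PercentageRunnersBusy` metric with its thresholds are assumptions made only for illustration.

```yaml
# Sketch only. Values not visible in this issue are marked as assumptions.
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: github-action-small            # from the controller logs
  namespace: actions-runner-systems    # from the controller logs
spec:
  replicas: 10                         # initial replica count from the repro steps
  template:
    spec:
      organization: prosperllc         # RUNNER_ORG in the pod spec
      labels:
        - pspr-utils-linux-np-small    # RUNNER_LABELS in the pod spec
      resources:
        limits:
          cpu: "2"
          memory: 4Gi
        requests:
          cpu: 500m
          memory: 2Gi
---
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: github-action-small-autoscaler # hypothetical name
  namespace: actions-runner-systems
spec:
  scaleTargetRef:
    kind: RunnerDeployment
    name: github-action-small
  minReplicas: 10
  maxReplicas: 50
  metrics:                             # assumption: the metric actually used is not shown in this issue
    - type: PercentageRunnersBusy
      scaleUpThreshold: "0.75"
      scaleDownThreshold: "0.25"
      scaleUpFactor: "2"
      scaleDownFactor: "0.5"
```

With the chart defaults shown under Additional Context, each runner pod gets a `docker` sidecar (`docker:24.0.7-dind-alpine3.18`) next to the `runner` container, and the two share the `certs-client` volume mounted at `/certs/client`.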
Describe the bug
Runner pods go into the Error state and report a Docker socket error.
Describe the expected behavior
The controller should spin up new runners based on the number of workflows triggered, up to the autoscaler's configured maximum of 50.
Additional Context
COMPUTED VALUES:
actionsMetrics:
  port: 8443
  proxy:
    enabled: true
    image:
      repository: quay.io/brancz/kube-rbac-proxy
      tag: v0.13.1
  serviceAnnotations: {}
  serviceMonitor: false
  serviceMonitorLabels: {}
actionsMetricsServer:
  affinity: {}
  enabled: false
  fullnameOverride: ""
  imagePullSecrets: []
  ingress:
    annotations: {}
    enabled: false
    hosts:
    - extraPaths: []
      host: chart-example.local
      paths: []
    ingressClassName: ""
    tls: []
  logFormat: text
  nameOverride: ""
  nodeSelector: {}
  podAnnotations: {}
  podLabels: {}
  podSecurityContext: {}
  priorityClassName: ""
  replicaCount: 1
  resources: {}
  secret:
    create: false
    enabled: false
    github_webhook_secret_token: ""
    name: actions-metrics-server
  securityContext: {}
  service:
    annotations: {}
    ports:
    - name: http
      port: 80
      protocol: TCP
      targetPort: http
    type: ClusterIP
  serviceAccount:
    annotations: {}
    create: true
    name: ""
  tolerations: []
additionalVolumeMounts: []
additionalVolumes: []
admissionWebHooks: {}
affinity: {}
authSecret:
  annotations: {}
  create: true
  enabled: true
  github_token:
  name: controller-manager
certManagerEnabled: true
defaultScaleDownDelay: 10m
dockerRegistryMirror: ""
enableLeaderElection: true
env: {}
fullnameOverride: ""
githubWebhookServer:
  affinity: {}
  enabled: false
  fullnameOverride: ""
  imagePullSecrets: []
  ingress:
    annotations: {}
    enabled: false
    hosts:
    - extraPaths: []
      host: chart-example.local
      paths: []
    ingressClassName: ""
    tls: []
  logFormat: text
  nameOverride: ""
  nodeSelector: {}
  podAnnotations: {}
  podDisruptionBudget:
    enabled: false
  podLabels: {}
  podSecurityContext: {}
  priorityClassName: ""
  replicaCount: 1
  resources: {}
  secret:
    create: false
    enabled: false
    github_webhook_secret_token: ""
    name: github-webhook-server
  securityContext: {}
  service:
    annotations: {}
    ports:
    - name: http
      port: 80
      protocol: TCP
      targetPort: http
    type: ClusterIP
  serviceAccount:
    annotations: {}
    create: true
    name: ""
  tolerations: []
  useRunnerGroupsVisibility: false
image:
  actionsRunnerImagePullSecrets: []
  actionsRunnerRepositoryAndTag: summerwind/actions-runner:latest
  dindSidecarRepositoryAndTag: docker:24.0.7-dind-alpine3.18
  pullPolicy: IfNotPresent
  repository: summerwind/actions-runner-controller
imagePullSecrets: []
labels: {}
logFormat: text
metrics:
  port: 8443
  proxy:
    enabled: true
    image:
      repository: quay.io/brancz/kube-rbac-proxy
      tag: v0.13.1
  serviceAnnotations: {}
  serviceMonitor: false
  serviceMonitorLabels: {}
nameOverride: ""
nodeSelector: {}
podAnnotations: {}
podDisruptionBudget:
  enabled: false
podLabels: {}
podSecurityContext: {}
priorityClassName: ""
rbac: {}
replicaCount: 1
resources: {}
runner:
  statusUpdateHook:
    enabled: false
scope:
  singleNamespace: false
  watchNamespace: ""
securityContext: {}
service:
  annotations: {}
  port: 443
  type: ClusterIP
serviceAccount:
  annotations: {}
  create: true
  name: ""
syncPeriod: 1m
tolerations: []
webhookPort: 9443
Controller Logs
2024-03-19T22:21:12Z DEBUG controller-runtime.webhook.webhooks wrote response {"webhook": "/mutate-runner-set-pod", "code": 200, "reason": "", "UID": "33222d85-db32-4f05-b866-858ddb83a913", "allowed": true}
2024-03-19T22:21:12Z INFO runner Created runner pod {"runner": "actions-runner-systems/github-action-small-5r8nj-29dt6", "repository": ""}
2024-03-19T22:21:12Z DEBUG events Created pod 'github-action-small-5r8nj-29dt6' {"type": "Normal", "object": {"kind":"Runner","namespace":"actions-runner-systems","name":"github-action-small-5r8nj-29dt6","uid":"595fdda6-8727-496d-9a9e-ec221d8e9e5a","apiVersion":"actions.summerwind.dev/v1alpha1","resourceVersion":"67959395"}, "reason": "PodCreated"}
2024-03-19T22:21:12Z DEBUG runnerreplicaset Skipped reconcilation because owner is not synced yet {"runnerreplicaset": "actions-runner-systems/github-action-small-5r8nj", "owner": "actions-runner-systems/github-action-small-5r8nj-29dt6", "pods": [{"kind":"Pod","apiVersion":"v1","metadata":{"name":"github-action-small-5r8nj-29dt6","namespace":"actions-runner-systems","uid":"c9d9e756-7f3c-450c-b5c3-578dc6eb462e","resourceVersion":"67959403","creationTimestamp":"2024-03-19T22:21:12Z","labels":{"actions-runner":"","actions-runner-controller/inject-registration-token":"true","pod-template-hash":"749c9b7998","runner-deployment-name":"github-action-small","runner-template-hash":"78ddc6dd8"},"annotations":{"actions-runner-controller/token-expires-at":"2024-03-19T16:10:51-07:00","sync-time":"2024-03-19T22:21:11Z"},"ownerReferences":[{"apiVersion":"actions.summerwind.dev/v1alpha1","kind":"Runner","name":"github-action-small-5r8nj-29dt6","uid":"595fdda6-8727-496d-9a9e-ec221d8e9e5a","controller":true,"blockOwnerDeletion":true}],"managedFields":[{"manager":"manager","operation":"Update","apiVersion":"v1","time":"2024-03-19T22:21:12Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:annotations":{".":{},"f:sync-time":{}},"f:labels":{".":{},"f:actions-runner":{},"f:actions-runner-controller/inject-registration-token":{},"f:pod-template-hash":{},"f:runner-deployment-name":{},"f:runner-template-hash":{}},"f:ownerReferences":{".":{},"k:{\"uid\":\"595fdda6-8727-496d-9a9e-ec221d8e9e5a\"}":{}}},"f:spec":{"f:containers":{"k:{\"name\":\"docker\"}":{".":{},"f:env":{".":{},"k:{\"name\":\"DOCKER_TLS_CERTDIR\"}":{".":{},"f:name":{},"f:value":{}}},"f:image":{},"f:imagePullPolicy":{},"f:lifecycle":{".":{},"f:preStop":{".":{},"f:exec":{".":{},"f:command":{}}}},"f:name":{},"f:resources":{},"f:securityContext":{".":{},"f:privileged":{}},"f:terminationMessagePath":{},"f:terminationMessagePolicy":{},"f:volumeMounts":{".":{},"k:{\"mountPath\":\"/certs/client\"}":{".":{},"f:mountPath":{},"f:name":
{}},"k:{\"mountPath\":\"/runner\"}":{".":{},"f:mountPath":{},"f:name":{}},"k:{\"mountPath\":\"/runner/_work\"}":{".":{},"f:mountPath":{},"f:name":{}}}},"k:{\"name\":\"runner\"}":{".":{},"f:env":{".":{},"k:{\"name\":\"DOCKERD_IN_RUNNER\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"DOCKER_CERT_PATH\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"DOCKER_ENABLED\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"DOCKER_HOST\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"DOCKER_TLS_VERIFY\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"GITHUB_ACTIONS_RUNNER_EXTRA_USER_AGENT\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"GITHUB_URL\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"RUNNER_ENTERPRISE\"}":{".":{},"f:name":{}},"k:{\"name\":\"RUNNER_EPHEMERAL\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"RUNNER_GROUP\"}":{".":{},"f:name":{}},"k:{\"name\":\"RUNNER_LABELS\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"RUNNER_NAME\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"RUNNER_ORG\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"RUNNER_REPO\"}":{".":{},"f:name":{}},"k:{\"name\":\"RUNNER_STATUS_UPDATE_HOOK\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"RUNNER_TOKEN\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"RUNNER_WORKDIR\"}":{".":{},"f:name":{},"f:value":{}}},"f:image":{},"f:imagePullPolicy":{},"f:name":{},"f:resources":{".":{},"f:limits":{".":{},"f:cpu":{},"f:memory":{}},"f:requests":{".":{},"f:cpu":{},"f:memory":{}}},"f:securityContext":{},"f:terminationMessagePath":{},"f:terminationMessagePolicy":{},"f:volumeMounts":{".":{},"k:{\"mountPath\":\"/certs/client\"}":{".":{},"f:mountPath":{},"f:name":{},"f:readOnly":{}},"k:{\"mountPath\":\"/runner\"}":{".":{},"f:mountPath":{},"f:name":{}},"k:{\"mountPath\":\"/runner/_work\"}":{".":{},"f:mountPath":{},"f:name":{}}}}},"f:dnsPolicy":{},"f:enableServiceLinks":{},"f:restartPolicy":{},"f:schedulerName":{},"f:securityContext":{},"f:terminationGracePeriodSeconds
":{},"f:volumes":{".":{},"k:{\"name\":\"certs-client\"}":{".":{},"f:emptyDir":{},"f:name":{}},"k:{\"name\":\"runner\"}":{".":{},"f:emptyDir":{},"f:name":{}},"k:{\"name\":\"work\"}":{".":{},"f:emptyDir":{},"f:name":{}}}}}}]},"spec":{"volumes":[{"name":"runner","emptyDir":{}},{"name":"work","emptyDir":{}},{"name":"certs-client","emptyDir":{}},{"name":"kube-api-access-ldpcz","projected":{"sources":[{"serviceAccountToken":{"expirationSeconds":3607,"path":"token"}},{"configMap":{"name":"kube-root-ca.crt","items":[{"key":"ca.crt","path":"ca.crt"}]}},{"downwardAPI":{"items":[{"path":"namespace","fieldRef":{"apiVersion":"v1","fieldPath":"metadata.namespace"}}]}}],"defaultMode":420}}],"containers":[{"name":"runner","image":"summerwind/actions-runner:latest","env":[{"name":"RUNNER_ORG","value":"prosperllc"},{"name":"RUNNER_REPO"},{"name":"RUNNER_ENTERPRISE"},{"name":"RUNNER_LABELS","value":"pspr-utils-linux-np-small"},{"name":"RUNNER_GROUP"},{"name":"DOCKER_ENABLED","value":"true"},{"name":"DOCKERD_IN_RUNNER","value":"false"},{"name":"GITHUB_URL","value":"https://github.com/"},{"name":"RUNNER_WORKDIR","value":"/runner/_work"},{"name":"RUNNER_EPHEMERAL","value":"true"},{"name":"RUNNER_STATUS_UPDATE_HOOK","value":"false"},{"name":"GITHUB_ACTIONS_RUNNER_EXTRA_USER_AGENT","value":"actions-runner-controller/v0.27.0"},{"name":"DOCKER_HOST","value":"tcp://localhost:2376"},{"name":"DOCKER_TLS_VERIFY","value":"1"},{"name":"DOCKER_CERT_PATH","value":"/certs/client"},{"name":"RUNNER_NAME","value":"github-action-small-5r8nj-29dt6"},{"name":"RUNNER_TOKEN","value":"AI5PDGHSSR55YFZJEKLE44DF7INXW"}],"resources":{"limits":{"cpu":"2","memory":"4Gi"},"requests":{"cpu":"500m","memory":"2Gi"}},"volumeMounts":[{"name":"runner","mountPath":"/runner"},{"name":"work","mountPath":"/runner/_work"},{"name":"certs-client","readOnly":true,"mountPath":"/certs/client"},{"name":"kube-api-access-ldpcz","readOnly":true,"mountPath":"/var/run/secrets/kubernetes.io/serviceaccount"}],"terminationMessagePath":"/dev
/termination-log","terminationMessagePolicy":"File","imagePullPolicy":"Always","securityContext":{}},{"name":"docker","image":"docker:24.0.7-dind-alpine3.18","env":[{"name":"DOCKER_TLS_CERTDIR","value":"/certs"}],"resources":{},"volumeMounts":[{"name":"runner","mountPath":"/runner"},{"name":"certs-client","mountPath":"/certs/client"},{"name":"work","mountPath":"/runner/_work"},{"name":"kube-api-access-ldpcz","readOnly":true,"mountPath":"/var/run/secrets/kubernetes.io/serviceaccount"}],"lifecycle":{"preStop":{"exec":{"command":["/bin/sh","-c","timeout \"${RUNNER_GRACEFUL_STOP_TIMEOUT:-15}\" /bin/sh -c \"echo 'Prestop hook started'; while [ -f /runner/.runner ]; do sleep 1; done; echo 'Waiting for dockerd to start'; while ! pgrep -x dockerd; do sleep 1; done; echo 'Prestop hook stopped'\" >/proc/1/fd/1 2>&1"]}}},"terminationMessagePath":"/dev/termination-log","terminationMessagePolicy":"File","imagePullPolicy":"IfNotPresent","securityContext":{"privileged":true}}],"restartPolicy":"Never","terminationGracePeriodSeconds":30,"dnsPolicy":"ClusterFirst","serviceAccountName":"default","serviceAccount":"default","nodeName":"gke-nonprod-us-west1-default-node-poo-504a47a4-8he7","securityContext":{},"schedulerName":"default-scheduler","tolerations":[{"key":"node.kubernetes.io/not-ready","operator":"Exists","effect":"NoExecute","tolerationSeconds":300},{"key":"node.kubernetes.io/unreachable","operator":"Exists","effect":"NoExecute","tolerationSeconds":300}],"priority":0,"enableServiceLinks":true,"preemptionPolicy":"PreemptLowerPriority"},"status":{"phase":"Pending","conditions":[{"type":"PodScheduled","status":"True","lastProbeTime":null,"lastTransitionTime":"2024-03-19T22:21:12Z"}],"qosClass":"Burstable"}}]}
2024-03-19T22:21:50Z DEBUG runner Runner appears to have been registered and running. {"runner": "actions-runner-systems/github-action-small-5r8nj-lhd8g", "podCreationTimestamp": "2024-03-19 22:21:11 +0000 UTC"}
2024-03-19T22:21:50Z DEBUG runner Runner appears to have been registered and running. {"runner": "actions-runner-systems/github-action-small-5r8nj-flq4v", "podCreationTimestamp": "2024-03-19 22:21:12 +0000 UTC"}
Runner Pod Logs
# Authentication
√ Connected to GitHub
# Runner Registration
√ Runner successfully added
√ Runner connection is good
# Runner settings
√ Settings Saved.
2024-03-19 22:13:18.165 DEBUG --- Runner successfully configured.
{
"agentId": 88824,
"agentName": "github-action-small-5r8nj-hv2gf",
"poolId": 1,
"poolName": "Default",
"ephemeral": true,
"serverUrl": "https://pipelinesghubeus21.actions.githubusercontent.com/tMTkzAKYleoidiHAI9FjPaHPkEkp2s7TIoUW3BW1740YmeFlFo/",
"gitHubUrl": "https://github.com/prosperllc",
"workFolder": "/runner/_work"
2024-03-19 22:13:18.174 DEBUG --- Docker enabled runner detected and Docker daemon wait is enabled
2024-03-19 22:13:18.177 DEBUG --- Waiting until Docker is available or the timeout of 120 seconds is reached
Failed to initialize: unable to resolve docker endpoint: open /certs/client/ca.pem: no such file or directory
Failed to initialize: unable to resolve docker endpoint: open /certs/client/ca.pem: no such file or directory
Failed to initialize: unable to resolve docker endpoint: open /certs/client/ca.pem: no such file or directory
Failed to initialize: unable to resolve docker endpoint: open /certs/client/ca.pem: no such file or directory
Cannot connect to the Docker daemon at tcp://localhost:2376. Is the docker daemon running?
Cannot connect to the Docker daemon at tcp://localhost:2376. Is the docker daemon running?
Cannot connect to the Docker daemon at tcp://localhost:2376. Is the docker daemon running?
Cannot connect to the Docker daemon at tcp://localhost:2376. Is the docker daemon running?
Cannot connect to the Docker daemon at tcp://localhost:2376. Is the docker daemon running?
Cannot connect to the Docker daemon at tcp://localhost:2376. Is the docker daemon running?
Cannot connect to the Docker daemon at tcp://localhost:2376. Is the docker daemon running?
Cannot connect to the Docker daemon at tcp://localhost:2376. Is the docker daemon running?
Cannot connect to the Docker daemon at tcp://localhost:2376. Is the docker daemon running?
We have two clusters configured with actions-runner-controller:
- The first cluster (Kubernetes 1.25.16-gke.1460000) has never had any issues.
- The second cluster (Kubernetes 1.27.9-gke.1092000) consistently hits the issue above. Both clusters use the same Helm chart version; the only difference is the Kubernetes version.
Is there a specific controller version or runner image we need to use, or are any other configuration changes required?
Any suggestions on the above case?
Hi Team,
Any suggestions on the above issue?
Hi Team, any suggestions on the above issue? I see that a few other users have raised similar issues.
Thanks, Sridhar