actions-runner-controller
resources specification is not working
Checks
- [X] I've already read https://github.com/actions/actions-runner-controller/blob/master/TROUBLESHOOTING.md and I'm sure my issue is not covered in the troubleshooting guide.
- [X] I'm not using a custom entrypoint in my runner image
Controller Version
v1.26.7
Helm Chart Version
0.23.4
CertManager Version
No response
Deployment Method
Helm
cert-manager installation
I followed the installation process described in the documentation: https://github.com/actions/actions-runner-controller/blob/master/docs/installing-arc.md
Checks
- [X] This isn't a question or user support case (For Q&A and community support, go to Discussions. It might also be a good idea to contract with any of the contributors and maintainers if your business is so critical that you need priority support)
- [X] I've read releasenotes before submitting this issue and I'm sure it's not due to any recently-introduced backward-incompatible changes
- [X] My actions-runner-controller version (v0.x.y) does support the feature
- [X] I've already upgraded ARC (including the CRDs, see charts/actions-runner-controller/docs/UPGRADING.md for details) to the latest and it didn't fix the issue
- [X] I've migrated to the workflow job webhook event (if you are using webhook-driven scaling)
Resource Definitions
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: runner-deployment
  namespace: my-runners
spec:
  replicas: 1
  template:
    spec:
      resources:
        limits:
          cpu: "2"
          memory: "5Gi"
        requests:
          cpu: "1"
          memory: "4Gi"
      dockerMTU: 1400
      env:
        - name: ARC_DOCKER_MTU_PROPAGATION
          value: "true"
      githubAPICredentialsFrom:
        secretRef:
          name: controller-manager-my-runners
      volumes:
        - name: docker-secret
          secret:
            secretName: docker-auth
            items:
              - key: .dockerconfigjson
                path: config.json
      organization: my-org
      labels:
        - testing-new-k8s
      containers:
        - name: runner
          resources:
            limits:
              cpu: "1"
              memory: "2Gi"
            requests:
              cpu: "1"
              memory: "2Gi"
          securityContext:
            privileged: true
          volumeMounts:
            - name: docker-secret
              mountPath: "/home/runner/.docker/"
              readOnly: true
        - name: docker
          resources:
            limits:
              cpu: "3"
              memory: "8Gi"
            requests:
              cpu: "2"
              memory: "5Gi"
          securityContext:
            privileged: true
---
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: runner-deployment-autoscaler
  namespace: my-runners
spec:
  githubAPICredentialsFrom:
    secretRef:
      name: controller-manager-my-runners
  scaleTargetRef:
    name: runner-deployment
  minReplicas: 20
  maxReplicas: 30
To Reproduce
1. Run kubectl apply -f self-hosted-runner.yml -n my-runners.
2. Once the runners are created, open a shell in the runner or docker container.
3. Run lscpu and free -g. Both report the node's total CPU and memory and ignore the values I set in the resource definition file (self-hosted-runner.yml).
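A note on step 3: lscpu and free read host-wide information from /proc and /sys, so they normally report the node's totals even when a container limit is in effect. A more direct check is to read the container's cgroup files. The sketch below is not part of ARC. It assumes the node uses cgroup v2 with the files mounted at /sys/fs/cgroup (cgroup v1 uses different file names), and the helper names are hypothetical.

```python
from pathlib import Path
from typing import Optional, Tuple


def parse_cpu_max(text: str) -> Optional[float]:
    """Convert cgroup v2 cpu.max content ("<quota> <period>") to a CPU count.

    Returns None when the quota is "max", meaning no CPU limit is set.
    """
    quota, period = text.split()
    if quota == "max":
        return None
    return int(quota) / int(period)


def parse_memory_max(text: str) -> Optional[int]:
    """Convert cgroup v2 memory.max content to bytes; None means no limit."""
    value = text.strip()
    return None if value == "max" else int(value)


def read_limits(root: str = "/sys/fs/cgroup") -> Optional[Tuple[Optional[float], Optional[int]]]:
    """Read (cpu_limit, memory_limit_bytes) for the current cgroup.

    Returns None if the cgroup v2 files cannot be read, for example on a
    cgroup v1 node (assumption: v2 layout at the default mount point).
    """
    base = Path(root)
    try:
        cpu = parse_cpu_max((base / "cpu.max").read_text())
        mem = parse_memory_max((base / "memory.max").read_text())
    except OSError:
        return None
    return cpu, mem


if __name__ == "__main__":
    # Run inside the runner or docker container.
    print(read_limits())
```

If the limits from the resource definition are applied, a container with a limit of cpu: "1" and memory: "2Gi" would be expected to show a CPU value of 1.0 and a memory value of 2147483648 here, regardless of what lscpu and free print.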
Describe the bug
My k8s cluster has 4 nodes, each with 20 CPUs and 64 GB of memory, so I should be able to run at least 10 runners at the same time. That is not the case: jobs are canceled for no apparent reason during the "Initialize container" step:
930778163b2d: Verifying Checksum
930778163b2d: Download complete
fc551ec0b9d5: Verifying Checksum
fc551ec0b9d5: Download complete
Error: The operation was canceled.
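For reference, the capacity estimate above can be worked through as a rough calculation. The Kubernetes scheduler places pods based on their requests, not their limits. The sketch below assumes that the container-level requests from the resource definition are the ones that apply (runner: 1 CPU and 2Gi, docker: 2 CPUs and 5Gi), that 64 GB is treated as 64Gi, and that each node's full capacity is allocatable, which ignores system reservations and any other pods. The function and constant names are illustrative only.

```python
def pods_per_node(node_cpu: int, node_mem_gi: int, pod_cpu: int, pod_mem_gi: int) -> int:
    """Number of pods that fit on one node, limited by the scarcer resource."""
    return min(node_cpu // pod_cpu, node_mem_gi // pod_mem_gi)


# Cluster size as described in this issue.
NODES = 4
NODE_CPU = 20
NODE_MEM_GI = 64  # assumption: 64 GB treated as 64Gi, fully allocatable

# Per-pod requests: runner container plus docker container.
POD_CPU_REQUEST = 1 + 2
POD_MEM_REQUEST_GI = 2 + 5

per_node = pods_per_node(NODE_CPU, NODE_MEM_GI, POD_CPU_REQUEST, POD_MEM_REQUEST_GI)
total = NODES * per_node

print(f"pods per node: {per_node}")      # pods per node: 6
print(f"pods in the cluster: {total}")   # pods in the cluster: 24
```

Under these assumptions CPU is the limiting resource, and the cluster fits 24 runner pods by requests. That covers minReplicas: 20 but not maxReplicas: 30, so runners beyond 24 would be expected to stay Pending.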
Describe the expected behavior
Each pod should show the CPU and memory that I specified in the resource definition file.
Whole Controller Logs
2023-09-01T15:21:40Z DEBUG runner Runner appears to have been registered and running. {"runner": "***-runners/***-runner-deployment-hjwsp-vcqt2", "podCreationTimestamp": "2023-09-01 15:21:36 +0000 UTC"}
2023-09-01T15:21:40Z DEBUG runner Runner appears to have been registered and running. {"runner": "***-runners/***-runner-deployment-hjwsp-66568", "podCreationTimestamp": "2023-09-01 15:21:36 +0000 UTC"}
2023-09-01T15:21:40Z DEBUG runner Runner appears to have been registered and running. {"runner": "***-runners/***-runner-deployment-hjwsp-cbbbn", "podCreationTimestamp": "2023-09-01 15:21:36 +0000 UTC"}
2023-09-01T15:21:40Z DEBUG runner Runner appears to have been registered and running. {"runner": "***-runners/***-runner-deployment-hjwsp-m27s8", "podCreationTimestamp": "2023-09-01 15:21:37 +0000 UTC"}
2023-09-01T15:21:40Z DEBUG runner Runner appears to have been registered and running. {"runner": "***-runners/***-runner-deployment-hjwsp-fw98k", "podCreationTimestamp": "2023-09-01 15:21:36 +0000 UTC"}
2023-09-01T15:21:41Z DEBUG runner Runner appears to have been registered and running. {"runner": "***-runners/***-runner-deployment-hjwsp-t6xs2", "podCreationTimestamp": "2023-09-01 15:21:36 +0000 UTC"}
2023-09-01T15:22:35Z DEBUG horizontalrunnerautoscaler Calculated desired replicas of 20 {"horizontalrunnerautoscaler": "***-runners/***-runner-deployment-autoscaler", "suggested": 20, "reserved": 0, "min": 20, "max": 30, "last_scale_up_time": "2023-09-01 15:21:24 +0000 UTC", "scale_down_delay_until": "2023-09-01T15:31:24Z"}
2023-09-01T15:23:38Z DEBUG horizontalrunnerautoscaler Calculated desired replicas of 20 {"horizontalrunnerautoscaler": "***-runners/***-runner-deployment-autoscaler", "suggested": 20, "reserved": 0, "min": 20, "max": 30, "last_scale_up_time": "2023-09-01 15:21:24 +0000 UTC", "scale_down_delay_until": "2023-09-01T15:31:24Z"}
2023-09-01T15:23:47Z INFO runner Removed finalizer {"runner": "***-runners/***-runner-deployment-hjwsp-vcqt2"}
2023-09-01T15:23:47Z DEBUG runnerreplicaset Created replica(s) {"runnerreplicaset": "***-runners/***-runner-deployment-hjwsp", "lastSyncTime": "2023-09-01T15:21:34Z", "effectiveTime": "<nil>", "templateHashDesired": "56f59b8797", "replicasDesired": 20, "replicasPending": 2, "replicasRunning": 17, "replicasMaybeRunning": 19, "templateHashObserved": ["56f59b8797"], "created": 1}
2023-09-01T15:23:47Z DEBUG runnerreplicaset Skipped reconcilation because owner is not synced yet {"runnerreplicaset": "***-runners/***-runner-deployment-hjwsp", "owner": "***-runners/***-runner-deployment-hjwsp-2fntb", "pods": null}
2023-09-01T15:23:47Z DEBUG runnerreplicaset Skipped reconcilation because owner is not synced yet {"runnerreplicaset": "***-runners/***-runner-deployment-hjwsp", "owner": "***-runners/***-runner-deployment-hjwsp-2fntb", "pods": null}
2023-09-01T15:23:47Z DEBUG runnerreplicaset Skipped reconcilation because owner is not synced yet {"runnerreplicaset": "***-runners/***-runner-deployment-hjwsp", "owner": "***-runners/***-runner-deployment-hjwsp-2fntb", "pods": null}
2023-09-01T15:23:47Z INFO runnerpod Runner pod has been stopped with a successful status. {"runnerpod": "***-runners/***-runner-deployment-hjwsp-vcqt2"}
2023-09-01T15:23:48Z INFO runner Updated registration token {"runner": "***-runner-deployment-hjwsp-2fntb", "repository": ""}
Note: I censored part of the pod names.
Whole Runner Pod Logs
n/a
Additional Context
I also tried setting the resources inside the containers, which did not change the behavior, for example:
....
containers:
  - name: runner
    resources:
      limits:
        cpu: "1"
        memory: "2Gi"
      requests:
        cpu: "1"
        memory: "2Gi"
    securityContext:
      privileged: true
    volumeMounts:
      - name: docker-secret
        mountPath: "/home/runner/.docker/"
        readOnly: true
  - name: docker
    resources:
      limits:
        cpu: "3"
        memory: "8Gi"
      requests:
        cpu: "2"
        memory: "5Gi"
    securityContext:
      privileged: true
...