actions-runner-controller
Incorrect reporting of histogram type metrics from listener pod
Checks
- [X] I've already read https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners-with-actions-runner-controller/troubleshooting-actions-runner-controller-errors and I'm sure my issue is not covered in the troubleshooting guide.
- [X] I am using charts that are officially provided
Controller Version
0.7.0
Deployment Method
Helm
Checks
- [X] This isn't a question or user support case (For Q&A and community support, go to Discussions).
- [X] I've read the Changelog before submitting this issue and I'm sure it's not due to any recently-introduced backward-incompatible changes
To Reproduce
- Install controller chart (see controller.yaml attached below)
NAMESPACE="arc-systems"
helm install arc \
  --namespace "${NAMESPACE}" \
  --create-namespace \
  -f controller.yaml \
  --version "0.7.0" \
  oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set-controller
- Install runner chart (see runner-set.yaml attached below)
INSTALLATION_NAME="arc-runner-set"
NAMESPACE="arc-runners"
helm install "${INSTALLATION_NAME}" \
  --namespace "${NAMESPACE}" \
  --create-namespace \
  -f runner-set.yaml \
  --version "0.7.0" \
  oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set
- Once the controller/listener/runner pods are in a ready state, schedule a job from GH and wait for completion
Here is the workflow code that was used for the test:
name: ARC test
on:
  workflow_dispatch:
    inputs:
      sleepTime:
        description: "Seconds to sleep for"
        default: 2
jobs:
  print:
    runs-on: "test-runner-set"
    container:
      image: docker.io/busybox:latest
    steps:
      - run: sleep ${{ github.event.inputs.sleepTime }}
Note: A potentially important detail is that the organisation I work at uses GitHub Enterprise Server (i.e. during testing, the job was sourced from an instance of GitHub Enterprise Server).
- Open another terminal window and port-forward to the metrics port of the listener
kubectl port-forward <listener-pod-name> <your-local-port>:8080
- Observe the emitted metrics
While the port-forward command is running, make an HTTP request to the metrics endpoint, e.g. with curl:
curl http://localhost:<your-local-port>/metrics
- If you schedule more jobs to this runner set (using the same workflow code) with varying execution times, you should notice that all metrics ending with _bucket get incremented regardless of the actual execution duration.
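To check for this symptom without reading every bucket line by hand, the scraped text can be scanned with a small script. The following is a minimal Python sketch and not part of ARC: the function name find_flat_histograms and the sample text are made up for illustration, and it assumes the le label comes last, as it does in the listener output shown in this issue.

```python
import re
from collections import defaultdict

# Matches sample lines such as: name{label="v",le="0.5"} 1
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)\{(.*)\}\s+(\S+)$')
# Matches the le label, including the comma that precedes it.
LE_RE = re.compile(r'(?:^|,)le="[^"]*"')


def find_flat_histograms(metrics_text):
    """Return (metric, labels) pairs where every bucket equals _count and _sum is 0."""
    buckets = defaultdict(list)
    sums = {}
    counts = {}
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # not a labelled sample line
        name, labels, value = match.group(1), match.group(2), float(match.group(3))
        if name.endswith("_bucket"):
            if LE_RE.search(labels) is None:
                continue
            # Group buckets by metric name and by the labels without le.
            key = (name[: -len("_bucket")], LE_RE.sub("", labels))
            buckets[key].append(value)
        elif name.endswith("_sum"):
            sums[(name[: -len("_sum")], labels)] = value
        elif name.endswith("_count"):
            counts[(name[: -len("_count")], labels)] = value

    flagged = []
    for key, values in buckets.items():
        count = counts.get(key)
        if count and sums.get(key) == 0.0 and all(v == count for v in values):
            flagged.append(key)
    return flagged


# Made-up, shortened sample in the same shape as the listener output.
SAMPLE = """\
# TYPE gha_job_startup_duration_seconds histogram
gha_job_startup_duration_seconds_bucket{event_name="workflow_dispatch",le="0.01"} 1
gha_job_startup_duration_seconds_bucket{event_name="workflow_dispatch",le="2"} 1
gha_job_startup_duration_seconds_bucket{event_name="workflow_dispatch",le="+Inf"} 1
gha_job_startup_duration_seconds_sum{event_name="workflow_dispatch"} 0
gha_job_startup_duration_seconds_count{event_name="workflow_dispatch"} 1
"""

if __name__ == "__main__":
    for metric, labels in find_flat_histograms(SAMPLE):
        print(metric, labels)
```

In practice the text passed to the function would be the body returned by the curl command above rather than the inline sample.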
Describe the bug
The listener pods output histograms with cumulative buckets, following the Prometheus exposition format. However, the values assigned to those buckets seem to be incorrect: every bucket is incremented regardless of the actual job execution/startup time, and the _sum stays at 0.
After the runner set completes a single job with an execution duration of 2 seconds, the listener pod metrics endpoint outputs the following:
# HELP gha_job_execution_duration_seconds Time spent executing workflow jobs by the scale set (in seconds).
# TYPE gha_job_execution_duration_seconds histogram
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="0.01"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="0.05"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="0.1"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="0.5"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="1"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="2"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="3"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="4"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="5"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="6"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="7"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="8"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="9"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="10"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="12"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="15"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="18"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="20"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="25"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="30"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="40"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="50"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="60"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="70"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="80"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="90"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="100"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="110"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="120"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="150"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="180"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="210"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="240"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="300"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="360"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="420"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="480"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="540"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="600"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="900"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="1200"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="1800"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="2400"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="3000"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="3600"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="+Inf"} 1
gha_job_execution_duration_seconds_sum{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository=""} 0
gha_job_execution_duration_seconds_count{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository=""} 1
The same applies to the startup duration histogram:
# HELP gha_job_startup_duration_seconds Time spent waiting for workflow job to get started on the runner owned by the scale set (in seconds).
# TYPE gha_job_startup_duration_seconds histogram
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="0.01"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="0.05"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="0.1"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="0.5"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="1"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="2"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="3"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="4"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="5"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="6"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="7"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="8"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="9"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="10"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="12"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="15"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="18"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="20"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="25"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="30"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="40"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="50"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="60"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="70"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="80"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="90"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="100"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="110"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="120"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="150"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="180"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="210"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="240"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="300"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="360"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="420"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="480"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="540"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="600"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="900"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="1200"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="1800"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="2400"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="3000"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="3600"} 1
gha_job_startup_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository="",le="+Inf"} 1
gha_job_startup_duration_seconds_sum{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository=""} 0
gha_job_startup_duration_seconds_count{enterprise="<your-enterprise>",event_name="workflow_dispatch",organization="",repository=""} 1
Describe the expected behavior
For each job, only the buckets whose upper bound (le) is greater than or equal to the job's execution/startup time should be incremented.
Example:
On a new listener pod, after a single job with an execution time of 2 seconds completes, the emitted bucket metrics should be as follows:
# HELP gha_job_execution_duration_seconds Time spent executing workflow jobs by the scale set (in seconds).
# TYPE gha_job_execution_duration_seconds histogram
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="0.01"} 0
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="0.05"} 0
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="0.1"} 0
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="0.5"} 0
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="1"} 0
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="2"} 1 <------ <Expected increment start>
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="3"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="4"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="5"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="6"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="7"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="8"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="9"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="10"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="12"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="15"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="18"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="20"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="25"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="30"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="40"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="50"} 1
gha_job_execution_duration_seconds_bucket{enterprise="<your-enterprise>",event_name="workflow_dispatch",job_name="print",job_result="succeeded",job_workflow_ref="<your-workflow-ref>",organization="",repository="",le="60"} 1
.... (redacted for brevity)
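The difference between the reported and the expected output can be reproduced with a few lines of Python. This is only a sketch of cumulative bucket counting as described by the Prometheus histogram format, not the listener's code; the function name cumulative_counts is hypothetical and BOUNDS is a shortened version of the bucket list above.

```python
import math

# Shortened list of upper bounds (le values); +Inf is always the last bucket.
BOUNDS = [0.01, 0.05, 0.1, 0.5, 1, 2, 3, 4, 5, math.inf]


def cumulative_counts(values, upper_bounds=BOUNDS):
    """Count, for each upper bound, how many observed values are <= that bound."""
    counts = [0] * len(upper_bounds)
    for value in values:
        for i, bound in enumerate(upper_bounds):
            if value <= bound:
                counts[i] += 1
    return counts


if __name__ == "__main__":
    # A 2-second job: buckets below le="2" stay at 0.
    print(cumulative_counts([2.0]))
    # An observation of 0 seconds: every bucket is incremented.
    print(cumulative_counts([0.0]))
```

An observation of 0 lands in every bucket and leaves the sum at 0, which matches the reported output and suggests that the listener is observing a duration of 0 rather than miscounting buckets.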
Additional Context
## controller.yaml
metrics:
  controllerManagerAddr: ":8080"
  listenerAddr: ":8080"
  listenerEndpoint: "/metrics"
## runner-set.yaml
runnerScaleSetName: test-runner-set
githubConfigUrl: "https://<your-github-host>/enterprises/<your-enterprise>"
githubConfigSecret:
  github_token: <gh-token>
minRunners: 1
maxRunners: 1
containerMode:
  type: "dind"
Controller Logs
https://gist.github.com/1MaxKoval/7c875cc4486810e444e5b23b22512802
P.S. You can also find the listener logs and the Prometheus endpoint output there.
Runner Pod Logs
https://gist.github.com/1MaxKoval/8b74b22d4689c5906ff290a436ced41b
Hello! Thank you for filing an issue.
The maintainers will triage your issue shortly.
In the meantime, please take a look at the troubleshooting guide for bug reports.
If this is a feature request, please review our contribution guidelines.
For context, I noted a similar issue (in my comment here)
- https://github.com/actions/actions-runner-controller/pull/3003#issuecomment-1846138389
I haven't yet confirmed whether what you're getting here is what I'm also seeing, or whether it explains the unexpectedly high cardinality I get on gha_job_execution_duration_seconds
🤷
Hey, are you by any chance running the enterprise server 3.9?
@nikola-jokic I'm still seeing this exact issue with the controller and scale set version 0.9.1 on GHE 3.10.3.
I dug into it a bit and I can't find these time fields being emitted anywhere: https://github.com/actions/actions-runner-controller/blob/a1b8e0cc3d280cfae73a4c1dc24dc49da371d1d1/github/actions/types.go#L66-L69
Are they supposed to be in the EphemeralRunnerStatus?
https://github.com/actions/actions-runner-controller/blob/a1b8e0cc3d280cfae73a4c1dc24dc49da371d1d1/apis/actions.github.com/v1alpha1/ephemeralrunner_types.go#L76-L124
The EphemeralRunnerStatus doesn't seem to get updated for the JobCompleted case.
https://github.com/actions/actions-runner-controller/blob/a1b8e0cc3d280cfae73a4c1dc24dc49da371d1d1/cmd/githubrunnerscalesetlistener/autoScalerService.go#L163-L193
I'm not familiar with the code base so I might be completely wrong though.
For context, we are migrating from runner deployments to autoscaling runner sets. These histogram metrics would be very helpful for replacing our custom run duration metrics computed from the GitHub APIs.
Hey @chotiwat,
Starting from 3.11, the histogram metrics are available. Otherwise, these fields are not communicated to the scale set, so they are always set to 0. I have created a docs PR to document this behavior.
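One way to picture this explanation: if neither end of the interval is delivered, both timestamps keep their default value and the computed duration is 0. The Python sketch below is illustrative only and is not ARC's code; UNSET, duration_seconds and the parameter names are hypothetical stand-ins for the time fields discussed above.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical stand-in for a timestamp that was never populated.
UNSET = datetime.fromtimestamp(0, tz=timezone.utc)


def duration_seconds(started_at=None, finished_at=None):
    """Seconds between two timestamps, treating missing ones as UNSET."""
    start = started_at or UNSET
    end = finished_at or UNSET
    return (end - start).total_seconds()


if __name__ == "__main__":
    start = datetime(2024, 1, 1, tzinfo=timezone.utc)
    # Both timestamps present: the real duration is observed.
    print(duration_seconds(start, start + timedelta(seconds=2)))
    # Neither timestamp delivered: the observed duration collapses to 0.
    print(duration_seconds())
```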
Thank you all for raising this issue. Docs updates are in so I will close it now :relaxed:. Sorry this hasn't been documented before :disappointed: