pipeline
pipeline copied to clipboard
add namespace label/tag to non-deprecated throttle metrics
Changes
Back when implementing https://github.com/tektoncd/pipeline/pull/6744 for https://github.com/tektoncd/pipeline/issues/6631 we failed to realize that with k8s quota policies being namespace scoped, knowing which namespace the throttled items were in could have some diagnostic value.
Now that we have been using the metric added for a bit, this realization is now very apparent.
This changes introduces the namespace tag. Also, since last touching this space, the original metric was deprecated and a new one with a shorter name was added. This change only updates the non-deprecated metric with the new label.
Lastly, the default behavior is preserved, and use of the new label only occurs when explicitly enabled in observability config map.
Fixes https://github.com/tektoncd/pipeline/issues/7878
/kind feature
Submitter Checklist
As the author of this PR, please check off the items in this checklist:
- [/ ] Has Docs if any changes are user facing, including updates to minimum requirements e.g. Kubernetes version bumps
- [/ ] Has Tests included if any functionality added or changed
- [ /] pre-commit Passed
- [ /] Follows the commit message standard
- [ /] Meets the Tekton contributor standards (including functionality, content, code)
- [ /] Has a kind label. You can add one by adding a comment on this PR that contains
/kind <type>. Valid types are bug, cleanup, design, documentation, feature, flake, misc, question, tep - [ /] Release notes block below has been updated with any user facing changes (API changes, bug fixes, changes requiring upgrade notices or deprecation warnings). See some examples of good release notes.
- [ /] Release notes contains the string "action required" if the change requires additional action from users switching to the new release
Release Notes
Add 'namespace' label/tag to the 'tekton_pipelines_controller_running_taskruns_throttled_by_quota' and 'tekton_pipelines_controller_running_taskruns_throttled_by_node' metrics, as kubernetes quota definitions are namespace scoped, hence certain namespaces may be more susceptible to quota throttling than others, and in a multi-node environment, not all namespaces are necessarily on the same node.
To enable this new label/tag, set 'metrics.taskrun.throttle.enable-namespace' to 'true' in the 'config-observability' ConfigMap
The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report
| File | Old Coverage | New Coverage | Delta |
|---|---|---|---|
| pkg/taskrunmetrics/metrics.go | 84.8% | 86.3% | 1.5 |
The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage-df to re-run this coverage report
| File | Old Coverage | New Coverage | Delta |
|---|---|---|---|
| pkg/taskrunmetrics/metrics.go | 84.8% | 86.3% | 1.5 |
The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report
| File | Old Coverage | New Coverage | Delta |
|---|---|---|---|
| pkg/taskrunmetrics/metrics.go | 84.8% | 85.5% | 0.7 |
The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage-df to re-run this coverage report
| File | Old Coverage | New Coverage | Delta |
|---|---|---|---|
| pkg/taskrunmetrics/metrics.go | 84.8% | 85.5% | 0.7 |
The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report
| File | Old Coverage | New Coverage | Delta |
|---|---|---|---|
| pkg/taskrunmetrics/metrics.go | 84.8% | 85.5% | 0.7 |
The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage-df to re-run this coverage report
| File | Old Coverage | New Coverage | Delta |
|---|---|---|---|
| pkg/taskrunmetrics/metrics.go | 84.8% | 85.5% | 0.7 |
ping @khrm - can you take a look when you get the chance?
also, see my question in the description wrt if adding labels to existing experimental metrics is "OK"
We can add this behind a flag also. Like we do for
count.reason
Ah you mean https://github.com/tektoncd/pipeline/blob/main/docs/metrics.md#configuring-metrics-using-config-observability-configmap
Before I make the change, how about an early review on the name .... say
metrics.taskrun.throttle.enable-namespace
for the new flag's name @khrm ?
Thanks for the pointer.
The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report
| File | Old Coverage | New Coverage | Delta |
|---|---|---|---|
| pkg/apis/config/metrics.go | 73.7% | 76.2% | 2.5 |
| pkg/taskrunmetrics/metrics.go | 84.8% | 85.7% | 1.0 |
The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage-df to re-run this coverage report
| File | Old Coverage | New Coverage | Delta |
|---|---|---|---|
| pkg/apis/config/metrics.go | 73.7% | 76.2% | 2.5 |
| pkg/taskrunmetrics/metrics.go | 84.8% | 85.7% | 1.0 |
ok @khrm new flag to enable the namespace label on the throttle metric is added ... PTAL
The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report
| File | Old Coverage | New Coverage | Delta |
|---|---|---|---|
| pkg/apis/config/metrics.go | 73.7% | 76.2% | 2.5 |
| pkg/taskrunmetrics/metrics.go | 84.8% | 85.7% | 1.0 |
The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage-df to re-run this coverage report
| File | Old Coverage | New Coverage | Delta |
|---|---|---|---|
| pkg/apis/config/metrics.go | 73.7% | 76.2% | 2.5 |
| pkg/taskrunmetrics/metrics.go | 84.8% | 85.7% | 1.0 |
Thanks @gabemontero /lgtm
likewise thanks @afrittoli
question to you and @khrm - any suggestion on who I should reach out to in order to get the approve label (minimally assuming @khrm you are fine with my updates from your comments) ?
Yes, this is fine.
The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report
| File | Old Coverage | New Coverage | Delta |
|---|---|---|---|
| pkg/apis/config/metrics.go | 73.7% | 76.2% | 2.5 |
| pkg/taskrunmetrics/metrics.go | 84.8% | 86.1% | 1.3 |
The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage-df to re-run this coverage report
| File | Old Coverage | New Coverage | Delta |
|---|---|---|---|
| pkg/apis/config/metrics.go | 73.7% | 76.2% | 2.5 |
| pkg/taskrunmetrics/metrics.go | 84.8% | 86.1% | 1.3 |
The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report
| File | Old Coverage | New Coverage | Delta |
|---|---|---|---|
| pkg/apis/config/metrics.go | 73.7% | 76.2% | 2.5 |
| pkg/taskrunmetrics/metrics.go | 84.8% | 86.1% | 1.3 |
The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage-df to re-run this coverage report
| File | Old Coverage | New Coverage | Delta |
|---|---|---|---|
| pkg/apis/config/metrics.go | 73.7% | 76.2% | 2.5 |
| pkg/taskrunmetrics/metrics.go | 84.8% | 86.1% | 1.3 |
unrelated flake on alpha integration
/test pull-tekton-pipeline-alpha-integration-tests
@afrittoli @khrm PTAL at the updates from @khrm 's last review - thanks
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: afrittoli, khrm
The full list of commands accepted by this bot can be found here.
The pull request process is described here
- ~~OWNERS~~ [afrittoli]
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
/lgtm