kubernetes-nmstate
handler: Expose nmstatectl stats as k8s metrics
Is this a BUG FIX or a FEATURE ?: /kind enhancement
What this PR does / why we need it: Now that nmstatectl can calculate some useful stats from the network configuration [1], we can bubble them up and expose them as k8s metrics so that k-nmstate users can dig into them using Prometheus, Grafana, or the like.
This is an example of an nmstate feature stat:
kubernetes_nmstate_features_applied{name="dhcpv4-custom-hostname"} 1
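To illustrate the shape of the sample above, here is a minimal Go sketch that renders a set of feature counts as Prometheus-style sample lines. It is not the handler code from this PR: `renderFeatureSamples` and its input map are hypothetical stand-ins for whatever `nmstatectl stats` reports, and label escaping is approximated with Go's `%q` verb.

```go
package main

import (
	"fmt"
	"sort"
)

// metricName matches the metric shown in the PR description.
const metricName = "kubernetes_nmstate_features_applied"

// renderFeatureSamples is a hypothetical helper: it turns feature counts
// (assumed to come from parsed `nmstatectl stats` output) into sample lines,
// sorted by feature name so the output is deterministic.
func renderFeatureSamples(features map[string]int) []string {
	names := make([]string, 0, len(features))
	for name := range features {
		names = append(names, name)
	}
	sort.Strings(names)

	lines := make([]string, 0, len(names))
	for _, name := range names {
		// %q quotes the label value; this is an approximation of the
		// Prometheus escaping rules, good enough for simple names.
		lines = append(lines, fmt.Sprintf("%s{name=%q} %d", metricName, name, features[name]))
	}
	return lines
}

func main() {
	for _, line := range renderFeatureSamples(map[string]int{"dhcpv4-custom-hostname": 1}) {
		fmt.Println(line)
	}
}
```

In a real handler the samples would presumably be registered through a Prometheus client library rather than formatted by hand.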
Depends on nmstate 2.2.20; it looks like it has been built but is still not present in CentOS Stream 9:
- https://kojihub.stream.rdu2.redhat.com/koji/buildinfo?buildID=41534
[1] https://github.com/nmstate/nmstate/pull/2420
TODO:
- [ ] Compare old and new NNCP stats to be able to decrease counters.
Release note:
Expose statistics generated from `nmstatectl stats`
@machadovilaca @avlitman please see if you can help in reviewing this PR.
@avlitman @machadovilaca @sradco can you take another look ?
@machadovilaca: changing LGTM is restricted to collaborators
@qinqon I suggest also adding a docs generator for the operator. I think it is really useful.
We have automation so that when a user opens a PR with a new metric, a test runs and checks whether the metric is already documented. If not, the user is asked to run `make generate`, which automatically updates the PR with the change to the metrics.md file, adding the new metric, its description, and its type.
See an example metrics.md file here: https://github.com/kubevirt/hyperconverged-cluster-operator/blob/main/docs/metrics.md. The docs generator is here: https://github.com/kubevirt/hyperconverged-cluster-operator/blob/main/tools/metricsdocs/metricsdocs.go (Note: we plan to move it to /monitoring/tools/)
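The documentation check described above could look roughly like the following Go sketch. It is not the linked `metricsdocs.go` generator: `undocumentedMetrics`, the sample document text, and the second metric name are hypothetical, and the check is a plain substring match.

```go
package main

import (
	"fmt"
	"strings"
)

// undocumentedMetrics is a hypothetical helper: it returns the metric names
// that do not appear anywhere in the documentation text, in input order.
func undocumentedMetrics(doc string, metricNames []string) []string {
	var missing []string
	for _, name := range metricNames {
		if !strings.Contains(doc, name) {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	// Hypothetical metrics.md content; the real file layout may differ.
	doc := "### kubernetes_nmstate_features_applied\nExample description. Type: Gauge.\n"
	// The second name is made up to show a failing check.
	names := []string{"kubernetes_nmstate_features_applied", "example_undocumented_metric"}

	if missing := undocumentedMetrics(doc, names); len(missing) > 0 {
		fmt.Println("undocumented metrics, run make generate:", strings.Join(missing, ", "))
	}
}
```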
@sradco introducing it adds a lot of Go dependencies to the project. I am not sure about it; maybe we can do this in a follow-up.
@sradco @machadovilaca I have converted the Counter to a Gauge, and it is decreased if a topology/feature is no longer in use. Can you take another look to see if everything is OK from a monitoring perspective?
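As a rough illustration of the Counter-to-Gauge change, the sketch below compares the features used before and after a policy update and adjusts a gauge accordingly. It is an assumption-laden sketch, not the code in this PR: `featureGauge` is a plain map standing in for a labelled Prometheus gauge, and `applyFeatureDiff`, `toSet`, and the `example-feature` name are hypothetical.

```go
package main

import "fmt"

// featureGauge is a stand-in for a labelled gauge keyed by feature name;
// a real handler would presumably use a Prometheus gauge vector (assumption).
type featureGauge map[string]int

// toSet converts a slice of feature names into a set for fast lookups.
func toSet(names []string) map[string]bool {
	set := make(map[string]bool, len(names))
	for _, name := range names {
		set[name] = true
	}
	return set
}

// applyFeatureDiff increments the gauge for features that newly appear and
// decrements it for features that are no longer in use.
func applyFeatureDiff(g featureGauge, oldFeatures, newFeatures []string) {
	oldSet, newSet := toSet(oldFeatures), toSet(newFeatures)
	for name := range newSet {
		if !oldSet[name] {
			g[name]++
		}
	}
	for name := range oldSet {
		if !newSet[name] {
			g[name]--
		}
	}
}

func main() {
	g := featureGauge{}
	// First apply: two features in use ("example-feature" is made up).
	applyFeatureDiff(g, nil, []string{"dhcpv4-custom-hostname", "example-feature"})
	// Policy update: the DHCP hostname feature is dropped.
	applyFeatureDiff(g, []string{"dhcpv4-custom-hostname", "example-feature"}, []string{"example-feature"})
	fmt.Println(g["dhcpv4-custom-hostname"], g["example-feature"])
}
```

A Counter can only grow, which is why a Gauge is needed once a feature can stop being used.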
We are going to stick with features for now since they have limited bounds; we will investigate options for topology.
@qinqon Can you please add an example to the PR description of the end metric with labels?
/retest
/retest
Trying to pull registry.access.redhat.com/ubi9/ubi-minimal:latest...
Error: creating build container: copying system image from manifest list: determining manifest MIME type for docker://registry.access.redhat.com/ubi9/ubi-minimal:latest: reading manifest sha256:119ac25920c8bb50c8b5fd75dcbca369bf7d1f702b82f3d39663307890f0bf26 in registry.access.redhat.com/ubi9/ubi-minimal: received unexpected HTTP status: 502 Bad Gateway
make[1]: *** [Makefile:168: push-operator] Error 125
make[1]: Lea
@sradco can you take another look? I think I have covered all the comments.
/hold
nmstate from the "base" branch is failing a metrics test in the "future" lane.
/retest
Locally, with NMSTATE_PIN=future, everything is fine.
/retest
/hold cancel
Now it is ready.
/retest
/lgtm
/approve
/retest
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: qinqon
The full list of commands accepted by this bot can be found here.
The pull request process is described here
- ~~OWNERS~~ [qinqon]
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
/retest