kubernetes-nmstate icon indicating copy to clipboard operation
kubernetes-nmstate copied to clipboard

handler: Expose nmstatectl stats as k8s metrics

Open qinqon opened this issue 1 year ago • 15 comments

Is this a BUG FIX or a FEATURE ?: /kind enhancement

What this PR does / why we need it: Now that nmstatectl is able to calculate some useful stats from network configuration [1], we can bubble them up and expose them as k8s metrics so k-nmstate users can digg on them using prometheus, graphana or the like.

This is an example of nmstate feature stat

kubernetes_nmstate_features_applied{name="dhcpv4-custom-hostname"} 1

Depends on nmstate 2.2.20, looks like it's build but still not present at centos 9 stream

  • https://kojihub.stream.rdu2.redhat.com/koji/buildinfo?buildID=41534

[1] https://github.com/nmstate/nmstate/pull/2420

TODO:

  • [ ] Compare old and new nncp stats to be able to decrease conunters.

Release note:

Expoxe statistics generated from `nmstatectl stats`

qinqon avatar Dec 11 '23 09:12 qinqon

@machadovilaca @avlitman please see if you can help in reviewing this PR.

sradco avatar Dec 17 '23 10:12 sradco

@avlitman @machadovilaca @sradco can you take another look ?

qinqon avatar Dec 21 '23 12:12 qinqon

@machadovilaca: changing LGTM is restricted to collaborators

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

kubevirt-bot avatar Dec 21 '23 13:12 kubevirt-bot

@qinqon I propose you to also add docs generator for the operator. I think this is really useful.

We have the automation, so that when the user adds a PR with a new metric, the test runs and checks if the metric is already documented. If not the user is asked to run make generate and this automatically updated the PR with the change to the metrics.md file with the new metric, description and type.

See an example to the metrics.md file here https://github.com/kubevirt/hyperconverged-cluster-operator/blob/main/docs/metrics.md and the docs generator is here https://github.com/kubevirt/hyperconverged-cluster-operator/blob/main/tools/metricsdocs/metricsdocs.go (Note: we plan to move it to /monitoring/tools/ )

sradco avatar Dec 21 '23 14:12 sradco

@qinqon I propose you to also add docs generator for the operator. I think this is really useful.

We have the automation, so that when the user adds a PR with a new metric, the test runs and checks if the metric is already documented. If not the user is asked to run make generate and this automatically updated the PR with the change to the metrics.md file with the new metric, description and type.

See an example to the metrics.md file here https://github.com/kubevirt/hyperconverged-cluster-operator/blob/main/docs/metrics.md and the docs generator is here https://github.com/kubevirt/hyperconverged-cluster-operator/blob/main/tools/metricsdocs/metricsdocs.go (Note: we plan to move it to /monitoring/tools/ )

@sradco introducing it, add a lot of golang dependencies to the project, I am not sure about it, maybe we can do this at follow up.

qinqon avatar Jan 02 '24 13:01 qinqon

@sradco @machadovilaca I have convert the Counter to Guague and decrease if topology/feature is no longer in use, can you take another look to see if everything is ok from a monitoring perspective ?

qinqon avatar Mar 19 '24 08:03 qinqon

We are going to steak with features for now since it has a limited bounds, we will investigate options for topology

qinqon avatar Apr 03 '24 07:04 qinqon

@qinqon Can you please add an example to the PR description of the end metric with labels?

sradco avatar Apr 04 '24 06:04 sradco

/retest

qinqon avatar Apr 04 '24 09:04 qinqon

/retest

Trying to pull registry.access.redhat.com/ubi9/ubi-minimal:latest...
Error: creating build container: copying system image from manifest list: determining manifest MIME type for docker://registry.access.redhat.com/ubi9/ubi-minimal:latest: reading manifest sha256:119ac25920c8bb50c8b5fd75dcbca369bf7d1f702b82f3d39663307890f0bf26 in registry.access.redhat.com/ubi9/ubi-minimal: received unexpected HTTP status: 502 Bad Gateway
make[1]: *** [Makefile:168: push-operator] Error 125
make[1]: Lea

qinqon avatar Apr 04 '24 09:04 qinqon

@sradco can you take another look ? I think I have cover all the comments.

qinqon avatar Apr 04 '24 10:04 qinqon

/hold

nmstate from "base" branch is failing a metrics tests at the "future" lane.

qinqon avatar Apr 05 '24 05:04 qinqon

/retest

locally with NMSTATE_PIN=future everything is fine.

qinqon avatar Apr 05 '24 10:04 qinqon

/retest

qinqon avatar Apr 09 '24 09:04 qinqon

/hold cancel

Now is ready

qinqon avatar Apr 16 '24 06:04 qinqon

/retest

qinqon avatar Apr 17 '24 06:04 qinqon

/lgtm

mkowalski avatar Apr 17 '24 08:04 mkowalski

/approve

qinqon avatar Apr 17 '24 09:04 qinqon

/retest

qinqon avatar Apr 17 '24 09:04 qinqon

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: qinqon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

kubevirt-bot avatar Apr 17 '24 09:04 kubevirt-bot

/retest

qinqon avatar Apr 17 '24 13:04 qinqon