container-linux-update-operator
operator: expose metrics
update-operator is a long-running Go process that supervises complex cluster-wide operations. As such, it should expose metrics about its status that Prometheus can scrape and alert on. Access to this endpoint should be governed by Kubernetes RBAC policies.
This is a preliminary list of interesting metrics:
- go runtime stats
- nodes being managed by CLUO
- nodes in `reboot-needed` state
- nodes in `before-reboot` state
- nodes in `after-reboot` state
- optional "before" and "after" checks state
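
As a rough illustration of the list above, here is a minimal sketch of a metrics endpoint that reports managed nodes, nodes per state, and one Go runtime stat in the Prometheus text exposition format. It uses only the Go standard library to stay self-contained; a real implementation would more likely use the Prometheus Go client library. The metric names (`cluo_managed_nodes`, `cluo_nodes_in_state`, `cluo_goroutines`), the `snapshot` callback, and the sample counts are all assumptions made for illustration, and neither RBAC-based access control nor the "before"/"after" checks are covered.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"runtime"
	"strings"
)

// nodeStates lists the states named in this issue, in a fixed order so the
// rendered output is stable.
var nodeStates = []string{"reboot-needed", "before-reboot", "after-reboot"}

// renderNodeMetrics formats node counts in the Prometheus text exposition
// format. The metric names are hypothetical, not an agreed naming scheme.
func renderNodeMetrics(managed int, byState map[string]int) string {
	var b strings.Builder
	b.WriteString("# TYPE cluo_managed_nodes gauge\n")
	fmt.Fprintf(&b, "cluo_managed_nodes %d\n", managed)
	b.WriteString("# TYPE cluo_nodes_in_state gauge\n")
	for _, s := range nodeStates {
		// States missing from the map are reported as 0.
		fmt.Fprintf(&b, "cluo_nodes_in_state{state=%q} %d\n", s, byState[s])
	}
	return b.String()
}

// metricsHandler serves the node metrics plus one Go runtime stat.
// snapshot is a hypothetical callback returning the operator's current view.
func metricsHandler(snapshot func() (int, map[string]int)) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		managed, byState := snapshot()
		w.Header().Set("Content-Type", "text/plain; version=0.0.4")
		io.WriteString(w, renderNodeMetrics(managed, byState))
		fmt.Fprintf(w, "# TYPE cluo_goroutines gauge\ncluo_goroutines %d\n",
			runtime.NumGoroutine())
	})
}

func main() {
	// Made-up counts standing in for real cluster state.
	snapshot := func() (int, map[string]int) {
		return 5, map[string]int{"reboot-needed": 2, "before-reboot": 1}
	}

	// Serve the handler on a throwaway local server and scrape it once.
	srv := httptest.NewServer(metricsHandler(snapshot))
	defer srv.Close()

	resp, err := http.Get(srv.URL + "/metrics")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	fmt.Print(string(body))
}
```

Using a single gauge with a `state` label, rather than one metric per state, keeps the set of metric names small and lets alert rules select a state by label.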