kube-state-metrics
Automate generation of Custom Resource configuration
What would you like to be added:
As a developer, I would like the kube-state-metrics Custom Resource configuration to be automatically generated from the API code.
Thanks to @sbueringer and @fabriziopandini for helping drive this effort 👍
Why is this needed:
kube-state-metrics requires a configuration file for Custom Resources which can grow into a very large file and may be hard to maintain over the long term alongside changing Custom Resource definitions.
Instead of writing the configuration file manually, it would be great if the configuration could be generated from the code where the Custom Resource is defined.
Describe the solution you'd like
Prior art:
Kubebuilder makes use of a tool called controller-gen to generate the YAML for e.g. Custom Resource Definitions from code.
To do that it makes heavy use of markers inside the Go code, which are comments in the form // +foo:bar:key1=value1,key2=value2.
Similar to controller-gen, kube-state-metrics could provide an additional tool to generate the Custom Resource metrics configuration file from pre-defined markers at the Go code level.
Additional context
We already thought about adding a tool like this at the Cluster API project, but we think it would be a better fit for the kube-state-metrics project, as it could help authors of any custom resource and is not specific to Cluster API.
We also have an idea of how the markers / design could look, which builds on top of the currently manually written configuration in Cluster API: https://github.com/kubernetes-sigs/cluster-api/issues/7158#issuecomment-1317701277 . I could also migrate the first implementation/design idea over to this issue, or proceed however it is decided.
If the maintainers of kube-state-metrics also think the described idea is worth implementing, I'd of course be happy to volunteer to help or contribute to this effort.
Small example:
// +Metrics:namePrefix=capi_cluster
// +Metrics:labelFromPath:name=name,JSONPath=".metadata.name"
// +Metrics:labelFromPath:name=namespace,JSONPath=".metadata.namespace"
// +Metrics:labelFromPath:name=uid,JSONPath=".metadata.uid"
type Cluster struct {
    metav1.ObjectMeta `json:"metadata,omitempty"`
    ...
    Spec ClusterSpec `json:"spec"`
}

type ClusterSpec struct {
    // +Metrics:gauge:name="spec_paused",nilIsZero=true,help="Whether the cluster is paused and any of its resources will not be processed by the controllers."
    Paused bool `json:"paused,omitempty"`
}
This could result in the following configuration file:
kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind:
        group: cluster.x-k8s.io
        kind: Cluster
        version: v1beta1
      labelsFromPath:
        name:
          - metadata
          - name
        namespace:
          - metadata
          - namespace
        uid:
          - metadata
          - uid
      metricNamePrefix: capi_cluster
      metrics:
        - name: spec_paused
          help: Whether the cluster is paused and any of its resources will not be processed by the controllers.
          each:
            gauge:
              nilIsZero: true
              path:
                - spec
                - paused
            type: Gauge
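For illustration: with this configuration, kube-state-metrics would expose a gauge whose name combines the metricNamePrefix with the metric name and whose labels are taken from the configured paths. For a hypothetical Cluster named my-cluster in the default namespace, the exposed series would look roughly like capi_cluster_spec_paused{name="my-cluster",namespace="default",uid="..."} 0 (the exact label set may also include the labels kube-state-metrics adds by default).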
Generating configurations this way is a good idea.
However, does it require changing the code to add these annotations? Another idea might be generating configurations using a k8s client.
However, does it require changing the code to add these annotations?
Yes. This would follow the pattern that controller-runtime/controller-tools uses to generate CRDs (see: https://book.kubebuilder.io/reference/markers/crd.html). Essentially we would add markers for metrics in addition to the ones we already have for the CRD generation.
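For illustration, a minimal sketch (assuming Cluster API-style types) of how the proposed metrics markers could sit alongside the kubebuilder markers that controller-gen already uses. The +kubebuilder markers are real; the +Metrics markers follow the proposal in this issue and are not implemented yet:

// Cluster is the Schema for the clusters API.
// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
// +Metrics:namePrefix=capi_cluster
// +Metrics:labelFromPath:name=name,JSONPath=".metadata.name"
type Cluster struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`

    Spec   ClusterSpec   `json:"spec,omitempty"`   // ClusterSpec defined elsewhere in the package
    Status ClusterStatus `json:"status,omitempty"` // ClusterStatus defined elsewhere in the package
}

controller-gen would keep generating the CRD from the +kubebuilder markers, while the proposed tool would generate the metrics configuration from the +Metrics markers.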
Another idea might be generating configurations using a k8s client.
How would this work? (which k8s client do you mean?)
However, does it require changing the code to add these annotations?
This will be an opt-in alternative to writing this file manually; and as described above, it follows a well-established pattern in Kubernetes API development (some of those annotations originate from the Kubernetes API itself; kubebuilder added more on top).
This will be an opt-in alternative to writing this file manually.
Got it. Maybe we can keep the kube-state-metrics API (crd-config.yaml) the same as before, and this tool helps generate/merge CRD configurations?
The tricky cases I considered before are:
- we want annotation-based CRD metrics, but it's hard to change the code
- we don't want annotation-based CRD metrics, but it's hard to change the code
For kubernetes/kubernetes code, changing the code requires rebuilding the binary. For other OSS components, sometimes we just use them and want to monitor them without changing the code.
Maybe we can keep the kube-state-metrics API (crd-config.yaml) the same as before, and this tool helps generate/merge CRD configurations?
Yup absolutely, that is the idea :)
I think for the cases where the code cannot be adjusted, it's probably easiest to write the metrics configuration manually.
What we are trying to solve is essentially that when you are in control of the CRDs and corresponding Go types, you can directly mark the metrics on the fields and then you don't have to write the config. But I think if you can't modify the CRD's Go types, it's hard to find an easier way to get the config than just writing it manually.
essentially that when you are in control of the CRDs and corresponding Go types, you can directly mark the metrics on the fields and then you don't have to write the config.
Agree.
Once this tool becomes popular in the future, we need to consider the case where users don't want annotation-based CRD metrics but it's hard to change the code. For example, a lot of OSS CRDs may carry these annotations.
kube-state-metrics should have the ability to control which metrics are collected, instead of relying solely on CRD annotations.
Agreed, this is a tool and it just increases the number of available options for kube-state-metrics users:
1. manually write crd-config.yaml (as of today), pass it to kube-state-metrics
2. annotate the CRD, use the tool to generate crd-config.yaml, pass it to kube-state-metrics (see the sketch below)
3. annotate the CRD, use the tool to generate crd-config.yaml + manually make some adjustments, pass it to kube-state-metrics
4. probably more
But from a kube-state-metrics perspective nothing will change; everything will still start from one (or potentially more) crd-config.yaml.
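To make option 2 concrete, a rough sketch of the workflow. The metrics-gen command and its arguments are purely hypothetical placeholders for the proposed generator; only kube-state-metrics' existing --custom-resource-state-config-file flag is real:

# Hypothetical: scan the API packages for +Metrics markers and write the
# CustomResourceStateMetrics configuration to a file.
metrics-gen paths=./api/... output:dir=./config/metrics

# Point kube-state-metrics at the generated configuration (existing flag).
kube-state-metrics --custom-resource-state-config-file=./config/metrics/crd-config.yaml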
We would like to use kube-state-metrics with Crossplane, where we create dozens of CRDs.
It would be great if kube-state-metrics could provide a CustomResourceStateMetrics CRD. Whenever an object of that kind is created, it would generate a new crd-config.yaml and reload the metrics.
With such a CRD we could define, for every one of our Crossplane components, a CustomResourceStateMetrics configuration saying how to produce metrics.
I would also love to have a CRD for defining the configuration. However, IMHO this should be seen as a separate topic, and I'd prefer to use a separate issue to track it. This kind of feature was already mentioned multiple times in Slack too 🙂
We should keep the issue here scoped to "generating the input" for kube-state-metrics.
If the "read configuration from CRD" feature exists: the here proposed generator should get adjusted/improved to (also?) allow creating the CR's.
Edit: I will go forward and create a separate issue for this :-) Edit2: link to seperate issue:
- https://github.com/kubernetes/kube-state-metrics/issues/1948
I'd like to propose the following UX for the generator (kudos to @sbueringer and @fabriziopandini, who helped brainstorm and compile this).
We took the metrics in Cluster API and went through the example metrics to try to catch all use cases.
Note:
- all markers must have a prefix which is still TBD; we're using Metrics for now (it could also be something like ksm or kube-state-metrics)
- all markers must comply with https://book.kubebuilder.io/reference/markers.html#marker-syntax
Metrics:namePrefix
// +Metrics:namePrefix=<string> on API type struct
Defines the metricNamePrefix for all metrics derived from the struct the markers apply to.
e.g.
// +Metrics:namePrefix=capi_cluster
type Cluster struct { ... }
Metrics:labelFromPath
// +Metrics:labelFromPath:name=<string>,JSONPath=<string> on API type struct
Defines a label that applies to all metrics derived from the struct the markers apply to.
e.g.
// +Metrics:labelFromPath:name=name,JSONPath=".metadata.name"
// +Metrics:labelFromPath:name=namespace,JSONPath=".metadata.namespace"
// +Metrics:labelFromPath:name=uid,JSONPath=".metadata.uid"
type Cluster struct { ... }
Metrics:gauge
// +Metrics:gauge:name=<string>,help=<string>,nilIsZero=<bool>,JSONPath:<string>,labelFromPath={map[<string>]<string>} on field
When applied to an API field it creates a metric of type gauge for the field.
- name=<string>: the name of the metric
- help=<string>: the help string of the metric
- nilIsZero=<bool>: optional; force the metric to count nil values as zero
- JSONPath:<string>: optional; in case the field is a complex type, this allows creating metrics for nested fields given their path
- labelsFromPath={map[<string>]<string>}: optional; allows adding labels whose values are read from the given path (. can be used as the current path)
e.g.
// +Metrics:gauge:name="spec_paused",help="Whether the cluster is paused and any of its resources will not be processed by the controllers.",nilIsZero=true
Paused bool `json:"paused,omitempty"`
// +Metrics:gauge:name="created",JSONPath='.creationTimestamp",help="Unix creation timestamp."
metav1.ObjectMeta `json:"metadata,omitempty"`
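For illustration, a hedged sketch of what the second gauge marker above (the one placed on ObjectMeta with a JSONPath) could generate, following the existing CustomResourceStateMetrics gauge schema; the exact shape depends on the final generator design:

metrics:
  - name: created
    help: Unix creation timestamp.
    each:
      type: Gauge
      gauge:
        path:
          - metadata
          - creationTimestamp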
Metrics:stateset
// +Metrics:stateset:name=<string>, help=<string>, labelName=<string>, list=[]<string>, JSONPath:<string>, labelFromPath={map[<string>]<string>} on field
When applied to an API field it creates a metric of type stateset for the field.
- name=<string>: the name of the metric
- help=<string>: the help string of the metric
- labelName=<string>: the name of the label for the stateset
- list=[]<string>: the list of values for the stateset
- JSONPath:<string>: optional; in case the field is a complex type, this allows creating metrics for nested fields given their path
- labelsFromPath={map[<string>]<string>}: optional; allows adding labels whose values are read from the given path (. can be used as the current path)
e.g.
// +Metrics:stateset:name="status_phase", help="The clusters current phase.", labelName=phase, list={"Pending", "Provisioning", "Provisioned", "Deleting", "Failed", "Unknown"}
Phase string `json:"phase,omitempty"`
// +Metrics:stateset:name="status_condition", help="The condition of a cluster.", labelName="status", JSONPath: ".status", list={"True", "False", "Unknown"}, labelsFromPath={type: ".type"}
Conditions Conditions `json:"conditions,omitempty"`
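For illustration, a hedged sketch of what the first stateset marker above could generate, assuming the Phase field lives under .status and following the existing CustomResourceStateMetrics schema; the exact shape depends on the final generator design:

metrics:
  - name: status_phase
    help: The clusters current phase.
    each:
      type: StateSet
      stateSet:
        labelName: phase
        path:
          - status
          - phase
        list:
          - Pending
          - Provisioning
          - Provisioned
          - Deleting
          - Failed
          - Unknown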
Metrics:info
// +Metrics:info:name=<string>,help=<string>,JSONPath:<string>, labelFromPath={map[<string>]<string>} on field or struct
When applied to an API field it creates a metric of type info for the field.
- name=<string>: the name of the metric
- help=<string>: the help string of the metric
- JSONPath:<string>: optional when the marker is applied to a field, required when the marker applies to a struct; this allows creating metrics for nested fields given their path
- labelsFromPath={map[<string>]<string>}: optional; allows adding labels whose values are read from the given path (. can be used as the current path)
e.g.
// +Metrics:info:name="info",help="Information about a cluster.",labelsFromPath={topology_version: ".spec.topology.version", topology_class: ".spec.topology.class"}
// Cluster is the Schema for the clusters API.
type Cluster struct {
// +Metrics:info:name="annotation_paused", JSONPath='.annotations.['cluster\\.x-k8s.io/paused']", help="Whether the cluster is paused and any of its resources will not be processed by the controllers.", labelsFromPath={paused_value: "."}
metav1.ObjectMeta `json:"metadata,omitempty"`
}
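For illustration, a hedged sketch of what the first info marker above could generate, again following the existing CustomResourceStateMetrics schema; the exact shape depends on the final generator design:

metrics:
  - name: info
    help: Information about a cluster.
    each:
      type: Info
      info:
        labelsFromPath:
          topology_version:
            - spec
            - topology
            - version
          topology_class:
            - spec
            - topology
            - class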
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
/remove-lifecycle rotten
AFAIK there is a PR for this.