kube-state-metrics Simplify custom resource metrics API by leveraging jq/CEL

What would you like to be added:

Doc: https://github.com/kubernetes/kube-state-metrics/pull/2059

A simplified API for CustomResourceStateMetrics that supports only values and labels, instead of each, path, labelFromKey, labelsFromPath, valueFrom, commonLabels and *.

# new 
kind: CustomResourceStateMetricsV2
spec:
  resources:
    - groupVersionKind:
        group: myteam.io
        kind: "Foo"
        version: "v1"
      metrics:
        - name: "ready_count"
          help: "Number of Foo Bars ready"
          values: jq '[.status.sub[].ready]' # valueFrom: [ready] // [2,4]
          labels:
          - jq '[ .status.sub | keys | .[] | {type: .}]' # labelFromKey: type // [{"type": "type-a"}, {"type": "type-b"}]
          - jq '[{ custom_metric: "yes" }]' # custom_metric: "yes" // [{"custom_metric": "yes"}]
          - jq '[.metadata.labels]' # "*": [metadata, labels] // [{"bar": "baz","qux": "quxx"}]
          - jq '[.metadata.annotations]' # "**": [metadata, annotations] // [{"foo": "bar"}]
          - jq '[{ name: .metadata.name }]' # name: [metadata, name] // [{"name": "foo"}]
          - jq '[{ foo: .metadata.labels.foo }]' # foo: [metadata, labels, foo] // [{"foo": "bar"}]
          - jq '[.status.sub[].active | {active: .}]' # labelsFromPath:  active: [active] // [{"active": 1}, {"active": 3}]
          
# old
      metrics:
        - name: "status_phase"
          help: "Foo status_phase"
          each:
            type: StateSet
            stateSet:
              labelName: phase
              path: [status, phase]
              list: [Pending, Bar, Baz]

kube_customresource_ready_count{customresource_group="myteam.io", customresource_kind="Foo", customresource_version="v1", active="1",custom_metric="yes",foo="bar",name="foo",bar="baz",qux="quxx",type="type-a"} 2
kube_customresource_ready_count{customresource_group="myteam.io", customresource_kind="Foo", customresource_version="v1", active="3",custom_metric="yes",foo="bar",name="foo",bar="baz",qux="quxx",type="type-b"} 4
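How the `values` and `labels` results turn into the two series above can be sketched as follows. This is only a minimal sketch: it assumes that a label list with a single element applies to every value and that a list with as many elements as there are values is matched by index. The proposal does not spell this rule out, and the function name `expand_series` is hypothetical.

```python
from typing import Dict, List


def expand_series(
    values: List[float],
    label_lists: List[List[Dict[str, object]]],
) -> List[Dict[str, object]]:
    """Pair each value with the merged labels for its index.

    Assumption (not stated in the proposal): a one-element label list is
    broadcast to every value; an N-element list is matched by index.
    """
    series = []
    for i, value in enumerate(values):
        labels: Dict[str, str] = {}
        for label_list in label_lists:
            if len(label_list) == 1:
                entry = label_list[0]
            elif len(label_list) == len(values):
                entry = label_list[i]
            else:
                raise ValueError(
                    "label list length must be 1 or match the number of values"
                )
            # Prometheus label values are strings, so stringify everything.
            labels.update({k: str(v) for k, v in entry.items()})
        series.append({"labels": labels, "value": value})
    return series


# Hard-coded stand-ins for the query results, chosen to match the
# label sets in the two output lines above.
values = [2, 4]
label_lists = [
    [{"type": "type-a"}, {"type": "type-b"}],
    [{"custom_metric": "yes"}],
    [{"bar": "baz", "qux": "quxx"}],
    [{"name": "foo"}],
    [{"foo": "bar"}],
    [{"active": 1}, {"active": 3}],
]
series = expand_series(values, label_lists)
for s in series:
    print(s["labels"], s["value"])
```

Under this assumption, a label list whose length is neither 1 nor the number of values is rejected rather than silently truncated, which is one possible way to avoid the corner cases the current API runs into.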

Why is this needed:

Simpler and easier for the KSM community to maintain. There have been several issues around corner cases with custom resource metrics (https://github.com/kubernetes/kube-state-metrics/issues/1992).
Easier for users to use and debug custom resource metrics.

Describe the solution you'd like

Additional context

Han recommended CEL:

https://kubernetes.io/docs/reference/using-api/cel/
https://github.com/google/cel-go

Feb 07 '23 03:02 CatherineF-dev

I like the idea of cleaning up the configuration. When doing so, we should take care that we can still address all the use cases that are covered today.

IMHO, if we introduce a new version of the configuration, we should do it in a way that allows auto-conversion from the old configuration, using a custom config type as in https://book.kubebuilder.io/component-config-tutorial/config-type.html

Also a related issue: #1948.

Feb 07 '23 11:02 chrischdi

/triage accepted
/assign @CatherineF-dev

Feb 09 '23 17:02 logicalhan

@CatherineF-dev could you perhaps start a design doc highlighting the different options we have to improve the UX of the existing API?

Feb 09 '23 18:02 dgrisonnet

Okay!

Feb 09 '23 18:02 CatherineF-dev

Verified that k8s objects can be converted into YAML: https://github.com/kubernetes/kube-state-metrics/compare/main...CatherineF-dev:kube-state-metrics:cr-metrics-2?expand=1

Apr 13 '23 13:04 CatherineF-dev

Existing problems for KSM custom resource

1. Custom resource API is complicated and not flexible

Currently, it supports 7 operations: each, path, labelFromKey, labelsFromPath, valueFrom, commonLabels and *.

This makes it hard to use and has led to several corner-case issues:

- I am using this custom resource document, but I couldn't find a clear yes or no there. I want to capture replsets.<all-the-elements>.size. Is this possible?
- "LabelFromKey" not available #1868
- Crash on nonexistent metric paths in custom resources #1992
- Can't aggregate metrics across multiple CRs. For example, it can't answer "How many CRs are there under one CRD?"

Existing proposals:

- PR 2014 proposes a metric generation tool to create configurations from monitored CRDs.
- Issue 1978 proposes to simplify the API from 7 operations to 2 operations using jq (a JSONPath-like query language). It can support some aggregations, for example to answer the question "How many CRs are there under one CRD?"
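To make the aggregation question concrete, here is a minimal sketch of counting CRs per group/version/kind over a list of already-fetched objects. It is an illustration only, not how kube-state-metrics is implemented; the helper name `count_by_gvk` and the sample objects are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def count_by_gvk(objects: Iterable[Dict]) -> Counter:
    """Count custom resources per (group, version, kind).

    Each object is a plain dict shaped like a Kubernetes manifest.
    """
    counts: Counter = Counter()
    for obj in objects:
        # apiVersion is "<group>/<version>", or just "<version>" for the core group.
        group, _, version = obj.get("apiVersion", "").rpartition("/")
        counts[(group, version, obj.get("kind", ""))] += 1
    return counts


# Hypothetical sample data.
objects = [
    {"apiVersion": "myteam.io/v1", "kind": "Foo", "metadata": {"name": "a"}},
    {"apiVersion": "myteam.io/v1", "kind": "Foo", "metadata": {"name": "b"}},
    {"apiVersion": "myteam.io/v1", "kind": "Bar", "metadata": {"name": "c"}},
]
counts = count_by_gvk(objects)
print(counts[("myteam.io", "v1", "Foo")])
```

The point of the sketch is that this kind of metric needs to see all objects of a kind at once, which a per-object path lookup cannot express.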

2. Coupled monitoring pipeline and monitoring target

You need to modify the kube-state-metrics agent's own configuration whenever you want to monitor a custom resource:

--custom-resource-state-config "inline yaml (see example)" 
--custom-resource-state-config-file /path/to/config.yaml

Existing proposals: Issue 1948 proposes to support CustomResourceDefinition CRD.

Proposal

Support both existing proposals: Issue 1948, which proposes to support CustomResourceDefinition CRD, and Issue 1978, which proposes to simplify the API from 7 operations to 2 operations.

Apr 13 '23 14:04 CatherineF-dev

cc @dgrisonnet,

What else do I need to add into https://github.com/kubernetes/kube-state-metrics/issues/1978#issuecomment-1507076370? Thx!

Apr 13 '23 14:04 CatherineF-dev

Feel free to open a PR adding your design doc under docs/design and we can review it from there

Apr 13 '23 15:04 dgrisonnet

Would this be capable of parsing annotation values as json? In some cases, we have controllers that save state on an object as json annotations. It would be nice to expose fields within those json objects.
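As an illustration of what is being asked for, the following minimal sketch pulls one field out of a JSON-encoded annotation. It only shows the idea in plain Python; it says nothing about how jq or CEL would express it. The annotation key `myteam.io/state`, the field names and the helper `annotation_field` are all hypothetical.

```python
import json
from typing import Any, Dict, Optional


def annotation_field(obj: Dict[str, Any], annotation: str, field: str) -> Optional[Any]:
    """Return one top-level field from a JSON-encoded annotation.

    Returns None if the annotation is missing, is not valid JSON,
    or does not decode to a JSON object.
    """
    raw = obj.get("metadata", {}).get("annotations", {}).get(annotation)
    if raw is None:
        return None
    try:
        parsed = json.loads(raw)
    except (TypeError, json.JSONDecodeError):
        return None
    if not isinstance(parsed, dict):
        return None
    return parsed.get(field)


# Hypothetical object whose controller stores state as a JSON annotation.
foo = {
    "metadata": {
        "name": "foo",
        "annotations": {"myteam.io/state": '{"phase": "Ready", "retries": 3}'},
    }
}
retries = annotation_field(foo, "myteam.io/state", "retries")
print(retries)
```

Returning None on malformed input, rather than raising, mirrors the concern raised elsewhere in this thread about crashes on nonexistent metric paths.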

Apr 19 '23 23:04 nathanperkins

One point that came to mind and that we should consider if this gets done: performance!

Apr 25 '23 09:04 chrischdi

I feel like we should use CEL since that's the direction that Kubernetes is moving.

Apr 25 '23 20:04 logicalhan

Replying to @nathanperkins: I think CEL can parse annotation values as JSON, so it should be feasible.

The design doc is here: Simplify custom resource state metrics API using CEL (https://github.com/kubernetes/kube-state-metrics/pull/2059)

It can also support counting the number of CRs under one CRD, or anything else that can be queried using CEL.

May 08 '23 14:05 CatherineF-dev

I found that we can reuse some code from the Custom Resource Field Selectors KEP:

https://github.com/kubernetes/enhancements/blob/b3f29fe1223ebf09858ad3289dbfe3f652dd6069/keps/sig-api-machinery/4358-custom-resource-field-selectors/README.md

Jan 16 '24 18:01 CatherineF-dev
