kube-state-metrics

Resources prefixed with ‘kubernetes.io/’ are not collected

Open leason00 opened this issue 2 years ago • 12 comments

What happened:

My pod has:

limits:
  kubernetes.io/batch-cpu: 2k
  kubernetes.io/batch-memory: 4Gi
requests:
  kubernetes.io/batch-cpu: "100"
  kubernetes.io/batch-memory: 819Mi

But these resources are not collected.

The reason is that any name containing ‘kubernetes.io/’ is considered to be a native resource: https://github.com/kubernetes/kube-state-metrics/blob/main/internal/store/utils.go#L155

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • kube-state-metrics version: 2.7.0
  • Kubernetes version (use kubectl version):
  • Cloud provider or hardware configuration:
  • Other info:

leason00 avatar Dec 30 '22 09:12 leason00

/triage accepted /assign @dgrisonnet

logicalhan avatar Jan 12 '23 17:01 logicalhan

The kubernetes.io prefix is reserved for native resource names. The code that you pointed to is code that we copied directly from Kubernetes: https://github.com/kubernetes/kubernetes/blob/4c4d4ad0a4aea4d015561ae3e7d48e8aaf609277/pkg/apis/core/v1/helper/helpers.go#L31-L46. Today it is used for things such as validation of your Pod resources: https://github.com/kubernetes/kubernetes/blob/4c4d4ad0a4aea4d015561ae3e7d48e8aaf609277/pkg/apis/core/v1/validation/validation.go#L81-L83.

Could you give a bit more information about where kubernetes.io/batch-cpu is coming from? As far as I can see, it doesn't seem to be native to Kubernetes.
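For readers following along, here is a minimal, standard-library-only sketch of the kind of check being discussed. It is a simplified approximation of the linked helpers, not a copy of them: the function names and the constant are assumptions made for illustration, and the real code operates on `v1.ResourceName` values.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultNamespacePrefix is the reserved prefix discussed in this thread.
// (Illustrative constant; the real helpers use a constant from the core API package.)
const defaultNamespacePrefix = "kubernetes.io/"

// isPrefixedNativeResource reports whether the name contains the reserved prefix.
func isPrefixedNativeResource(name string) bool {
	return strings.Contains(name, defaultNamespacePrefix)
}

// isNativeResource treats two kinds of names as native:
// unqualified names such as "cpu" (no "/"), and names carrying the reserved prefix.
func isNativeResource(name string) bool {
	return !strings.Contains(name, "/") || isPrefixedNativeResource(name)
}

func main() {
	names := []string{"cpu", "kubernetes.io/batch-cpu", "koordinator.sh/batch-cpu"}
	for _, name := range names {
		fmt.Printf("%-26s native=%t\n", name, isNativeResource(name))
	}
}
```

Under this sketch, kubernetes.io/batch-cpu is classified as native purely because of its prefix, while the same resource under a vendor domain such as koordinator.sh/ is not.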

dgrisonnet avatar Jan 16 '23 10:01 dgrisonnet

It comes from this project: https://koordinator.sh/zh-Hans/docs/user-manuals/colocation-profile.

leason00 avatar Jan 16 '23 10:01 leason00

Looking at the example:

$ kubectl get pod test-pod -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    koordinator.sh/intercepted: "true"
  labels:
    koordinator.sh/qosClass: BE
    koordinator.sh/priority: "1000"
    koordinator.sh/mutated: "true"
  ...
spec:
  terminationGracePeriodSeconds: 30
  priority: 5000
  priorityClassName: koord-batch
  schedulerName: koord-scheduler
  containers:
  - name: app
    image: nginx:1.15.1
    resources:
      limits:
        kubernetes.io/batch-cpu: "1000"
        kubernetes.io/batch-memory: 3456Mi
      requests:
        kubernetes.io/batch-cpu: "1000"
        kubernetes.io/batch-memory: 3456Mi

I would expect them to rename kubernetes.io/batch-cpu to koordinator.sh/batch-cpu since it is not a native Kubernetes resource.

I don't know how these resources are getting validated today in Kubernetes: based on the link I shared above, the resource should be rejected because it doesn't follow the extended resource name standard. Maybe validation was simply skipped upon creation? Either way, since this is not standard for Kubernetes, I don't think we should support it.

dgrisonnet avatar Jan 16 '23 12:01 dgrisonnet

Hi @dgrisonnet, regardless of the specific resource name kubernetes.io/batch-cpu, does kube-state-metrics support native resources starting with the kubernetes.io prefix?

eahydra avatar Jan 16 '23 12:01 eahydra

Hi, we don't support any as of right now since they are excluded from the defaulting logic.

I don't know the specifics behind that choice, but I would expect the original reasoning to be that since native resources are known ahead of time, we can apply the correct unit to the metric instead of just defaulting to an integer. That said I don't think the kubernetes.io prefix is being used by any native resources today so I would be inclined to just include it in the defaulting mechanism.
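To make the "defaulting mechanism" concrete, here is a hedged sketch of how a unit could be chosen per resource name. It is not the kube-state-metrics implementation: `unitFor`, `isNative`, the `includePrefixed` switch, the unit strings, and the short list of known resources are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// isNative is a simplified stand-in for the helper copied from Kubernetes:
// unqualified names and names containing "kubernetes.io/" count as native.
func isNative(name string) bool {
	return !strings.Contains(name, "/") || strings.Contains(name, "kubernetes.io/")
}

// unitFor returns the unit a metric would carry and whether a metric is emitted at all.
// includePrefixed models the hypothetical change of letting kubernetes.io/-prefixed
// names fall back to the integer default.
func unitFor(name string, includePrefixed bool) (string, bool) {
	// Known native resources get an explicit unit.
	switch name {
	case "cpu":
		return "core", true
	case "memory", "storage", "ephemeral-storage":
		return "byte", true
	}
	// Non-native (extended) resources default to an integer unit.
	if !isNative(name) {
		return "integer", true
	}
	// Hypothetical extension: also default prefixed names to integer.
	if includePrefixed && strings.HasPrefix(name, "kubernetes.io/") {
		return "integer", true
	}
	// Everything else matches no branch, so no metric is produced.
	return "", false
}

func main() {
	for _, include := range []bool{false, true} {
		unit, ok := unitFor("kubernetes.io/batch-cpu", include)
		fmt.Printf("includePrefixed=%t emitted=%t unit=%q\n", include, ok, unit)
	}
}
```

With `includePrefixed` set to false, the prefixed name matches no branch and is silently dropped, which mirrors the behavior reported in this issue. Setting it to true shows the effect of including the prefix in the default.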

dgrisonnet avatar Jan 19 '23 09:01 dgrisonnet

We are expecting you to "include it in the defaulting mechanism."

fengyehong avatar Feb 08 '23 05:02 fengyehong

Adding another data point, we ran into this recently. Our infrastructure has multiple nodes defined with kubernetes.io/network-bandwidth as an allocatable resource type. However, these resources do not show up via kube_node_status_allocatable, kube_node_status_capacity, kube_pod_container_resource_limits, and kube_pod_container_resource_requests.

sblumenthal avatar Mar 15 '23 13:03 sblumenthal

we are expecting "include it in the defaulting mechanism."

Contributions are welcomed 🙂

dgrisonnet avatar Mar 16 '23 12:03 dgrisonnet

@dgrisonnet If possible, I would like to submit a PR to support native resources.

eahydra avatar Oct 17 '23 07:10 eahydra

Ah, @fengyehong has submitted PR #2032.

eahydra avatar Oct 18 '23 01:10 eahydra