[Feature] Support the label node-role.kubernetes.io/CUSTOM to allow standard tools like K9s and kube-state-metrics to correctly assign the node role

Open abossard opened this issue 1 year ago • 14 comments

Related issue in the Kube State Metrics project: https://github.com/kubernetes/kube-state-metrics/issues/2474

Is your feature request related to a problem? Please describe. Kube State Metrics relies on node-role.kubernetes.io labels to determine which role a node has. These are official labels from the K8s project. Because AKS does not allow setting them, monitoring tools that rely on Kube State Metrics cannot populate the node role attributes. OpenShift and other platforms allow setting custom node roles as well, which nicely integrates them into all tools that derive roles from Kube State Metrics.

E.g. node with GPU can have: node-role.kubernetes.io/GPU=true
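As a rough illustration of the convention (not the actual Kube State Metrics code), a tool can derive role names from the part of the label key that follows `node-role.kubernetes.io/`. The function name `node_roles` and the sample labels are made up for this sketch.

```python
NODE_ROLE_PREFIX = "node-role.kubernetes.io/"


def node_roles(labels: dict[str, str]) -> list[str]:
    """Return the role names encoded in node-role.kubernetes.io/<role> keys.

    Only the key matters for the role name; the label value is ignored here.
    """
    return sorted(
        key[len(NODE_ROLE_PREFIX):]
        for key in labels
        if key.startswith(NODE_ROLE_PREFIX) and len(key) > len(NODE_ROLE_PREFIX)
    )


# Hypothetical labels of a GPU node.
gpu_node_labels = {
    "node-role.kubernetes.io/GPU": "true",
    "kubernetes.azure.com/mode": "user",
}
print(node_roles(gpu_node_labels))
```

A node without any such label yields an empty list, which is why role-based dashboards stay empty on AKS today.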

It's preferable to have these standardized labels, since it makes it easier to migrate from and to different K8s providers.

Also Kube State Metrics doesn't allow customization on where to get the node role from.

Describe the solution you'd like I would like AKS to allow setting custom node-role.kubernetes.io/* labels, as long as they don't interfere with existing ones.

Describe alternatives you've considered An alternative would be to change Kube State Metrics so that, e.g., it could read the node-role labels from a different label namespace. But since the node-role labels are well documented in K8s, I think it's preferable if AKS adds the labels.

Additional context See the attached issue for some more details https://github.com/kubernetes/kube-state-metrics/issues/2474

abossard avatar Nov 06 '24 08:11 abossard

@abossard any news?

R-Studio avatar Dec 10 '24 07:12 R-Studio

It's ridiculous that in AKS I can have multiple pools but cannot assign this label. On top of that, the error message is totally confusing, because the prefix I used is not even mentioned in it: "The kubernetes.io/ and k8s.io/ prefixes are reserved for Kubernetes core components." It is thrown even if kubernetes.io appears anywhere in the label key (someone is confusing contains with startsWith).
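To make the contains-versus-startsWith point concrete, here is a small sketch comparing three possible checks. It is an assumption about how such validation might be written, not the actual AKS code; all function names are hypothetical.

```python
RESERVED_DOMAINS = ("kubernetes.io", "k8s.io")


def label_prefix(key: str) -> str:
    """Return the optional DNS-subdomain prefix of a label key ('' if none)."""
    prefix, sep, _name = key.rpartition("/")
    return prefix if sep else ""


def flagged_by_contains(key: str) -> bool:
    # Substring check: fires whenever the domain appears anywhere in the key.
    return any(domain in key for domain in RESERVED_DOMAINS)


def flagged_by_startswith(key: str) -> bool:
    # Literal reading of the message: only "kubernetes.io/..." and "k8s.io/...".
    return any(key.startswith(domain + "/") for domain in RESERVED_DOMAINS)


def in_reserved_namespace(key: str) -> bool:
    # Namespace check: the prefix is the domain itself or a subdomain of it.
    prefix = label_prefix(key)
    return any(
        prefix == domain or prefix.endswith("." + domain)
        for domain in RESERVED_DOMAINS
    )


for key in (
    "kubernetes.io/hostname",
    "node-role.kubernetes.io/gpu",
    "example.com/kubernetes.io-mirror",
):
    print(key, flagged_by_contains(key), flagged_by_startswith(key),
          in_reserved_namespace(key))
```

The third key shows the difference: a substring check rejects it even though its prefix, `example.com`, is unrelated to the reserved domains.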

voyvodov avatar Jan 09 '25 08:01 voyvodov

We created the Microsoft case in August 2024, and this issue has been open for more than 3 months and we have not yet received any feedback on how we can resolve it or when we can expect a solution.

R-Studio avatar Jan 27 '25 13:01 R-Studio

Hi all, we have added support for customers setting and unsetting labels with the prefix node-role.kubernetes.io/*. Please give it a try and let us know if you run into any issues. Thanks!

Xinyue-Wang avatar Mar 14 '25 20:03 Xinyue-Wang

Thank you! I'll give it a try. Cheers

abossard avatar Mar 14 '25 21:03 abossard

@Xinyue-Wang & @abossard unfortunately it does not work in the Azure Web-UI: The kubernetes.io/ and k8s.io/ prefixes are reserved for Kubernetes core components.

R-Studio avatar Mar 18 '25 15:03 R-Studio

@R-Studio This is unsupported by design, following the official Kubernetes documentation. Please check the first sentence in https://kubernetes.io/docs/reference/labels-annotations-taints/

Kubernetes reserves all labels, annotations and taints in the kubernetes.io and k8s.io namespaces.

This issue is specifically about unblocking customers from setting the label node-role.kubernetes.io/

Xinyue-Wang avatar Mar 18 '25 20:03 Xinyue-Wang

Also, a kind reminder: please don't apply the label node-role.kubernetes.io/ through the AKS API, as this may cause new nodes with this label to fail to join the cluster. This label can only be applied after a node joins the cluster. Kubernetes does not allow the kubelet to assign node-role.kubernetes.io/* labels, so nodes cannot self-identify.
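The kubelet restriction can be sketched roughly as follows. The allow-lists below are an assumption based on the description of the kubelet's `--node-labels` flag and should be checked against the documentation for your Kubernetes version; the function name is hypothetical.

```python
RESERVED_DOMAINS = ("kubernetes.io", "k8s.io")

# Assumed allow-lists, taken from the kubelet --node-labels flag description.
ALLOWED_PREFIXES = ("kubelet.kubernetes.io", "node.kubernetes.io")
ALLOWED_KEYS = frozenset({
    "beta.kubernetes.io/arch",
    "beta.kubernetes.io/instance-type",
    "beta.kubernetes.io/os",
    "failure-domain.beta.kubernetes.io/region",
    "failure-domain.beta.kubernetes.io/zone",
    "kubernetes.io/arch",
    "kubernetes.io/hostname",
    "kubernetes.io/os",
    "node.kubernetes.io/instance-type",
    "topology.kubernetes.io/region",
    "topology.kubernetes.io/zone",
})


def _within(namespace: str, domain: str) -> bool:
    """True if namespace is the domain itself or one of its subdomains."""
    return namespace == domain or namespace.endswith("." + domain)


def kubelet_may_self_label(key: str) -> bool:
    """Approximate whether a kubelet could register itself with this label key."""
    namespace, sep, _name = key.rpartition("/")
    if not sep:
        return True  # unprefixed keys are outside the reserved namespaces
    if not any(_within(namespace, d) for d in RESERVED_DOMAINS):
        return True  # e.g. kubernetes.azure.com/... is not reserved
    if key in ALLOWED_KEYS:
        return True
    return any(_within(namespace, p) for p in ALLOWED_PREFIXES)


print(kubelet_may_self_label("node-role.kubernetes.io/gpu"))
print(kubelet_may_self_label("kubernetes.azure.com/mode"))
```

Under these assumptions a node-role key is rejected at registration time, which matches the join failure described above.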

Xinyue-Wang avatar Mar 18 '25 20:03 Xinyue-Wang

@Xinyue-Wang still doesn't work either through UI or AKS API.

Also, regarding the statement "Kubernetes reserves all labels, annotations and taints in the kubernetes.io and k8s.io namespaces.": this doesn't mean they cannot be used. The same page clearly states:

This optional label is applied to a node when you want to mark a node role. The node role (text following / in the label key) can be set, as long as the overall key follows the [syntax](https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#syntax-and-character-set) rules for object labels.

So there are no restrictions on using this one, at least.

voyvodov avatar Mar 19 '25 06:03 voyvodov

Also, a kind reminder: please don't apply the label node-role.kubernetes.io/ through the AKS API, as this may cause new nodes with this label to fail to join the cluster. This label can only be applied after a node joins the cluster. Kubernetes does not allow the kubelet to assign node-role.kubernetes.io/* labels, so nodes cannot self-identify.

So if we cannot set it through the AKS API, how should we automate this?

R-Studio avatar Mar 19 '25 07:03 R-Studio

@voyvodov There is no restriction on users applying the label node-role.kubernetes.io/ through kubectl or a kube client.

@R-Studio I would suggest using a controller component to apply labels through kubectl or a kube client after nodes come up. An example: https://github.com/vlasov-y/node-role-labeler/tree/main
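A minimal sketch of such a labeler, assuming nodes carry the `kubernetes.azure.com/agentpool` label and that `kubectl` is configured for the cluster. The pool-to-role mapping and all function names are hypothetical, and this is not how the linked project is implemented.

```python
import json
import subprocess

ROLE_PREFIX = "node-role.kubernetes.io/"
POOL_LABEL = "kubernetes.azure.com/agentpool"  # assumption: present on AKS nodes


def missing_role_labels(node: dict, pool_roles: dict[str, str]) -> dict[str, str]:
    """Return the role label a node should have but does not have yet."""
    labels = node.get("metadata", {}).get("labels") or {}
    role = pool_roles.get(labels.get(POOL_LABEL, ""))
    if role is None:
        return {}  # pool not covered by the mapping
    key = ROLE_PREFIX + role
    return {} if key in labels else {key: ""}


def label_command(node_name: str, labels: dict[str, str]) -> list[str]:
    """Build the `kubectl label node` argv for the given labels."""
    pairs = [f"{key}={value}" for key, value in sorted(labels.items())]
    return ["kubectl", "label", "node", node_name, *pairs]


def reconcile_once(pool_roles: dict[str, str], dry_run: bool = True) -> list[list[str]]:
    """List all nodes once and label those that lack their role label."""
    result = subprocess.run(
        ["kubectl", "get", "nodes", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    commands = []
    for node in json.loads(result.stdout)["items"]:
        wanted = missing_role_labels(node, pool_roles)
        if wanted:
            command = label_command(node["metadata"]["name"], wanted)
            commands.append(command)
            if not dry_run:
                subprocess.run(command, check=True)
    return commands
```

Calling `reconcile_once({"gpupool": "GPU"})` periodically, for example from a CronJob, would label nodes shortly after they join. A real controller would watch node events instead of polling.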

The request to set the role via AKS is noted and under discussion.

Xinyue-Wang avatar Mar 19 '25 08:03 Xinyue-Wang

@Xinyue-Wang the problem is not so much with kubectl or kubeclient, but with doing so via the AKS API/UI.

The other point I cannot agree with is that applying the label through the API would affect bootstrap, since AKS doesn't set user-defined labels via --node-labels=. That is done by a controller, which runs after the node has bootstrapped and joined the cluster.

So I don't see a real reason why node-role.kubernetes.io/ cannot be set via the API and then applied by AKS to the whole pool.

voyvodov avatar Mar 19 '25 09:03 voyvodov

@voyvodov AKS does not add or update labels via a controller. The label/taint feature of AKS guarantees that labels and taints are present at the moment a node joins the cluster.

  1. If you create the nodepool with this label from the AKS API, new nodes won't be able to join the cluster.
  2. If you update an existing nodepool with this label from the AKS API, existing nodes could be updated with this label without problems. But if the nodepool is ever scaled up or upgraded, new nodes (or nodes after the upgrade) won't be able to join the cluster. Hence we block setting this label through AKS with the current feature.

The request to set the role via AKS is noted and under discussion. We are looking into a solution from upstream.

Xinyue-Wang avatar Mar 19 '25 17:03 Xinyue-Wang

@Xinyue-Wang The request to apply set role via AKS is noted and under discussion. -> When can we expect an answer?

R-Studio avatar Mar 24 '25 15:03 R-Studio

@Xinyue-Wang any updates regarding this issue? IMO this seems to be a common and legitimate use-case, that shouldn't pose any technical difficulties.

lucatr avatar May 06 '25 18:05 lucatr

We are exploring an upstream approach to support applying the node-role.kubernetes.io/* label through the AKS API. We will update if possible.

Meanwhile, there is no restriction on users applying the node-role.kubernetes.io/* label on their own via kubectl or another kube client. As I explained earlier, this label can only be applied after a node joins the cluster. Kubernetes does not allow the kubelet to assign node-role.kubernetes.io/* labels, so nodes cannot self-identify.

Xinyue-Wang avatar May 06 '25 18:05 Xinyue-Wang

@Xinyue-Wang this issue is not fixed at all. We're still not able to set this label during provisioning of the node group. This is a basic requirement. Setting it on the node afterwards using kubectl is not an option, since this doesn't work with e.g. node-autoscaler or general automation. Therefore I suggest reopening this issue.

Workaround: We were able to implement a workaround using Kyverno.

```yaml
# The {{`...`}} wrappers escape Kyverno variables for Helm templating;
# drop them if you apply this manifest directly.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: label-node-role
  annotations:
    policies.kyverno.io/title: Label Nodes with System Role
    policies.kyverno.io/category: Other
    policies.kyverno.io/severity: medium
    policies.kyverno.io/subject: Node, Label
    policies.kyverno.io/description: >-
      This policy labels AKS nodes with the system or user role label, depending on the node's mode.
      This is used to determine which nodes are part of which role.
spec:
  rules:
    - name: label-node-system
      skipBackgroundRequests: true
      match:
        any:
        - resources:
            kinds:
            - Node
      preconditions:
        all:
        - key: "{{`{{request.object.metadata.labels.\"kubernetes.azure.com/mode\"}}`}}"
          operator: Equals
          value: "system"
      mutate:
        mutateExistingOnPolicyUpdate: true
        targets:
        - apiVersion: v1
          kind: Node
          name: "{{`{{ request.object.metadata.name }}`}}"
        patchStrategicMerge:
          metadata:
            labels:
              node-role.kubernetes.io/system: ""
    - name: label-node-user
      skipBackgroundRequests: false
      match:
        any:
        - resources:
            kinds:
            - Node
      preconditions:
        all:
        - key: "{{`{{request.object.metadata.labels.\"kubernetes.azure.com/mode\"}}`}}"
          operator: Equals
          value: "user"
      mutate:
        mutateExistingOnPolicyUpdate: true
        targets:
        - apiVersion: v1
          kind: Node
          name: "{{`{{ request.object.metadata.name }}`}}"
        patchStrategicMerge:
          metadata:
            labels:
              node-role.kubernetes.io/user: ""
  admission: true
  background: true
  emitWarning: false
```

labs7 avatar Jun 06 '25 07:06 labs7

Sometimes AKS is a joke

R-Studio avatar Aug 28 '25 06:08 R-Studio

@labs7 @R-Studio having nodes register with this label is exactly what the upstream k8s project decided to block for security reasons. kubeadm and all other installers are therefore unable to bring up nodes that self-register like that. https://github.com/kubernetes/kubernetes/issues/84912#issuecomment-551362981

The only thing we could do is allow everyone to set that role/label manually, which is now possible. When you provide labels as part of the AKS API, they are passed to the kubelet, the node comes up with them, and AKS keeps reconciling them. These specific labels are now rejected by the kubelet. Even if we provided a new label API to add them via an API call after the cluster is provisioned, they could still only be added after the node and cluster are already provisioned (in the same way as if you added them manually).

palma21 avatar Aug 28 '25 18:08 palma21