
Ensuring Control Plane Labels/Taints Persist Through Node Replacement

Open silvery1622 opened this issue 6 months ago • 3 comments

What would you like to be added (User Story)?

CAPI should ensure control plane labels and taints are reapplied to replacement nodes after node deletion

Detailed Description

Control plane nodes have a set of labels and taints that kubeadm applies when the node is created.

If a control plane node is later deleted for any reason and then recreated (e.g. by restarting the kubelet), those labels and taints must be reapplied to the replacement node.
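As a rough illustration of the check this would involve, here is a minimal Go sketch that compares a node against the metadata kubeadm is expected to apply and reports what is missing. The `Node` and `Taint` types are simplified stand-ins rather than the real Kubernetes API types, `missingMetadata` is a hypothetical helper, and the expected set below (the `node-role.kubernetes.io/control-plane` label and `NoSchedule` taint) is an assumption based on recent kubeadm defaults and may not be exhaustive.

```go
package main

import "fmt"

// Taint is a simplified stand-in for a node taint (assumption: only key and effect matter here).
type Taint struct {
	Key    string
	Effect string
}

// Node is a simplified stand-in for a Kubernetes Node object.
type Node struct {
	Name   string
	Labels map[string]string
	Taints []Taint
}

// Assumed kubeadm defaults for control plane nodes; not an exhaustive list.
var (
	wantLabels = map[string]string{"node-role.kubernetes.io/control-plane": ""}
	wantTaints = []Taint{{Key: "node-role.kubernetes.io/control-plane", Effect: "NoSchedule"}}
)

// missingMetadata (hypothetical helper) returns the expected labels and
// taints that are absent from the given node.
func missingMetadata(n Node) (map[string]string, []Taint) {
	labels := map[string]string{}
	for k, v := range wantLabels {
		if got, ok := n.Labels[k]; !ok || got != v {
			labels[k] = v
		}
	}
	var taints []Taint
	for _, want := range wantTaints {
		found := false
		for _, have := range n.Taints {
			if have == want {
				found = true
				break
			}
		}
		if !found {
			taints = append(taints, want)
		}
	}
	return labels, taints
}

func main() {
	// A node that the kubelet re-registered without kubeadm's metadata.
	recreated := Node{Name: "cp-0", Labels: map[string]string{"kubernetes.io/hostname": "cp-0"}}
	labels, taints := missingMetadata(recreated)
	fmt.Printf("missing labels: %v\nmissing taints: %v\n", labels, taints)
}
```

Which component would run such a check (kubeadm, KCP, or a remediation flow) is exactly the open question in this issue; the sketch only shows the comparison itself.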

Anything else you would like to add?

No response

Label(s) to be applied

/kind feature

One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels.

silvery1622 avatar May 20 '25 09:05 silvery1622

I'm wondering if we are fixing the issue in the right layer.

Wouldn't this issue also happen if you are using plain kubeadm without CAPI? If yes, the fix should probably be discussed in kubeadm first.

fabriziopandini avatar May 20 '25 09:05 fabriziopandini

i don't know what is the right fix here, but if kubeadm reset is called and if a node is then deleted, the labels and taints would presumably persist in a config file on disk. so if kubeadm join is called with the config, a new node will be created with the old labels and taints.

neolit123 avatar May 20 '25 10:05 neolit123

We also have to figure out how this fits into the label & taint propagation story

sbueringer avatar Jul 25 '25 14:07 sbueringer

We need a deeper discussion about this.

There are also bootstrap errors that can lead to similar problems, and in this case we should probably not try to fix issues from CAPI/KCP. But fixing user errors after bootstrap is also tricky; maybe remediation is a better strategy...

/triage accepted
/priority important-longterm

/help To drive the initial research work (too early for implementing something at this stage)

fabriziopandini avatar Sep 17 '25 12:09 fabriziopandini

@fabriziopandini: This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

  • Why are we solving this issue?
  • To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
  • How can the assignee reach out to you for help?

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed by commenting with the /remove-help command.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot avatar Sep 17 '25 12:09 k8s-ci-robot