Proposal feature: Adding label to node after applying the job

Open Suse-KevinKlinger opened this issue 2 years ago • 3 comments

This is a proposal for a new feature. It would allow you to add a label to a node after it has been updated via your Plans.

Idea: this way you could label nodes with something like a patch level to distinguish already updated nodes from those still pending. Why would one need this? Imagine you want to add nodes to your cluster and keep all nodes on the same patch level. You could then set up Plans that exclude already patched nodes via the nodeSelector and apply the updates only to the newly joined nodes (see the sketch below).
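
A minimal sketch of what such a Plan might look like. The label key `example.com/patch-level`, the plan name, version, and image are all illustrative assumptions, not SUC defaults:

```
cat <<'EOF' | kubectl apply -f -
# Sketch only: a Plan that targets nodes NOT yet carrying a hypothetical
# patch-level label. Label key/value, plan name, and image are
# illustrative assumptions.
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: patch-new-nodes
  namespace: system-upgrade
spec:
  concurrency: 1
  serviceAccountName: system-upgrade
  version: "15sp3"
  nodeSelector:
    matchExpressions:
      # NotIn also matches nodes that lack the key entirely,
      # i.e. freshly joined, never-patched nodes.
      - key: example.com/patch-level
        operator: NotIn
        values: ["15sp3"]
  upgrade:
    image: registry.example.com/os-upgrade  # hypothetical upgrade image
EOF
```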

Suse-KevinKlinger avatar Aug 19 '21 16:08 Suse-KevinKlinger

@Suse-KevinKlinger thank you for starting this conversation with your contribution!

I think that SUC already satisfies this example use-case:

My working assumption for the example use-case that you described is that the "patch level" would be codified in the plan's resolved version, so something like v1.2.3 :arrow_right: v1.2.4 or possibly 15sp2 :arrow_right: 15sp3 (for SLES). That value is already digested when constructing the "latest version" hash, which SUC labels a node with on successful application of an upgrade job. When selecting candidate nodes for which upgrade jobs will be generated, informed primarily by the plan's node selector, SUC also excludes already-upgraded nodes via this label, e.g. `plan.upgrade.cattle.io/${plan.name} != ${plan.status.latestVersion}`. In other words, if the underlying version changes then the candidate pool will automatically grow to include any node that is not labeled with `plan.upgrade.cattle.io/${plan.name} = ${plan.status.latestVersion}` (assuming that it also meets the plan's nodeSelector criteria).
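
As a quick illustration of that exclusion, you can list the nodes SUC would still treat as candidates with a label selector; `example-plan` and the hash value `abc123` here are placeholders:

```
# Nodes still eligible for the plan: not labeled with the plan's current
# latestVersion hash. Note that != also matches nodes without the label.
kubectl get nodes -l 'plan.upgrade.cattle.io/example-plan!=abc123'

# Show each node's recorded hash for the plan as a column.
kubectl get nodes -L plan.upgrade.cattle.io/example-plan
```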

Happy to entertain this further if I am missing the importance of the change for you!

dweomer avatar Aug 20 '21 20:08 dweomer

@dweomer thank you for the explanations. I have to admit that I was not aware that SUC is already capable of handling the given scenario.

But after thinking a while, I'd say labeling nodes after they've been patched may still be helpful for viewing the state at a glance. Users could then see the "patch level" of their nodes just by checking their labels.
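
For example, with a hypothetical label key `example.com/patch-level`, the patch level would show up directly in the node listing:

```
# Hypothetical: if nodes carried an example.com/patch-level label,
# its value would appear as an extra column.
kubectl get nodes -L example.com/patch-level
```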

I'd be happy to have your opinion on this :)

Suse-KevinKlinger avatar Aug 27 '21 08:08 Suse-KevinKlinger

For your specific use-case I think you would want to rely on what the kubelet is reporting, no? E.g.:

```
nuc-1 [~]$ kubectl get node -o wide
NAME    STATUS   ROLES                       AGE    VERSION        INTERNAL-IP    EXTERNAL-IP   OS-IMAGE              KERNEL-VERSION     CONTAINER-RUNTIME
nuc-1   Ready    control-plane,etcd,master   230d   v1.20.7+k3s1   192.168.1.31   <none>        k3OS v0.20.7-k3s1r0   5.4.0-73-generic   containerd://1.4.4-k3s1
nuc-2   Ready    control-plane,etcd,master   230d   v1.20.7+k3s1   192.168.1.32   <none>        k3OS v0.20.7-k3s1r0   5.4.0-73-generic   containerd://1.4.4-k3s1
nuc-3   Ready    control-plane,etcd,master   230d   v1.20.7+k3s1   192.168.1.33   <none>        k3OS v0.20.7-k3s1r0   5.4.0-73-generic   containerd://1.4.4-k3s1
```

(with similar detail exposed in the Rancher UI)

That said, I can imagine value in an ad-hoc label applied to a node after an upgrade has completed successfully. I haven't given much thought to implementing this, because one can always choose to label the node as part of the upgrade job (assuming a k8s client is shipped in the container), although then the label would be applied right before successful completion as opposed to after.
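
A minimal sketch of that workaround, assuming `kubectl` is shipped in the upgrade image and that the node name is exposed to the pod (current SUC releases inject it as `SYSTEM_UPGRADE_NODE_NAME`, but treat the variable name and the `example.com/patch-level` key as assumptions to verify):

```
#!/bin/sh
# Sketch: final step of an upgrade container that labels its own node.
# Assumes kubectl is available in the image and the node name is exposed
# via SYSTEM_UPGRADE_NODE_NAME (verify against your SUC version).
set -eu

# ... perform the actual upgrade here ...

# Record the patch level on the node; label key and value are illustrative.
kubectl label node "${SYSTEM_UPGRADE_NODE_NAME}" \
  example.com/patch-level=15sp3 --overwrite
```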

Hmm, I want to think about this some more.

dweomer avatar Aug 27 '21 08:08 dweomer