Manage Etcd peers using custom resources
One of the challenges for automation (Terraform, Cluster API, k0sctl) when removing nodes is that removing a controller node also requires removing its peer entry from the etcd cluster.
In this PR there are a few things going on:
- Each controller creates its own `EtcdMember` object with `memberID` and peer address fields.
- Each controller watches for changes on `EtcdMember` objects.
- When the leader controller sees an `EtcdMember` object being deleted, it removes the corresponding peer from the etcd cluster.
- If the peer removal fails for some reason, it writes an event into the `kube-system` namespace.
Still very much a DRAFT; I need to refactor some of the code for better readability and error handling.
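To make the flow above concrete, here's a minimal sketch of the delete-triggered path, assuming the etcd v3 Go client; the `EtcdMember` struct, the `Reconciler` wiring, and all names are illustrative, not the PR's actual code:

```go
package etcdmember

import (
	"context"
	"strconv"

	clientv3 "go.etcd.io/etcd/client/v3"
	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/tools/record"
)

// EtcdMember mirrors the fields discussed in this PR; the real type
// lives in k0s and may differ.
type EtcdMember struct {
	Name        string
	MemberID    string // hex-encoded etcd member ID, e.g. "21e943ed8772de19"
	PeerAddress string
}

// Reconciler is a hypothetical leader-side handler.
type Reconciler struct {
	etcd     *clientv3.Client
	recorder record.EventRecorder
}

// OnEtcdMemberDeleted removes the corresponding peer from the etcd
// cluster; on failure it records a warning event against ref, whose
// namespace would be kube-system per the description above.
func (r *Reconciler) OnEtcdMemberDeleted(ctx context.Context, m *EtcdMember, ref *corev1.ObjectReference) {
	id, err := strconv.ParseUint(m.MemberID, 16, 64)
	if err != nil {
		r.recorder.Eventf(ref, corev1.EventTypeWarning, "RemoveFailed",
			"invalid member ID %q: %v", m.MemberID, err)
		return
	}
	// MemberRemove drops the peer from the etcd cluster.
	if _, err := r.etcd.MemberRemove(ctx, id); err != nil {
		r.recorder.Eventf(ref, corev1.EventTypeWarning, "RemoveFailed",
			"failed to remove etcd peer %s: %v", m.PeerAddress, err)
	}
}
```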
Description
Fixes # (issue)
Type of change
- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] Documentation update
How Has This Been Tested?
- [ ] Manual test
- [ ] Auto test added
Checklist:
- [ ] My code follows the style guidelines of this project
- [ ] My commit messages are signed-off
- [ ] I have performed a self-review of my own code
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes
- [ ] Any dependent changes have been merged and published in downstream modules
- [ ] I have checked my code and corrected any misspellings
This pull request has merge conflicts that need to be resolved.
Two questions:
- Will "left over" etcd members (e.g. from past controller changes; this happened to me once after changing node IPs) be cleaned up?
- What would controller removal look like overall, especially when the node died without cleaning up after itself?
Will "left over" etcd members (e.g. from past-controller changes, happened to me once after changing node IPs) be cleaned up?
Not sure I get what you mean by "left over" member here? Do you mean you had a stale member in the etcd cluster, i.e. a node with an old IP / hostname? If yes, then those can also be cleaned up by marking the node to leave the etcd cluster.
> What would controller removal look like overall, especially when the node died without cleaning up after itself?
Using the `EtcdMember` object you'd remove a node by running something like:

```sh
kubectl patch etcdmember controller2 -p '{"leave":true}' --type merge
```

That triggers the current leader controller to remove the given member from etcd. Of course, if the node died without cleaning up after itself, you'd still have leftover data etc. on the node.
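As a sketch of what that patch could trigger on the leader side, assuming the etcd v3 client and the apimachinery condition helpers (the function and parameter names here are illustrative):

```go
package etcdmember

import (
	"context"
	"strconv"

	clientv3 "go.etcd.io/etcd/client/v3"
	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// handleLeave sketches the leader's reaction to spec.leave being set.
// memberID is the hex-encoded etcd member ID from the object, and
// conditions is the object's status.conditions slice.
func handleLeave(ctx context.Context, etcd *clientv3.Client, memberID string, conditions *[]metav1.Condition) error {
	id, err := strconv.ParseUint(memberID, 16, 64)
	if err != nil {
		return err
	}
	// Remove the peer from the etcd cluster.
	if _, err := etcd.MemberRemove(ctx, id); err != nil {
		return err
	}
	// Flip the Joined condition instead of deleting the object, so the
	// action remains visible to operators.
	meta.SetStatusCondition(conditions, metav1.Condition{
		Type:   "Joined",
		Status: metav1.ConditionFalse,
		Reason: "MemberRemoved",
	})
	return nil
}
```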
> Not sure I get what you mean by "left over" member here? Do you mean you had a stale member in the etcd cluster, i.e. a node with an old IP / hostname?
Yes, exactly. So the "stale" members would also just show up as `EtcdMember` objects and I can then just delete them, is that right?
Regarding the controller removal: I was more interested in whether there would be any automation, e.g. a finalizer on Node objects, that makes sure the corresponding etcd member is deleted as well. So would "delete the EtcdMember object" be an additional step in the process of removing a controller node, or would it happen automatically under normal circumstances?
> automation, e.g. a finalizer on Node objects
We cannot use the Node object for this purpose; for plain controllers there's no Node object in k0s.
So would "delete the etcdmember object" be an additional step in the process of removing a controller node or would it happen automatically under normal circumstances.
Once the member has left the etcd cluster and the object status reflects that, you can safely remove the object. k0s will NOT do that automatically, as we wanted to have some visibility into the actions already taken. My initial idea was actually to rely on object deletion, but that does not keep any visible history of what has happened.
Also, in case you remove, say, the controller5 member and later re-join that same node (same hostname), k0s will be able to re-join it into etcd, and the object status will be updated.
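A sketch of what that re-join could look like with the etcd client, assuming the standard etcd peer port (the function name and URL scheme are illustrative):

```go
package etcdmember

import (
	"context"
	"fmt"

	clientv3 "go.etcd.io/etcd/client/v3"
)

// rejoinPeer sketches how a previously removed controller (same
// hostname) could be added back to the etcd cluster. peerAddress is
// the node's peer IP, e.g. "172.27.200.93".
func rejoinPeer(ctx context.Context, etcd *clientv3.Client, peerAddress string) (uint64, error) {
	peerURL := fmt.Sprintf("https://%s:2380", peerAddress)
	// MemberAdd registers the peer again; the new member ID would then
	// be written back to the EtcdMember object's status.
	resp, err := etcd.MemberAdd(ctx, []string{peerURL})
	if err != nil {
		return 0, err
	}
	return resp.Member.ID, nil
}
```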
I accidentally converted this to a draft and made it ready for review again; disregard my updates.
Currently, an `EtcdMember` custom resource looks like the following:

```yaml
apiVersion: etcd.k0sproject.io/v1beta1
kind: EtcdMember
memberID: 21e943ed8772de19
metadata:
  creationTimestamp: "2024-05-03T11:32:11Z"
  generation: 1
  name: etcdmember-controller-2
  resourceVersion: "362"
  uid: 641f2594-881a-4e8f-939e-94535d6b5904
peerAddress: 172.27.200.93
status:
  conditions:
  - status: "True"
    type: Joined
```
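For reference, in Go terms this shape puts `memberID` and `peerAddress` at the top level next to the standard metadata; a sketch of the corresponding type (not k0s's actual code, just what the YAML above implies):

```go
package v1beta1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// EtcdMember as currently shaped: memberID and peerAddress sit at the
// top level of the object rather than under spec or status.
type EtcdMember struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	MemberID    string           `json:"memberID,omitempty"`
	PeerAddress string           `json:"peerAddress,omitempty"`
	Status      EtcdMemberStatus `json:"status,omitempty"`
}

// EtcdMemberStatus holds the conditions shown above.
type EtcdMemberStatus struct {
	Conditions []EtcdMemberCondition `json:"conditions,omitempty"`
}

// EtcdMemberCondition matches the minimal condition in the example.
type EtcdMemberCondition struct {
	Type   string `json:"type"`
	Status string `json:"status"`
}
```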
Wouldn't it be better to have nothing other than apiVersion/kind/metadata/spec/status as top-level fields?
So what about this as an alternative:
```diff
--- etcdmember-controller-2
+++ etcdmember-controller-2
@@ -1,14 +1,16 @@
 apiVersion: etcd.k0sproject.io/v1beta1
 kind: EtcdMember
-memberID: 21e943ed8772de19
 metadata:
   creationTimestamp: "2024-05-03T11:32:11Z"
   generation: 1
   name: etcdmember-controller-2
   resourceVersion: "362"
   uid: 641f2594-881a-4e8f-939e-94535d6b5904
-peerAddress: 172.27.200.93
+spec:
+  leave: false
 status:
   conditions:
   - status: "True"
     type: Joined
+  memberID: 21e943ed8772de19
+  peerAddress: 172.27.200.93
```
Moreover, `lastTransitionTime` is not present on the condition. There's also `lastProbeTime` on stock Kubernetes conditions.
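For comparison, the alternative shape maps directly onto stock apimachinery types; a sketch, with field names taken from the diff above and everything else assumed:

```go
package v1beta1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// EtcdMemberSpec holds the desired state; the only user-settable
// field discussed here is the leave flag.
type EtcdMemberSpec struct {
	Leave bool `json:"leave,omitempty"`
}

// EtcdMemberStatus carries the observed state. Using metav1.Condition
// for the conditions list gives lastTransitionTime (plus reason and
// message) for free, addressing the point above.
type EtcdMemberStatus struct {
	Conditions  []metav1.Condition `json:"conditions,omitempty"`
	MemberID    string             `json:"memberID,omitempty"`
	PeerAddress string             `json:"peerAddress,omitempty"`
}

// EtcdMember keeps only the standard top-level fields.
type EtcdMember struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   EtcdMemberSpec   `json:"spec,omitempty"`
	Status EtcdMemberStatus `json:"status,omitempty"`
}
```

A nice side effect of `metav1.Condition` is that the `meta.SetStatusCondition` helper maintains `lastTransitionTime` automatically whenever a condition's status actually changes.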