Enable control plane to maintain during reboot
What
Maintenance operations do not run while a reboot operation is running. As a result:
- Endpoints are not updated if one control-plane is down.
- controller-manager and scheduler do not recover if they crash.
How
Run the reboot operation in a separate goroutine.
- Modify rebootOp as follows
  - after rebooting, remove the `cke.cybozu.com/reboot` annotation
  - wait until the node is schedulable
  - remove the rebooted node from the reboot queue
- Modify rebootUncordonOp as follows
  - uncordon only nodes that are unschedulable and have no `cke.cybozu.com/reboot` annotation
- Register records of the reboot operation and run the operation in a goroutine in order to avoid record registration conflicts with other operations.
- Disable sabakan integration for a certain period during a reboot operation in order to avoid frequent master node changes.
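
The plan above could be sketched roughly as follows. This is only an illustration, not CKE's actual code: the `Node` type, `needsUncordon`, and `runRebootAsync` are hypothetical names, and only the `cke.cybozu.com/reboot` annotation key comes from this issue. It shows the uncordon condition for rebootUncordonOp and a reboot that runs in its own goroutine and removes the annotation when it succeeds.

```go
package main

// rebootAnnotation is the annotation key named in this issue.
const rebootAnnotation = "cke.cybozu.com/reboot"

// Node is a hypothetical, simplified stand-in for a Kubernetes node.
type Node struct {
	Name          string
	Unschedulable bool
	Annotations   map[string]string
}

// needsUncordon reports whether n should be uncordoned: it must be
// unschedulable and must no longer carry the reboot annotation.
func needsUncordon(n Node) bool {
	_, rebooting := n.Annotations[rebootAnnotation]
	return n.Unschedulable && !rebooting
}

// runRebootAsync runs reboot in a separate goroutine so the caller can keep
// doing other work. On success it removes the reboot annotation. The result
// is delivered on the returned channel, which is buffered so the goroutine
// never blocks if the caller stops listening.
func runRebootAsync(n *Node, reboot func(*Node) error) <-chan error {
	done := make(chan error, 1)
	go func() {
		err := reboot(n)
		if err == nil {
			delete(n.Annotations, rebootAnnotation)
		}
		done <- err
	}()
	return done
}
```

A caller would start `runRebootAsync`, continue with maintenance work, and read the node's state only after receiving from the returned channel. While the annotation is present, `needsUncordon` returns false, so a node that is still rebooting is left cordoned.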
Checklist
- [ ] Finish implementation of the issue
- [ ] Test all functions
- [ ] Have enough logs to trace activities
- [ ] Notify developers of necessary actions