bottlerocket-update-operator
0.2.0: Clean up BRS when the operator is removed from a node
Issue or Feature Request:
BRSs currently have ownerReferences to the k8s Node object that the BRS is associated with, meaning that if the Node is deleted, the BRS will be deleted with it. This is great, because it allows the controller to remove the BRS from its ActiveSet.
I foresee an issue though: if a customer removes the brupop label from their node in the middle of an update, the brupop agent/daemonset will be removed from the host, causing updates to cease; however, the BRS in question will not be deleted until the corresponding Node is deleted.
We should create a strategy by which BRSs are cleaned up when the daemonset is removed, or allow the controller to delete BRSs that time out and appear to be stuck indefinitely.
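To make the two options concrete, here is a minimal sketch of how the controller might decide whether a BRS should be cleaned up. It uses only the Rust standard library and is not the operator's actual code: `BrsState`, `CleanupDecision`, `decide_cleanup`, and the `stuck_timeout` parameter are all hypothetical names, and the sketch assumes the controller can already tell whether the node still carries the brupop label and how long ago the agent last updated the BRS.

```rust
use std::time::Duration;

/// Hypothetical summary of what the controller knows about one BRS.
#[derive(Debug, Clone, PartialEq)]
pub struct BrsState {
    /// Name of the Node that owns this BRS.
    pub node_name: String,
    /// Whether the Node still carries the brupop label (assumed to be known).
    pub node_has_label: bool,
    /// Time elapsed since the agent last updated this BRS (assumed to be tracked).
    pub since_last_update: Duration,
}

/// Possible outcomes of the cleanup check.
#[derive(Debug, PartialEq)]
pub enum CleanupDecision {
    /// The BRS looks healthy; leave it alone.
    Keep,
    /// The label is gone, so the agent is no longer running on the node.
    DeleteLabelRemoved,
    /// The label is present, but the BRS has not progressed within the timeout.
    DeleteTimedOut,
}

/// Decide what to do with a BRS. A missing label takes priority over the
/// timeout, because it explains why the BRS stopped making progress.
pub fn decide_cleanup(brs: &BrsState, stuck_timeout: Duration) -> CleanupDecision {
    if !brs.node_has_label {
        CleanupDecision::DeleteLabelRemoved
    } else if brs.since_last_update > stuck_timeout {
        CleanupDecision::DeleteTimedOut
    } else {
        CleanupDecision::Keep
    }
}
```

Checking the label first means a BRS whose agent has been removed is cleaned up immediately instead of waiting out the timeout; the timeout then only covers BRSs that are stuck for some other reason.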
I've added some text about this case to the README for now, so there is at least documentation to help customers avoid getting stuck in this state. However, having an automated solution here would be ideal.
If the controller keeps a reflector of Nodes, it won't have to do an additional API call to check for label existence.
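The idea of answering label checks from a local cache can be sketched as follows. This is a std-only stand-in for the store a reflector would maintain, not the real reflector API: `NodeEvent`, `NodeLabelCache`, `apply`, and `has_label` are hypothetical names, and the label key is passed in by the caller rather than assumed.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified watch event for a Node.
pub enum NodeEvent {
    /// The Node was created or modified; carries its current labels.
    Applied { name: String, labels: HashMap<String, String> },
    /// The Node was deleted.
    Deleted { name: String },
}

/// In-memory view of Node labels, kept up to date from watch events.
#[derive(Default)]
pub struct NodeLabelCache {
    nodes: HashMap<String, HashMap<String, String>>,
}

impl NodeLabelCache {
    /// Fold one watch event into the cache.
    pub fn apply(&mut self, event: NodeEvent) {
        match event {
            NodeEvent::Applied { name, labels } => {
                // Replace the stored labels wholesale so removed labels disappear.
                self.nodes.insert(name, labels);
            }
            NodeEvent::Deleted { name } => {
                self.nodes.remove(&name);
            }
        }
    }

    /// Check for a label without any API call.
    /// Returns `None` when the Node is not in the cache at all.
    pub fn has_label(&self, node: &str, key: &str) -> Option<bool> {
        self.nodes.get(node).map(|labels| labels.contains_key(key))
    }
}
```

Returning `Option<bool>` keeps "Node unknown" distinct from "Node known but unlabeled", which matters here because a deleted Node already takes its BRS with it through the ownerReference, while an unlabeled Node is the case that needs explicit cleanup.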