Ensure that a completely shutdown cluster can restart
Proposed feature
Compliant Kubernetes contains a lot of validation and mutation webhooks to keep security features working. However it can cause some slight problems for certain critical applications if they aren't running in kube-system namespace.
Example: Calico on CAPI and local clusters might become deadlocked if HNC or Gatekeeper goes down as they cannot admit Calico to start again.
Proposed alternatives
Include exclusions to administrator managed namespaces that are critical and those that doesn't need it. (Like we should not need HNC to validate things in admin namespaces as we do not use HNC stuff there.)
Additional context
No response
Definition of done
- [ ] Omit critical platform admin namespaces from webhooks
- [ ] Omit platform admin namespaces from webhooks that doesn't require it
What does "completely shutdown" mean here, is it like if every node goes down?
Yes, or enough so that services backing validating and mutating webhooks are down. Then those should be able to restart without issue, so others depending on them may start.