compliantkubernetes-apps

Ensure that a completely shutdown cluster can restart

Open aarnq opened this issue 1 year ago • 2 comments

Proposed feature

Compliant Kubernetes contains many validating and mutating webhooks to keep its security features working. However, these can cause problems for certain critical applications if they aren't running in the kube-system namespace.

Example: Calico on CAPI and local clusters might become deadlocked if HNC or Gatekeeper goes down, because their webhooks can then no longer admit Calico so that it can start again.

Proposed alternatives

Add webhook exclusions for administrator-managed namespaces that are critical, and for those that don't need the webhooks. (For example, we should not need HNC to validate things in admin namespaces, as we do not use HNC features there.)
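A minimal sketch of what such an exclusion could look like, assuming it is done through the webhook's `namespaceSelector` and the `kubernetes.io/metadata.name` label that Kubernetes sets on every namespace. The namespace names and both helper functions are hypothetical and only illustrate the idea; the real list of admin namespaces would have to come from the platform configuration.

```python
# Hypothetical list of admin namespaces; the issue does not name the real ones.
EXEMPT_NAMESPACES = ["kube-system", "calico-system", "gatekeeper-system", "hnc-system"]


def exclusion_selector(namespaces):
    """Build a namespaceSelector that matches every namespace except the given ones."""
    return {
        "matchExpressions": [
            {
                # Label set automatically on each namespace by the API server.
                "key": "kubernetes.io/metadata.name",
                "operator": "NotIn",
                "values": sorted(namespaces),
            }
        ]
    }


def selector_matches(selector, namespace_labels):
    """Evaluate matchExpressions (In / NotIn only) against a namespace's labels."""
    for expr in selector.get("matchExpressions", []):
        value = namespace_labels.get(expr["key"])
        if expr["operator"] == "In" and value not in expr["values"]:
            return False
        if expr["operator"] == "NotIn" and value in expr["values"]:
            return False
    return True


selector = exclusion_selector(EXEMPT_NAMESPACES)
# A webhook using this selector would skip requests in the exempt namespaces.
print(selector_matches(selector, {"kubernetes.io/metadata.name": "calico-system"}))
print(selector_matches(selector, {"kubernetes.io/metadata.name": "team-a"}))
```

The resulting dictionary would go under `webhooks[].namespaceSelector` in a `ValidatingWebhookConfiguration` or `MutatingWebhookConfiguration`. Whether HNC and Gatekeeper expose this through their own chart values is not something this sketch assumes.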

Additional context

No response

Definition of done

  • [ ] Omit critical platform admin namespaces from webhooks
  • [ ] Omit platform admin namespaces from webhooks that don't require them

aarnq avatar Jan 17 '24 14:01 aarnq

What does "completely shutdown" mean here? Is it when every node goes down?

Ajarmar avatar Feb 10 '25 13:02 Ajarmar

Yes, or enough nodes that the services backing the validating and mutating webhooks are down. Those services should then be able to restart without issue, so that others depending on them can start.

aarnq avatar Feb 10 '25 14:02 aarnq