node-healthcheck-operator
NodeHealthCheck status is not updated when remediation CR is deleted by remediator
Hi all, I'm using NHC with a custom remediator. In some cases my Kubernetes nodes are deleted and, as the documentation says here, my remediator then deletes the remediation Custom Resource. The issue is that the NHC resource's status still lists these old remediations in its phase, reason, and inFlightRemediations fields:
inFlightRemediations:
  yul1-r11-u14: "2023-11-07T21:53:04Z"
  yul1-r11-u15: "2023-11-07T02:49:42Z"
observedNodes: 131
phase: Remediating
reason: NHC is remediating 2 nodes
This blocks all updates to, and deletion of, the NHC resource, since the validating webhook thinks a remediation is still in progress and responds with:
admission webhook "vnodehealthcheck.kb.io" denied the request: selector update prohibited due to running remediation
Am I missing a configuration to signal NHC of these deletions?
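
To make the mismatch concrete, here is a minimal Go sketch of the check I would expect to happen somewhere: compare the inFlightRemediations entries from the status against the nodes that still exist, and flag the ones whose node is gone. This is not NHC's actual code. The function name `staleRemediations` and the node names in the "existing" list are made up for illustration, and the map is simplified to string timestamps.

```go
package main

import (
	"fmt"
	"sort"
)

// staleRemediations returns the node names that are listed as in-flight
// remediations but are missing from the list of existing nodes.
// Hypothetical helper for illustration only; not part of NHC.
func staleRemediations(inFlight map[string]string, existingNodes []string) []string {
	existing := make(map[string]struct{}, len(existingNodes))
	for _, n := range existingNodes {
		existing[n] = struct{}{}
	}

	var stale []string
	for node := range inFlight {
		if _, ok := existing[node]; !ok {
			stale = append(stale, node)
		}
	}
	// Map iteration order is random in Go, so sort for stable output.
	sort.Strings(stale)
	return stale
}

func main() {
	// Values copied from the NHC status shown above.
	inFlight := map[string]string{
		"yul1-r11-u14": "2023-11-07T21:53:04Z",
		"yul1-r11-u15": "2023-11-07T02:49:42Z",
	}
	// Assumed node list: both nodes above have already been deleted.
	existingNodes := []string{"yul1-r11-u01", "yul1-r11-u02"}

	fmt.Println(staleRemediations(inFlight, existingNodes))
	// prints: [yul1-r11-u14 yul1-r11-u15]
}
```

In my case both entries would come back as stale, yet the status still counts them as running remediations.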