Uncordon immediately after drain failure
We've been running Kured for some time and recently noticed a change in behavior starting with version 1.16.2. When a node fails to drain, it is now uncordoned only after the releaseDelay period.
Previously (e.g., in version 1.16.1 and earlier), the node was uncordoned immediately after the drain failure. This behavior was preferable for our use case, as it allows workloads to be rescheduled quickly if a reboot cannot proceed.
It appears this change was introduced in this commit.
We would like to request an option or a fix to restore the previous behavior (immediate uncordon on drain failure) to minimize disruption in scenarios where the reboot cannot be completed.
I like the idea. This is indeed a bug. It was not covered by a test. Can you add a test too?
I will try to work on this when I have time.
Hello @evrardjp. I finally have time to work on this issue. Is it still relevant?
Also, we are interested in updating to the latest version. I see that you are working on #1000. How could we coordinate to get this fixed?