
CNI repair broken for non-restartable pods

Open howardjohn opened this issue 1 year ago • 2 comments

When a job has restartPolicy=Never or a low backoffLimit, CNI race repair may not work. This is because the pod will not simply crashloop until it's fixed; it will stop immediately.

This applies even to the new repairPods mode, not just label/delete pods mode, because even repairPods is triggered by the pod crashlooping.
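To make the failure mode concrete, here is a rough sketch of the reasoning in Python. It is a simplified model, not Istio or Kubernetes code: `PodSketch`, `retries_available`, `crash_repair_can_help`, and the `retries_needed` threshold are all hypothetical names invented for illustration. The only assumption it encodes is the one described above: crash-based repair only helps if the workload gets at least one more attempt after its first failure.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class PodSketch:
    """Hypothetical, simplified view of a pod (not a real Kubernetes type)."""
    restart_policy: str                      # "Always", "OnFailure", or "Never"
    job_backoff_limit: Optional[int] = None  # Job backoffLimit; None if not run by a Job


def retries_available(pod: PodSketch) -> float:
    """Rough count of further attempts after the first failure."""
    if pod.job_backoff_limit is not None:
        # For Job-owned pods, the retry budget is bounded by backoffLimit.
        return pod.job_backoff_limit
    if pod.restart_policy == "Never":
        # A bare pod with restartPolicy=Never fails once and stays failed.
        return 0
    # "Always" / "OnFailure" outside a Job: the container keeps being restarted.
    return math.inf


def crash_repair_can_help(pod: PodSketch, retries_needed: int = 1) -> bool:
    """Assumption: repair is triggered by a crash, so it needs a later retry to matter."""
    return retries_available(pod) >= retries_needed


print(crash_repair_can_help(PodSketch("Always")))                          # crashloops until repaired
print(crash_repair_can_help(PodSketch("Never")))                           # stops immediately
print(crash_repair_can_help(PodSketch("Never", job_backoff_limit=0)))      # Job gives up at once
print(crash_repair_can_help(PodSketch("OnFailure", job_backoff_limit=6)))  # has a retry budget
```

How many retries are actually needed in practice depends on how quickly the repair lands relative to the backoff, which this sketch does not model; raising `retries_needed` is a crude stand-in for that.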

howardjohn avatar Feb 05 '24 18:02 howardjohn

Thinking about it, I guess this is not practically fixable except by adopting

https://github.com/istio/istio/pull/48818 / https://github.com/istio/istio/pull/49092

to entirely obviate the need for crash-based race repair.

bleggett avatar Feb 05 '24 18:02 bleggett

Now that the untaint controller is in, I would actually advocate that we drop this form of race repair entirely, and point people to istio-cni + the untaint controller instead.

Need to do a bit of work there still tho.

bleggett avatar May 21 '24 16:05 bleggett