Ahmed Ezzat
I'm also experiencing this using `2023.3.2`
Is it possible to allow the scheduler to delete the deployment directly? Maybe with something like a plugin?
> Are you seeing this in practice? Or is this a hypothetical race condition?

Yes, I'm seeing this during large-scale cluster scaling, e.g. 100-200 workers.
Everything looks good at least for now. I deployed the change to our production cluster and will provide updates if needed. Hopefully, everything works well 😄
Surprisingly, I had the most stable run ever. One thing to note: if a pod is restarting, which means its `deployment` is in an unready state, a small possibility might...
Done! The operator now takes action based on pending pods rather than on the deployments.
Sorry for the long delay. Unfortunately, I don't have much time to work on this. In the meantime, I'll close this PR so that someone else can finish it.
@jacobtomlinson I believe this is ready to merge
It is something related to building some Go code. I tried to check what is wrong but couldn't figure it out. It appears to be something related to the CI...
@jacobtomlinson If possible, could you suggest a solution to this issue?