kuberay
[Bug] Investigate slow Ray pod termination
Search before asking
- [X] I searched the issues and found no similar issues.
KubeRay Component
ray-operator
What happened + What you expected to happen
I've noticed that Ray pods often take a long time to terminate (minutes) after deleting a RayCluster CR. We should investigate why that is the case.
Reproduction script
Tear down a Ray cluster by deleting a RayCluster CR and observe the Ray pod's state, e.g. with `watch -n 1 kubectl get pod`.
It might take a few minutes for the pod to terminate. There's no reason for a Ray pod to take so long to process SIGTERM and exit cleanly.
Anything else
No response
Are you willing to submit a PR?
- [ ] Yes I am willing to submit a PR!
I don't observe this issue at the moment; it takes 30 seconds to terminate both the Ray head and worker Pods.
I guess 30 seconds is better than the minutes claimed in the issue description. But why does it take a full 30 seconds to do the termination? That sounds like the default K8s grace period, which suggests a SIGKILL to PID 1 is required to stop a Ray pod.
Probably the process running `ray start --block` does not clean up its child processes when it receives a SIGTERM.
There's at least no SIGTERM handling that I can see in the code.
I vaguely recall complaining about a similar issue to @rickyyx
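For reference, here is a minimal sketch (not Ray's or KubeRay's actual code) of the kind of SIGTERM handling a `ray start --block`-style entrypoint would need: catch SIGTERM, forward it to the child processes, reap them, and exit before the Kubelet escalates to SIGKILL. The "Kubelet" role is played by the parent script.

```python
import signal
import subprocess
import sys
import time

# Entrypoint that mimics `ray start --block`: it launches a long-running
# child (standing in for the raylet/GCS daemons), then blocks. On SIGTERM
# it forwards the signal to the child, reaps it, and exits cleanly.
ENTRYPOINT = r"""
import signal, subprocess, sys, time

child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(600)"])

def handle_sigterm(signum, frame):
    child.terminate()   # forward SIGTERM to the child process
    child.wait()        # reap it so nothing is left for SIGKILL
    sys.exit(0)

signal.signal(signal.SIGTERM, handle_sigterm)

while True:             # block, like `ray start --block`
    time.sleep(1)
"""

# Play the Kubelet: start the entrypoint, send SIGTERM, and check that it
# exits well within a 30-second grace period.
proc = subprocess.Popen([sys.executable, "-c", ENTRYPOINT])
time.sleep(1)                      # give the handler time to get installed
proc.send_signal(signal.SIGTERM)
exit_code = proc.wait(timeout=10)
print("clean exit code:", exit_code)  # → clean exit code: 0
```

With handling like this, the pod would terminate as soon as the daemons exit instead of sitting out the full grace period.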
> But why does it take a full 30 seconds to do the termination? That sounds like the default K8s grace period,
That makes sense to me. kubectl uses a default grace period of 30 seconds (doc), but I am not sure whether client-go has the same behavior.
> Probably the process running `ray start --block` does not clean up its child processes when it receives a SIGTERM.
I think so.
I believe Ray treats SIGTERM as an expected exit code for `ray start --block`; I guess it's always the SIGKILL that's killing the Ray pod.
> I believe ray is treating SIGTERM as expected exit codes for `ray start --block`, I guess it's always the SIGKILL that's killing the ray pod.
Is there any handling of a SIGTERM sent to the `ray start --block` process itself? That is what we need for correct termination on K8s.
i see, so kuberay sends a SIGTERM to the entrypoint process itself?
> i see, so kuberay sends a SIGTERM to the entrypoint process itself?
More or less.
Technically, Kubernetes (more specifically, the Kubelet) sends the SIGTERM when the KubeRay operator (or any other agent) marks the pod for deletion. The Kubelet then waits for a configurable timeout (`terminationGracePeriodSeconds`, 30 seconds by default) before sending SIGKILL. I am personally a little hazy on how Kubernetes handles the non-entrypoint processes; it might depend on the choice of container runtime.
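To illustrate the escalation (a simplified simulation, not the Kubelet's actual code): an entrypoint that does not act on SIGTERM survives the whole grace period and only dies to SIGKILL, which matches the ~30-second delay observed here.

```python
import signal
import subprocess
import sys
import time

# A stubborn entrypoint that ignores SIGTERM, standing in for a pod whose
# main process does not act on the signal.
proc = subprocess.Popen([sys.executable, "-c",
    "import signal, time; signal.signal(signal.SIGTERM, signal.SIG_IGN); time.sleep(600)"])
time.sleep(1)  # let the child install SIG_IGN

GRACE_PERIOD = 3  # Kubernetes defaults to 30 seconds; shortened for the demo
proc.send_signal(signal.SIGTERM)           # step 1: polite request
deadline = time.monotonic() + GRACE_PERIOD
while time.monotonic() < deadline and proc.poll() is None:
    time.sleep(0.1)                        # step 2: wait out the grace period

if proc.poll() is None:
    proc.kill()                            # step 3: SIGKILL, cannot be ignored

exit_code = proc.wait()
print("exit code:", exit_code)  # negative signal number on POSIX: -9 (SIGKILL)
```

The full grace period is paid only when the entrypoint ignores (or never receives) the SIGTERM; a process that handles it exits immediately at step 1.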
I see, I did a local test. I think sending the SIGTERM to the entrypoint process (`ray start --block`) does exit the process. But if the SIGTERM is sent to other Ray processes like the raylet, the entrypoint process does not exit.
> I think sending the SIGTERM to the entrypoint process (`ray start --block`) does exit the process
It definitely would exit the Python interpreter! But I bet Ray would continue to run after you do that.
It wasn't running in my case. Worth validating with a KubeRay pod.