kuberay
[Feature] Configurable RayCluster readiness definition
Search before asking
- [X] I have searched the issues and found no similar feature request.
Description
Currently, a RayCluster resource is considered ready as soon as it is created in Kubernetes. It would be great to have an option to consider it ready only when the head node is ready and the min_replicas count has been reached for each worker group.
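To make the requested definition concrete, here is a minimal sketch of the check in Python. It is not KubeRay code: `WorkerGroup`, `cluster_is_ready`, and the `ready_replicas` field are hypothetical names chosen for illustration, and the inputs are assumed to have been gathered elsewhere (for example from Pod statuses).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WorkerGroup:
    # Hypothetical stand-in for the observed state of one worker group.
    name: str
    min_replicas: int    # the min_replicas value requested for the group
    ready_replicas: int  # how many worker Pods are currently ready (assumed input)


def cluster_is_ready(head_ready: bool, groups: List[WorkerGroup]) -> bool:
    """Return True when the head is ready and every group has reached min_replicas."""
    return head_ready and all(g.ready_replicas >= g.min_replicas for g in groups)
```

With this definition, a cluster with no worker groups is ready as soon as its head is ready, and a single group below its minimum keeps the whole cluster not ready.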
Use case
In any automated pipeline, we first create a cluster and then send it a payload. This requires an intermediate step in which we wait until the cluster is ready to receive the payload.
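The intermediate step in such a pipeline is typically a polling loop with a timeout. The sketch below shows one possible shape for it; `wait_until` and its parameters are hypothetical, and the `check` callable is assumed to wrap whatever readiness query the pipeline uses. The `clock` and `sleep` parameters exist only so the loop can be exercised without real delays.

```python
import time
from typing import Callable


def wait_until(
    check: Callable[[], bool],
    timeout_s: float = 600.0,
    interval_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll check() until it returns True or timeout_s elapses.

    Returns True if the check succeeded, False if the deadline passed first.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A pipeline would call `wait_until(lambda: <readiness query>)` after creating the cluster and submit the payload only if it returns True.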
Related issues
No response
Are you willing to submit a PR?
- [ ] Yes I am willing to submit a PR!
https://github.com/ray-project/kuberay/issues/533
Chatted with @rueian today. We currently redefine "ready" through a new RayCluster condition called RayClusterReady. When the RayCluster is first created, this condition indicates whether all Ray Pods are ready. After RayClusterReady has been set to true for the first time, it only indicates whether the RayCluster's head Pod is ready for requests. The definition of "ready" in the first stage is somewhat "configurable", while in the second stage it is controlled by the Ray Autoscaler.
If the new definition doesn't work well, we will add a new field for each worker group in the CRD so that users can explicitly define what "ready" means.
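The two-stage meaning of RayClusterReady described above can be sketched as a small function. This is an illustration of the semantics only, not the controller's implementation: `ray_cluster_ready` and its arguments are hypothetical. The `condition_is_true` helper additionally assumes that the condition appears under `status.conditions` in the usual Kubernetes shape (a list of entries with `type` and `status`), which this thread does not confirm.

```python
def ray_cluster_ready(ready_before: bool, head_pod_ready: bool, all_pods_ready: bool) -> bool:
    """Sketch of the two-stage RayClusterReady semantics.

    Stage 1 (never ready before): ready only when all Ray Pods are ready.
    Stage 2 (ready at least once): ready whenever the head Pod is ready.
    """
    if not ready_before:
        return all_pods_ready
    return head_pod_ready


def condition_is_true(status: dict, cond_type: str = "RayClusterReady") -> bool:
    """Look up a condition by type in a status dict (assumed Kubernetes-style layout)."""
    for cond in status.get("conditions", []):
        if cond.get("type") == cond_type:
            return cond.get("status") == "True"
    return False  # condition absent: treat as not ready
```

A client waiting for the cluster would poll the RayCluster object and pass its status to `condition_is_true`, rather than recomputing readiness from individual Pods.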