Support KubeRay RayService as a Kueue workload

Open andrewsykim opened this issue 6 months ago • 4 comments

What would you like to be added:

Today Kueue supports RayJob and RayCluster as workloads but does not support RayService. I've heard feedback from some KubeRay users asking for RayService support. As with RayCluster, we should support RayService as a Kueue-able workload, but without autoscaling support.

Why is this needed:

RayService is the only KubeRay resource not supported by Kueue. We should support it for full feature parity with KubeRay.

Completion requirements:

This enhancement requires the following artifacts:

  • [ ] Design doc
  • [ ] API change
  • [X] Docs update

The artifacts should be linked in subsequent comments.

andrewsykim avatar May 30 '25 20:05 andrewsykim

+1 Advanced scheduling features like Topology Aware Scheduling (TAS) and All-or-Nothing with Ready Pods are essential for production-grade inference workloads.

kimminw00 avatar Jun 04 '25 02:06 kimminw00

@weizhaowz do you have cycles to implement this?

andrewsykim avatar Jun 10 '25 16:06 andrewsykim

Yes, I can do it.

weizhaowz avatar Jun 10 '25 17:06 weizhaowz

Thank you folks for driving that!

mimowo avatar Jun 26 '25 09:06 mimowo

Initially I tried adding a RayService controller, webhook, and MultiKueue adapter in a PR, but in testing I found that the RayCluster created for the RayService cannot be updated, because the RayCluster is managed by its own controller, so KubeRay cannot provision the RayCluster. Therefore, we decided to let Kueue manage RayService through RayCluster; this PR contains the details.
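For readers following along, here is a rough sketch of what "managing RayService through RayCluster" might look like from the user's side. It is not taken from the PR. It assumes that the standard Kueue label `kueue.x-k8s.io/queue-name` set on the RayService is propagated to the RayCluster that KubeRay creates, so that the existing RayCluster integration picks it up. The helper `ray_service_manifest`, the service name, and the queue name `user-queue` are hypothetical, and the pod templates and serve config are left out for brevity.

```python
import json

# Standard Kueue label used to assign a workload to a LocalQueue.
QUEUE_LABEL = "kueue.x-k8s.io/queue-name"


def ray_service_manifest(name: str, queue: str) -> dict:
    """Build a skeleton RayService manifest (hypothetical helper).

    Assumption: the label on the RayService is copied to the RayCluster
    created for it, which is the object Kueue actually manages.
    """
    return {
        "apiVersion": "ray.io/v1",
        "kind": "RayService",
        "metadata": {"name": name, "labels": {QUEUE_LABEL: queue}},
        "spec": {
            "rayClusterConfig": {
                # The issue asks for support without autoscaling.
                "enableInTreeAutoscaling": False,
                # headGroupSpec / workerGroupSpecs omitted in this sketch.
            },
        },
    }


manifest = ray_service_manifest("example-rayservice", "user-queue")
print(json.dumps(manifest, indent=2))
```

The printed JSON would still need the head and worker group specs filled in before it could be applied to a cluster.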

weizhaowz avatar Jul 07 '25 17:07 weizhaowz

Thanks @weizhaowz

/close

andrewsykim avatar Jul 07 '25 18:07 andrewsykim

@andrewsykim: Closing this issue.

k8s-ci-robot avatar Jul 07 '25 18:07 k8s-ci-robot