Ray serve gke gateway ingress

Open ravishtiwari opened this issue 1 year ago • 1 comments

Why are these changes needed?

This PR contains an example of using the Kubernetes Gateway API to expose a RayServe service on GKE with a Gateway and an HTTPRoute deployment. Gateway API is the successor to Ingress and the newest way of exposing services running in the same namespace or across multiple namespaces. It is both more powerful and more flexible than Ingress. This Kubernetes page provides a list of reasons why one should switch to Gateway API.
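For readers who want a concrete picture of the two resources involved, here is a minimal sketch that builds a Gateway and an HTTPRoute as Python dictionaries and prints them as JSON, which `kubectl apply -f` also accepts. It is not the manifest from this PR. The resource names, the `default` namespace, the `rayservice-sample-serve-svc` Service name, and the choice of the `gke-l7-global-external-managed` GatewayClass are assumptions made for illustration; port 8000 is the default Ray Serve HTTP port. Clusters with an older Gateway API installation may need `gateway.networking.k8s.io/v1beta1` instead of `v1`.

```python
import json

GATEWAY_API_VERSION = "gateway.networking.k8s.io/v1"


def gateway_manifest(name, namespace, gateway_class):
    """Build a Gateway with a single plain-HTTP listener on port 80."""
    return {
        "apiVersion": GATEWAY_API_VERSION,
        "kind": "Gateway",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "gatewayClassName": gateway_class,
            "listeners": [{"name": "http", "protocol": "HTTP", "port": 80}],
        },
    }


def http_route_manifest(name, namespace, gateway_name, service_name, service_port):
    """Build an HTTPRoute that forwards every path to one backend Service."""
    return {
        "apiVersion": GATEWAY_API_VERSION,
        "kind": "HTTPRoute",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "parentRefs": [{"name": gateway_name}],
            "rules": [
                {
                    "matches": [{"path": {"type": "PathPrefix", "value": "/"}}],
                    "backendRefs": [{"name": service_name, "port": service_port}],
                }
            ],
        },
    }


# All names below are hypothetical placeholders, not values taken from the PR.
gateway = gateway_manifest(
    "ray-serve-gateway", "default", "gke-l7-global-external-managed"
)
route = http_route_manifest(
    "ray-serve-route",
    "default",
    "ray-serve-gateway",
    "rayservice-sample-serve-svc",
    8000,
)

if __name__ == "__main__":
    print(json.dumps(gateway, indent=2))
    print(json.dumps(route, indent=2))
```

The HTTPRoute attaches to the Gateway through `parentRefs`, and the Gateway picks its load balancer implementation through `gatewayClassName`, which is the main structural difference from a single Ingress object.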

Related issue number

As far as I know, there is currently no logged issue for this.

Checks

  • [x] I've made sure the tests are passing - Yes, I have verified everything I am pushing.
  • Testing Strategy
    • [ ] Unit tests: Not needed, example
    • [x] Manual tests: Yes, deployed and tested on GKE
    • [ ] This PR is not tested :

ravishtiwari avatar Mar 09 '24 17:03 ravishtiwari

cc @robscott @aojea

andrewsykim avatar Mar 11 '24 02:03 andrewsykim

I will review this PR this week.

kevin85421 avatar Apr 02 '24 20:04 kevin85421

@kevin85421 last time we met, you expressed interest in some limitations of the current Services and Endpoints. Could you provide steps, using an existing example, to reproduce the problem so I can understand it better?

aojea avatar Apr 03 '24 20:04 aojea

@aojea - can you share some of the limitations? Are they related to HealthCheck, cross-namespace routes, or general stability concerns? Maybe I have encountered some of them, so I'm just curious.

ravishtiwari avatar Apr 04 '24 17:04 ravishtiwari

Sorry, I realize my comment was confusing. I was referring to Services and EndpointSlices: traffic is spread randomly across the ready backends, but it seems that these workloads need more advanced load-balancing heuristics, and Gateway offers the capability to set weights on the routes.

aojea avatar Apr 07 '24 19:04 aojea

Yes, and Gateway also offers header-based routes, is extensible, and supports routes for multiple protocols. Some users might find it less straightforward than Ingress, but I believe these issues will be ironed out.
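The two routing features mentioned here, weights and header-based matches, can be sketched as HTTPRoute rules. The helper below is a hypothetical illustration, not code from this PR: the Service names, the `x-ray-cluster` header, and the 90/10 split are assumptions. The `weight` field on `backendRefs` and the `headers` match with `type: Exact` are part of the Gateway API HTTPRoute schema.

```python
def canary_rules(old_service, new_service, port, new_weight,
                 header_name="x-ray-cluster", header_value="new"):
    """Return HTTPRoute rules: a header override plus a weighted default split.

    new_weight is the percentage (0-100) of default traffic sent to new_service.
    """
    if not 0 <= new_weight <= 100:
        raise ValueError("new_weight must be between 0 and 100")

    # Requests carrying the header always go to the new backend.
    header_rule = {
        "matches": [
            {"headers": [{"type": "Exact", "name": header_name, "value": header_value}]}
        ],
        "backendRefs": [{"name": new_service, "port": port}],
    }

    # Everything else is split between the two backends in proportion to weight.
    weighted_rule = {
        "backendRefs": [
            {"name": old_service, "port": port, "weight": 100 - new_weight},
            {"name": new_service, "port": port, "weight": new_weight},
        ],
    }
    return [header_rule, weighted_rule]


# Hypothetical Service names; send 10% of default traffic to the new backend.
rules = canary_rules("serve-svc-old", "serve-svc-new", 8000, new_weight=10)
```

A rule without `matches` applies to all paths, so the weighted rule acts as the default, while the more specific header rule takes precedence for requests that carry the header.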

ravishtiwari avatar Apr 08 '24 10:04 ravishtiwari

lgtm

aojea avatar Apr 08 '24 15:04 aojea

Last time we met, you expressed interest in some limitations of the current Services and Endpoints. Could you provide steps, using an existing example, to reproduce the problem so I can understand it better?

@aojea thank you for the questions!

  • Current implementation (both old / new RayCluster use a K8s service)
    • Currently, the RayService CRD supports zero-downtime upgrades. During the upgrade process, Ray creates a new RayCluster CR and switches the traffic to the new RayCluster CR after the Ray Serve applications on the new cluster are ready to serve requests.
    • This solution requires twice the computational resources at its peak. Hence, we want to support incremental upgrade.
  • Incremental upgrade
    • We want the K8s Service to achieve the following:
      • In Ray Serve, each Ray Pod has at most one proxy actor. We want all proxy actors in the same RayCluster to serve an equal share of requests.
      • We can attach both the old and new RayCluster CRs to the same Kubernetes Service while dividing the endpoints into two sets based on their associated RayCluster. This would let us control how many requests are directed to each RayCluster CR.
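The arithmetic behind these two goals can be sketched as follows. This is a simplified model, not KubeRay code: it assumes one ready proxy per Pod, an approximately uniform random spread across the ready endpoints of a single Service, and, for the weighted case, a cluster-level split such as the one Gateway route weights provide. The function names are hypothetical.

```python
from fractions import Fraction


def shared_service_split(old_pods, new_pods):
    """Traffic share per cluster when all Pods sit behind one Service.

    With a uniform spread over endpoints, each cluster's share is fixed by
    its Pod count and cannot be tuned independently.
    """
    if old_pods <= 0 or new_pods <= 0:
        raise ValueError("both clusters need at least one ready Pod")
    total = old_pods + new_pods
    return Fraction(old_pods, total), Fraction(new_pods, total)


def weighted_split(old_weight, new_weight, old_pods, new_pods):
    """Traffic share per cluster and per proxy under a weighted split.

    The weights decide the cluster-level ratio; inside each cluster the
    proxies still share that cluster's traffic equally.
    """
    if old_pods <= 0 or new_pods <= 0:
        raise ValueError("both clusters need at least one ready Pod")
    total = old_weight + new_weight
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    old_share = Fraction(old_weight, total)
    new_share = Fraction(new_weight, total)
    return {
        "old_cluster": old_share,
        "new_cluster": new_share,
        "old_per_proxy": old_share / old_pods,
        "new_per_proxy": new_share / new_pods,
    }


# One Service, 3 old Pods and 1 new Pod: the new cluster receives 1/4 of requests.
uniform = shared_service_split(3, 1)

# A 90/10 weighted split with 4 old Pods and 2 new Pods.
weighted = weighted_split(90, 10, old_pods=4, new_pods=2)
```

In the single-Service case the new cluster's share can only change by adding or removing Pods, which is why an incremental upgrade needs a separate control over the ratio between the two sets of endpoints.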

kevin85421 avatar Apr 08 '24 17:04 kevin85421

Sorry, I haven't had time to set up the Gateway manually in the past week. This PR contains only YAML changes, so it should be safe to merge. @ravishtiwari, would you be interested in contributing documentation to the Ray repository on how to run this example https://docs.ray.io/en/latest/cluster/kubernetes/examples.html step by step? Thanks!

Sure @kevin85421 - I can do it this week, will send you another PR for documentation updates.

ravishtiwari avatar Apr 09 '24 04:04 ravishtiwari