
Support Apache YuniKorn as one batch scheduler option

Open yangwwei opened this issue 1 year ago • 2 comments

Why are these changes needed?

Apache YuniKorn is a widely used batch scheduler for Kubernetes. This PR adds support for Apache YuniKorn as an option for scheduling Ray workloads.

The integration is very simple: Apache YuniKorn doesn't require any CR to be created. The change in the job controller code automatically injects the required labels into Ray pods; only two extra labels are needed:

  • yunikorn.apache.org/application-id
  • yunikorn.apache.org/queue-name

When all pods have the above labels, the yunikorn scheduler will automatically recognize that these pods belong to the same Ray application and schedule them in the given queue. The Ray workload can then benefit from all the batch scheduling features yunikorn provides: https://yunikorn.apache.org/docs/next/get_started/core_features
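For illustration, a minimal sketch of the pod metadata this produces (the pod name, application ID, and queue name below are hypothetical values, not taken from the PR):

apiVersion: v1
kind: Pod
metadata:
  name: raycluster-example-head        # hypothetical pod name
  labels:
    # injected by the KubeRay controller when yunikorn is selected as the scheduler
    yunikorn.apache.org/application-id: my-ray-app-0001
    yunikorn.apache.org/queue-name: root.default
spec:
  containers:
  - name: ray-head
    image: rayproject/ray:2.7.0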

Related issue number

https://github.com/ray-project/kuberay/issues/1457

Checks

  • [ ] I've made sure the tests are passing.
  • Testing Strategy
    • [x] Unit tests
    • [x] Manual tests
    • [ ] This PR is not tested :(

yangwwei avatar Jun 09 '24 22:06 yangwwei

Hi @yangwwei, thank you for the PR! Are you in the Ray Slack workspace? My Slack handle is "Kai-Hsun Chen (ray team)". We can have a quick sync on Slack to discuss how the KubeRay/Ray community works (e.g., how to propose a new enhancement).

kevin85421 avatar Jun 10 '24 16:06 kevin85421

@kevin85421 please see proposal: https://github.com/ray-project/enhancements/pull/53

yangwwei avatar Jun 17 '24 06:06 yangwwei

Hi @yangwwei, I plan to review this PR next week because the REP has already been merged. Is this PR ready for review? I see it is still marked as a draft.

kevin85421 avatar Jul 07 '24 22:07 kevin85421

Hi @kevin85421, can you help review this PR please? Thanks!

yangwwei avatar Jul 17 '24 16:07 yangwwei

I will review the PR tmr. Thanks!

kevin85421 avatar Jul 18 '24 23:07 kevin85421

Could you (1) fix the CI lint error (install the pre-commit hooks) and (2) add some instructions to the PR description about how you manually tested it with Yunikorn? I will also try it manually. Thanks!

kevin85421 avatar Jul 22 '24 07:07 kevin85421

Prerequisites:

  • a local Kind cluster (or a real k8s cluster)
  • ray-operator image built
  • comment out this line to work around this issue:

Install kuberay

The docker image needs to be pushed to the kind registry first
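For example (a sketch; the registry address kind-registry:5000 and the image name/tag are assumptions for a local Kind setup):

docker tag kuberay/operator:latest kind-registry:5000/kuberay:v1   # hypothetical local image name
docker push kind-registry:5000/kuberay:v1

Then install the operator, pointing it at that image: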

helm install kuberay-operator kuberay/kuberay-operator \
   --version 1.0.0 --set batchScheduler.enabled=true \
   --set image.repository=kind-registry:5000/kuberay --set image.tag=v1

the log should mention the batch scheduler is enabled:

{"level":"info","ts":"2024-07-23T22:26:41.945Z","logger":"setup","msg":"Feature flag enable-batch-scheduler is enabled."}

Install yunikorn

Doc: https://yunikorn.apache.org/docs/#install. Note: I reduced the memory requests to fit my local env.

helm repo add yunikorn https://apache.github.io/yunikorn-release
helm repo update
kubectl create namespace yunikorn
helm install yunikorn yunikorn/yunikorn --namespace yunikorn --set resources.requests.memory=200M --set web.resources.requests.memory=50M
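Before moving on, it may help to confirm the scheduler is up (a sketch):

kubectl get pods -n yunikorn
# expect the yunikorn scheduler (and admission controller) pods in Running state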

Test

Run a simple Ray cluster; this is what I was using:

apiVersion: ray.io/v1
kind: RayCluster
metadata:
  annotations:
    meta.helm.sh/release-name: raycluster
    meta.helm.sh/release-namespace: default
  creationTimestamp: "2024-01-12T19:14:07Z"
  generation: 1
  labels:
    app.kubernetes.io/instance: raycluster
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: kuberay
    helm.sh/chart: ray-cluster-1.0.0
    ray.io/scheduler-name: yunikorn
    yunikorn.apache.org/application-id: my-ray-cluster-0001
  name: raycluster-kuberay
  namespace: default
spec:
  headGroupSpec:
    rayStartParams:
      dashboard-host: 0.0.0.0
    serviceType: ClusterIP
    template:
      metadata:
        labels:
          app.kubernetes.io/instance: raycluster
          app.kubernetes.io/managed-by: Helm
          app.kubernetes.io/name: kuberay
          helm.sh/chart: ray-cluster-1.0.0
      spec:
        containers:
        - env: []
          image: rayproject/ray:2.7.0
          imagePullPolicy: IfNotPresent
          name: ray-head
          resources:
            limits:
              cpu: "1"
            requests:
              cpu: "1"
          volumeMounts:
          - mountPath: /tmp/ray
            name: log-volume
        tolerations:
        - effect: NoSchedule
          key: kwok.x-k8s.io/node
          operator: Equal
          value: fake
        volumes:
        - emptyDir: {}
          name: log-volume
  workerGroupSpecs:
  - groupName: workergroup
    rayStartParams: {}
    maxReplicas: 2147483647
    minReplicas: 0
    replicas: 1
    template:
      metadata:
        labels:
          app.kubernetes.io/instance: raycluster
          app.kubernetes.io/managed-by: Helm
          app.kubernetes.io/name: kuberay
          helm.sh/chart: ray-cluster-1.0.0
      spec:
        containers:
        - env: []
          image: rayproject/ray:2.7.0
          imagePullPolicy: IfNotPresent
          name: ray-worker
          resources:
            limits:
              cpu: "1"
            requests:
              cpu: "1"
          volumeMounts:
          - mountPath: /tmp/ray
            name: log-volume
        tolerations:
        - effect: NoSchedule
          key: kwok.x-k8s.io/node
          operator: Equal
          value: fake
        volumes:
        - emptyDir: {}
          name: log-volume
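Apply it with kubectl (the filename here is an assumption):

kubectl apply -f raycluster.yaml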

Once applied, we should see the pods being scheduled by yunikorn. Verify this by describing the head and worker pods; you'll see events like the following:

  Type    Reason             Age   From      Message
  ----    ------             ----  ----      -------
  Normal  Scheduling         14s   yunikorn  default/raycluster-kuberay-head-tvtn4 is queued and waiting for allocation
  Normal  Scheduled          14s   yunikorn  Successfully assigned default/raycluster-kuberay-head-tvtn4 to node kind-worker
  Normal  PodBindSuccessful  14s   yunikorn  Pod default/raycluster-kuberay-head-tvtn4 is successfully bound to node kind-worker
  Normal  Pulling            14s   kubelet   Pulling image "rayproject/ray:2.7.0"

yangwwei avatar Jul 23 '24 22:07 yangwwei

Could you fix the lint error? You can refer to this doc to install pre-commit: https://github.com/ray-project/kuberay/blob/master/ray-operator/DEVELOPMENT.md
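For reference, installing and running the hooks typically looks like this (a sketch; see the linked doc for the authoritative steps):

pip install pre-commit
pre-commit install                 # register the git hooks
pre-commit run --all-files         # optionally run all hooks over the repo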

kevin85421 avatar Jul 24 '24 06:07 kevin85421

Btw, I hope to include this PR in v1.2.0. I will do the branch cut next week.

kevin85421 avatar Jul 24 '24 06:07 kevin85421

I tested it manually, and I can see events from yunikorn in both head and worker Pods. I am wondering what the scheduling strategy (e.g. gang scheduling) is in this example.

[Screenshots: yunikorn scheduling events on the head and worker Pods]

kevin85421 avatar Jul 24 '24 19:07 kevin85421

I tested it manually, and I can see events from yunikorn in both head and worker Pods. I am wondering what the scheduling strategy (e.g. gang scheduling) is in this example.

Gang scheduling support is not included in this PR yet; I will work on that after this gets merged. I intend to keep the PRs small for easier review.

I remember you mentioning that if we enable the batch scheduler without installing the Volcano CRD, it will report an error. Have we resolved this issue?

Yes, that's still an issue. I will work on another PR with the proposed solution.

yangwwei avatar Jul 24 '24 19:07 yangwwei