
[Feature] [Helm] KubeRay Helm chart support multiple worker group defs

Open calizarr opened this issue 3 years ago • 11 comments

Why are these changes needed?

The KubeRay CRD allows for multiple worker groups, however, as it stands creating more worker groups is a manual process.

This PR addresses that by allowing worker groups to be defined via maps in values.yaml.

It also moves a lot of boilerplate into _ray-cluster.tpl. In addition, it allows running without the in-tree autoscaler and setting resources for all possible containers.
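
For illustration, worker groups declared as maps in values.yaml might look roughly like this (a sketch only; the key names and layout are illustrative, not necessarily the exact schema this PR introduces):

workerGroups:
  small-group:            # one entry per worker group; the map key becomes the group name
    replicas: 2
    minReplicas: 1
    maxReplicas: 5
    resources:
      requests:
        cpu: 1
        memory: 2Gi
      limits:
        cpu: 1
        memory: 2Gi
  large-group:
    replicas: 1
    minReplicas: 0
    maxReplicas: 3
    resources:
      requests:
        cpu: 4
        memory: 8Gi
      limits:
        cpu: 4
        memory: 8Gi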

Related issue number

Possible solution for #255

Checks

  • [x] I've made sure the tests are passing.
  • Testing Strategy
    • [ ] Unit tests
    • [x] Manual tests
    • [ ] This PR is not tested :(

calizarr avatar May 17 '22 14:05 calizarr

@calizarr a quick question.

On line 41 of values.yaml you reference head.initArgs.num_cpus: 0 to make sure the head node has no compute resources attributed to the cluster.

However, on the following lines you also set head.resources.cpu.requests: 0 and head.resources.cpu.limits: 0, and state that this is so the Ray cluster doesn't have resources. Are these fields used by Ray to attribute resources to the Ray cluster, or are they used to tell the Kubernetes autoscaler what resources the given pod requires?

In my example, I have always set num_cpus: 0; however, I have set cpu.requests and cpu.limits (and the same for memory) to a reasonable requirement for running the head node.

This results in a head node pod that does not show any CPU resources in the Ray cluster.

head:
  groupName: headgroup
  replicas: 1
  type: head
  labels:
    key: value
  initArgs:
    port: '6379'
    redis-password: 'LetMeInRay' # Deprecated since Ray 1.11 due to GCS bootstrapping enabled
    dashboard-host: '0.0.0.0'
    num-cpus: '0' # can be auto-completed from the limits
    node-ip-address: $MY_POD_IP # auto-completed as the head pod IP
    block: 'true'
  containerEnv:
    - name: MY_POD_IP
      valueFrom:
        fieldRef:
          fieldPath: status.podIP
  envFrom: []
    # - secretRef:
    #     name: my-env-secret
  resources:
    limits:
      cpu: 3
      memory: 3064Mi
    requests:
      cpu: 3
      memory: 3064Mi
  annotations: {}
  nodeSelector: {"agentpool": "agentpool"}
  tolerations: []
  affinity: {}
  volumes:
    - name: log-volume
      emptyDir: { }
  volumeMounts:
    - mountPath: /tmp/ray
      name: log-volume

ecm200 avatar May 18 '22 09:05 ecm200

So these are, it seems, Ray-only environment variables: https://github.com/ray-project/kuberay/pull/266/files#diff-e47b74595d2e73c27d9519ea8a79afc13d38a2bfa2da5e196756ca6e89037d27R2-R5

So that's what those two set, whereas the others are for the actual k8s request. If k8s receives a request/limit of 0, it won't give that Pod any resources at all and it will never launch.

https://github.com/ray-project/kuberay/pull/266/files#diff-ad1dc7aa5ec069f4705e1ba0794ebce1a8c6b6b0f1e40d3a53c8a361456127c7R60-R66
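
In other words, the two knobs end up in different places. A minimal sketch (values purely illustrative) of how they sit side by side for the head node:

  head:
    initArgs:
      num-cpus: '0'        # Ray-level: the head advertises zero CPUs to the Ray scheduler
    resources:             # Kubernetes-level: what the kube-scheduler reserves for the pod
      requests:
        cpu: 1
        memory: 2Gi
      limits:
        cpu: 1
        memory: 2Gi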

I do seem to have used num_cpus twice though...

calizarr avatar May 18 '22 15:05 calizarr

@calizarr ok, so my configuration settings appear to make sense in that context then. The num-cpus argument is given to the autoscaler on execution of the head node, and the others are the resources required to run the pod itself, which are therefore communicated to the Kubernetes system for node allocation.

ecm200 avatar May 18 '22 15:05 ecm200

@ecm200, I've hopefully simplified it; do you mind taking a look?

calizarr avatar May 18 '22 15:05 calizarr

cc @zhuangzhuang131419 (original helm chart author) for thoughts

DmitriGekhtman avatar May 19 '22 01:05 DmitriGekhtman

I think it looks good.

Just to be completely clear, does this implementation decouple the num-cpus specification from the cpu.limits and cpu.requests for the head node?

Looking at the implementation, I wasn't entirely sure that was the case. As I said previously, my understanding is that you want to be able to specify the requests and limits whilst setting num-cpus to 0, so that you give the head node adequate resources to run well but don't make any of them available to the Ray cluster as compute resources.

ecm200 avatar May 19 '22 15:05 ecm200

@DmitriGekhtman on the face of this submission, I think this is a much nicer implementation than my #264 PR, so I am happy to close that one in deference to this?

ecm200 avatar May 19 '22 15:05 ecm200

@ecm200, so the ray.head.num_cpus is used here for the Ray Cluster: https://github.com/ray-project/kuberay/blob/0ae90b6b3f89b2c988af40e06083649811c61ac6/helm-chart/ray-cluster/templates/_ray-cluster.tpl

- name: CPU_REQUEST
  value: {{ .Values.ray.head.num_cpus }}
- name: CPU_LIMITS
  value: {{ .Values.ray.head.num_cpus }}

and here as a CLI flag passed in using a yaml anchor, https://github.com/ray-project/kuberay/blob/0ae90b6b3f89b2c988af40e06083649811c61ac6/helm-chart/ray-cluster/values.yaml

  head:
    replicas: 1
    num_cpus: &headCPU 0 # Leave this at 0 if the head node should not do work.
    groupName: headgroup
    type: head
    dashboardPort: &dashPort 8265
    initArgs:
      port: '6379'
      dashboard-host: '0.0.0.0'
      dashboard-port: *dashPort
      num-cpus: *headCPU  # can be auto-completed from the limits (leave at 0 for no work done on head node)
      node-ip-address: $MY_POD_IP # auto-completed as the head pod IP
      block: 'true'

In the same file, under ray.head.resources, you set the actual resources you want for the head node on Kubernetes.
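
Roughly, that section looks something like this (the numbers here are placeholders, not the chart defaults):

  head:
    resources:
      requests:
        cpu: 1         # placeholder values; set whatever the head pod actually needs
        memory: 2G
      limits:
        cpu: 1
        memory: 2G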

calizarr avatar May 19 '22 16:05 calizarr

I still need to take a closer look.

After the changes, (1) Is the same default configuration generated as before? (2) Is an old values.yaml compatible with the new chart?

(1) is probably important. Not sure about (2) -- I don't know if we've worked out the versioning story.

DmitriGekhtman avatar May 19 '22 16:05 DmitriGekhtman

(1) I can definitely figure out whether the default configuration can be the same as the one generated before, maybe with some additions, but some if statements can handle that.

(2) It's... distinct enough that this might not be possible, though I can take a stab at it.
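
The kind of if statement mentioned under (1) would look roughly like this in _ray-cluster.tpl (a sketch only; the ray-cluster.workergroup helper and the worker/workerGroups keys are hypothetical names used to show the shape):

{{- /* hypothetical: render the old flat worker block if present, plus any map-based groups */ -}}
{{- if .Values.worker }}
{{- include "ray-cluster.workergroup" (dict "name" .Values.worker.groupName "spec" .Values.worker) }}
{{- end }}
{{- range $name, $spec := .Values.workerGroups }}
{{- include "ray-cluster.workergroup" (dict "name" $name "spec" $spec) }}
{{- end }}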

calizarr avatar May 19 '22 17:05 calizarr

I have closed #264 since this PR encompasses that functionality plus more general refactoring.

ecm200 avatar May 24 '22 09:05 ecm200

@calizarr thanks for the contribution!

Also, sorry about the delay getting back to you on this.

We ultimately decided to go with the implementation in https://github.com/ray-project/kuberay/pull/451. Thus, I'll close this PR.

DmitriGekhtman avatar Sep 12 '22 19:09 DmitriGekhtman