kuberay
[Bug] Autoscaler rounds up the amount of CPU resources a pod has.
Search before asking
- [X] I searched the issues and found no similar issues.
KubeRay Component
ray-operator
What happened + What you expected to happen
I used the example given at https://raw.githubusercontent.com/ray-project/kuberay/release-0.5/ray-operator/config/samples/ray-cluster.autoscaler.yaml, but changed the pods' resources (both request and limit) to 500m CPU.
When testing the autoscaler with "import ray; ray.init(); ray.autoscaler.sdk.request_resources(num_cpus=4)", it added 2 pods (4 pods in total) instead of the 6 pods (8 pods in total) I expected. After looking at the code, I saw that the number of CPUs is rounded up: https://github.com/ray-project/kuberay/blob/803374eeabf6688c9b241ded0d589de6f9af8078/ray-operator/controllers/ray/common/pod.go#LL743C67-L743C67 .
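The arithmetic behind the expected and observed pod counts can be sketched as follows. This is only an illustration of my understanding, not the operator's code: the helper names (parse_cpu_quantity, pods_needed) are hypothetical, and it assumes the cluster starts with 2 pods and that each pod's CPU count is rounded up to the next integer before the autoscaler sees it.

```python
import math


def parse_cpu_quantity(quantity: str) -> float:
    """Convert a Kubernetes CPU quantity such as '500m' or '2' to cores.

    Hypothetical helper for illustration; only handles plain and milli values.
    """
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def pods_needed(requested_cpus: float, cpus_per_pod: float) -> int:
    """Number of pods required to cover the requested CPUs."""
    return math.ceil(requested_cpus / cpus_per_pod)


actual_cpu = parse_cpu_quantity("500m")  # 0.5 cores, as set in the pod spec
advertised_cpu = math.ceil(actual_cpu)   # 1 core, assuming round-up behavior

initial_pods = 2  # assumption: inferred from "added 2 pods (4 in total)"

expected_total = pods_needed(4, actual_cpu)      # 8 pods at 0.5 CPU each
observed_total = pods_needed(4, advertised_cpu)  # 4 pods at 1 CPU each

# Pods added on top of the initial ones: expected vs. observed.
print(expected_total - initial_pods, observed_total - initial_pods)  # prints "6 2"
```

With 0.5 CPU per pod the request for 4 CPUs needs 8 pods, but if each pod is reported as having 1 CPU, the autoscaler considers 4 pods sufficient, which matches what I observed.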
If I try to set the num-cpus parameter to either '500m' or '0.5', the cluster fails to start with: Error: Invalid value for '--num-cpus': '500m' is not a valid integer.
Are all pods of an autoscaling Ray cluster expected to have an integer number of CPUs? Is there another way I can tell the Ray autoscaler that each pod has a fractional number of CPUs?
Reproduction script
Use https://raw.githubusercontent.com/ray-project/kuberay/release-0.5/ray-operator/config/samples/ray-cluster.autoscaler.yaml, with the pods' resources (both request and limit) changed to 500m CPU.
export HEAD_POD=$(kubectl get pods --selector=ray.io/node-type=head -o custom-columns=POD:metadata.name --no-headers)
kubectl exec $HEAD_POD -it -c ray-head -- python -c "import ray; ray.init(); ray.autoscaler.sdk.request_resources(num_cpus=4)"
Spawns 2 new pods, not 6.
Anything else
Is there another way I can tell the Ray autoscaler that each pod has a fractional number of CPUs? If not, adding a note about this to your autoscaling documentation would be great.
Are you willing to submit a PR?
- [ ] Yes I am willing to submit a PR!