kuberay
kuberay copied to clipboard
[Bug] Fix RayJob with an overridden app.kubernetes.io/name (#2147)
Why are these changes needed?
The current client.MatchingLabels(common.HeadServiceLabels(*instance)) includes the app.kubernetes.io/name and app.kubernetes.io/created-by labels with their default values to find the Head service of a RayCluster. These default values make the controller fail to find the Head service if they are overridden. We should not rely on labels that can be overridden by users to find the Head service.
This PR replaces the current behavior with a new association function RayClusterHeadServiceListOptions that only uses the following labels to find the Head service.
utils.RayClusterLabelKey: instance.Name,
utils.RayNodeTypeLabelKey: string(rayv1.HeadNode),
utils.RayIDLabelKey: utils.CheckLabel(utils.GenerateIdentifier(instance.Name, rayv1.HeadNode)),
The PR also adds a new e2e test ray-job.custom-k8s-app.yaml showing that it works and fixes the #2147.
Related issue number
#2147
Checks
- [x] I've made sure the tests are passing.
- Testing Strategy
- [x] Unit tests
- [x] Manual tests
- [ ] This PR is not tested :(