
[Bug] Fix RayJob with an overridden app.kubernetes.io/name (#2147)

Open rueian opened this issue 1 year ago • 0 comments

Why are these changes needed?

The controller currently finds the Head service of a RayCluster with client.MatchingLabels(common.HeadServiceLabels(*instance)), which includes the app.kubernetes.io/name and app.kubernetes.io/created-by labels set to their default values. If a user overrides either label, the selector no longer matches and the controller fails to find the Head service. We should not rely on labels that users can override to find the Head service.

This PR replaces the current behavior with a new association function, RayClusterHeadServiceListOptions, which uses only the following labels to find the Head service:

utils.RayClusterLabelKey:  instance.Name,
utils.RayNodeTypeLabelKey: string(rayv1.HeadNode),
utils.RayIDLabelKey:       utils.CheckLabel(utils.GenerateIdentifier(instance.Name, rayv1.HeadNode)),
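
To illustrate why dropping the overridable labels fixes the lookup, here is a minimal, self-contained sketch of equality-based label matching. It is not the KubeRay implementation: headServiceSelector and matches are hypothetical helpers, the label key strings and default values (ray.io/cluster, ray.io/node-type, ray.io/identifier, kuberay, kuberay-operator) and the "<cluster>-head" identifier format are assumptions about what the utils constants resolve to, and the sanitizing done by utils.CheckLabel is omitted.

```go
package main

import "fmt"

// Assumed string values for utils.RayClusterLabelKey, utils.RayNodeTypeLabelKey,
// utils.RayIDLabelKey and rayv1.HeadNode; check the KubeRay source for the real ones.
const (
	rayClusterLabelKey  = "ray.io/cluster"
	rayNodeTypeLabelKey = "ray.io/node-type"
	rayIDLabelKey       = "ray.io/identifier"
	headNode            = "head"
)

// headServiceSelector is a hypothetical stand-in for the label set built by
// RayClusterHeadServiceListOptions. The identifier format is an assumption,
// and the utils.CheckLabel sanitizing step is left out for brevity.
func headServiceSelector(clusterName string) map[string]string {
	return map[string]string{
		rayClusterLabelKey:  clusterName,
		rayNodeTypeLabelKey: headNode,
		rayIDLabelKey:       fmt.Sprintf("%s-%s", clusterName, headNode),
	}
}

// matches reports whether every key/value pair in selector is present in
// labels, which is how an equality-based label selector behaves.
func matches(selector, labels map[string]string) bool {
	for k, v := range selector {
		if got, ok := labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

func main() {
	// Labels on a Head service whose app.kubernetes.io/* labels were overridden.
	svcLabels := map[string]string{
		rayClusterLabelKey:             "demo",
		rayNodeTypeLabelKey:            headNode,
		rayIDLabelKey:                  "demo-head",
		"app.kubernetes.io/name":       "my-custom-app",
		"app.kubernetes.io/created-by": "my-team",
	}

	// Old behavior: the selector also pins the (assumed) default values.
	oldSelector := headServiceSelector("demo")
	oldSelector["app.kubernetes.io/name"] = "kuberay"
	oldSelector["app.kubernetes.io/created-by"] = "kuberay-operator"

	fmt.Println("old selector matches:", matches(oldSelector, svcLabels))
	fmt.Println("new selector matches:", matches(headServiceSelector("demo"), svcLabels))
}
```

In this sketch the old selector fails to match the service because the overridden app.kubernetes.io/name value differs from the pinned default, while the three-label selector still matches.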

The PR also adds a new e2e test, ray-job.custom-k8s-app.yaml, which shows that the change works and fixes #2147.

Related issue number

#2147

Checks

  • [x] I've made sure the tests are passing.
  • Testing Strategy
    • [x] Unit tests
    • [x] Manual tests
    • [ ] This PR is not tested :(

rueian, May 24 '24 14:05