[Bug] Reconciler error after changing nameOverride in values.yaml of the ray-cluster Helm chart
Search before asking
- [X] I searched the issues and found no similar issues.
KubeRay Component
ray-operator
What happened + What you expected to happen
I tried to install a Ray cluster using Helm. In the values.yaml of the ray-cluster chart, I changed the value of nameOverride from "kuberay" to "mykuberay". The kuberay-operator pod then logged the following:
2023-12-01T09:18:50.777Z INFO controllers.RayCluster Read request instance not found error! {"name": "kuberay-dev/raycluster-kuberay"}
2023-12-01T09:19:39.202Z INFO controllers.RayCluster reconciling RayCluster {"cluster name": "raycluster-mykuberay"}
2023-12-01T09:19:39.202Z INFO controllers.RayCluster Reconciling Ingress
2023-12-01T09:19:39.202Z INFO controllers.RayCluster reconcileHeadService {"1 head service found": "raycluster-mykuberay-head-svc"}
2023-12-01T09:19:39.202Z INFO controllers.RayCluster reconcilePods {"Found 1 head Pod": "raycluster-mykuberay-head-qd9b8", "Pod status": "Running", "Pod restart policy": "Always", "Ray container terminated status": "nil"}
2023-12-01T09:19:39.202Z INFO controllers.RayCluster reconcilePods {"head Pod": "raycluster-mykuberay-head-qd9b8", "shouldDelete": false, "reason": "KubeRay does not need to delete the head Pod raycluster-mykuberay-head-qd9b8. The Pod status is Running, and the Ray container terminated status is nil."}
2023-12-01T09:19:39.202Z INFO controllers.RayCluster reconcilePods {"desired workerReplicas (always adhering to minReplicas/maxReplica)": 0, "worker group": "cpuGroup", "maxReplicas": 3, "minReplicas": 0, "replicas": 0}
2023-12-01T09:19:39.202Z INFO controllers.RayCluster reconcilePods {"removing the pods in the scaleStrategy of": "cpuGroup"}
2023-12-01T09:19:39.202Z INFO controllers.RayCluster reconcilePods {"workerReplicas": 0, "runningPods": 0, "diff": 0}
2023-12-01T09:19:39.202Z INFO controllers.RayCluster reconcilePods {"all workers already exist for group": "cpuGroup"}
2023-12-01T09:19:39.203Z INFO controllers.RayCluster Got error when calculating new status {"cluster name": "raycluster-mykuberay", "error": "unable to find head service. cluster name raycluster-mykuberay, filter labels map[app.kubernetes.io/created-by:kuberay-operator app.kubernetes.io/name:kuberay ray.io/cluster:raycluster-mykuberay ray.io/identifier:raycluster-mykuberay-head ray.io/node-type:head]"}
2023-12-01T09:19:39.203Z ERROR controller.raycluster-controller Reconciler error {"reconciler group": "ray.io", "reconciler kind": "RayCluster", "name": "raycluster-mykuberay", "namespace": "kuberay-dev", "error": "unable to find head service. cluster name raycluster-mykuberay, filter labels map[app.kubernetes.io/created-by:kuberay-operator app.kubernetes.io/name:kuberay ray.io/cluster:raycluster-mykuberay ray.io/identifier:raycluster-mykuberay-head ray.io/node-type:head]"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/opt/app-root/src/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/opt/app-root/src/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227
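One possible reading of the error, sketched in Python: the filter in the log still contains app.kubernetes.io/name:kuberay even though nameOverride is now "mykuberay". The filter labels below are copied from the error message. The labels on the head service are an assumption (I have not dumped them from the cluster), and `matches` and `mismatched_keys` are hypothetical helpers that only imitate equality-based label selection. This is not the operator's actual code.

```python
def matches(selector: dict[str, str], labels: dict[str, str]) -> bool:
    """Equality-based selection: every selector pair must appear in labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def mismatched_keys(selector: dict[str, str], labels: dict[str, str]) -> list[str]:
    """Return the selector keys whose value differs from (or is missing in) labels."""
    return sorted(key for key, value in selector.items() if labels.get(key) != value)


# Filter labels copied from the "unable to find head service" error message.
operator_filter = {
    "app.kubernetes.io/created-by": "kuberay-operator",
    "app.kubernetes.io/name": "kuberay",
    "ray.io/cluster": "raycluster-mykuberay",
    "ray.io/identifier": "raycluster-mykuberay-head",
    "ray.io/node-type": "head",
}

# ASSUMPTION: with nameOverride set to "mykuberay", the head service carries
# app.kubernetes.io/name=mykuberay; all other labels are assumed unchanged.
assumed_service_labels = {**operator_filter, "app.kubernetes.io/name": "mykuberay"}

if __name__ == "__main__":
    print("service selected:", matches(operator_filter, assumed_service_labels))
    print("mismatched keys:", mismatched_keys(operator_filter, assumed_service_labels))
```

If that assumption holds, a single differing label is enough for the lookup to return nothing, which would explain why reconcileHeadService finds the service by name while the status calculation does not. Comparing against the output of `kubectl get svc raycluster-mykuberay-head-svc -n kuberay-dev --show-labels` would confirm or rule this out.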
Reproduction script
The default value of nameOverride is defined here: https://github.com/ray-project/kuberay-helm/blob/07463a11e78d934f850f0b4ab20cf3b17803b86c/helm-chart/ray-cluster/values.yaml#L13
I changed this value to another one and then got the error log above in kuberay-operator. Please advise.
Anything else
No response
Are you willing to submit a PR?
- [ ] Yes I am willing to submit a PR!
Hi @kevin85421 - Is this currently being worked on? We are running into the same error: we deploy with Helm and also change the nameOverride value. If nobody is working on it, I'm willing to look into it and submit a PR. Thanks!
@chrisxstyles, thank you! Your PR is welcome!