kuberay icon indicating copy to clipboard operation
kuberay copied to clipboard

[Bug] HeadGroupSpec.Replicas is unused

Open kevin85421 opened this issue 3 years ago • 7 comments
trafficstars

Search before asking

  • [X] I searched the issues and found no similar issues.

KubeRay Component

ray-operator

What happened + What you expected to happen

This bug is found by #605. When I updated the value of headGroupSpec.replicas from 1 to 2. It breaks this rule. The number of workers is determined by worker.Replicas (code), but HeadGroupSpec.Replicas is unused in KubeRay.

Possible solutions: (1) Remove the HeadGroupSpec.Replicas field. (2) Create head pods based on HeadGroupSpec.Replicas.

Reproduction script

  • Updated the value of headGroupSpec.replicas in your YAML file from 1 to 2.

Anything else

No response

Are you willing to submit a PR?

  • [X] Yes I am willing to submit a PR!

kevin85421 avatar Oct 04 '22 00:10 kevin85421

cc @DmitriGekhtman

kevin85421 avatar Oct 04 '22 00:10 kevin85421

Thank you for tracking this! See also https://github.com/ray-project/kuberay/pull/362, which made the field optional.

I think we should remove the field in the next RayCluster API version.

DmitriGekhtman avatar Oct 04 '22 01:10 DmitriGekhtman

So the solution is bullet (2) in your proposal. I believe I've already removed the field in all example docs and documented the field as deprecated in the raycluster_types.go.

DmitriGekhtman avatar Oct 04 '22 01:10 DmitriGekhtman

Oh, actually I think I left it in the examples for compatibility with KubeRay 0.2.0. But now that we've released KubeRay 0.3.0, we can remove the field from all examples! @kevin85421 would you mind doing the honors of hunting down and deleting all instances of HeadGroupSpec.replicas in example yamls?

DmitriGekhtman avatar Oct 04 '22 01:10 DmitriGekhtman

Thank @DmitriGekhtman for the reply! Why do not we remove HeadGroupSpec.Replicas from raycluster_types.go? I tried to remove the field from raycluster_types.go, and it can still keep backward compatibility.

# Step 1: Remove  `HeadGroupSpec.Replicas` from `raycluster_types.go`
# Step 2: Update CRD / generated API
make sync

# Step 3: zz_generated_deepcopy.go
make generate 

# Step4: Build docker image
make docker-image

# Step 5: Install new KubeRay operator (controller:latest) and new CRD
# Step 6: Install RayCluster with `HeadGroupSpec.Replicas`

# Work well.

kevin85421 avatar Oct 04 '22 18:10 kevin85421

We have an issue to track it and we need to do some clean up on the api.

Jeffwan avatar Oct 04 '22 21:10 Jeffwan

Related issue: #368

kevin85421 avatar Oct 05 '22 21:10 kevin85421