🐛 Fix the misleading description of ControlPlaneInitializedCondition

Open 13164815445 opened this issue 3 years ago • 10 comments

…ncileControlPlane

What this PR does / why we need it:

The Cluster's ControlPlaneInitializedCondition is described as reporting that the cluster's control plane has been initialized such that the cluster's apiserver is reachable and at least one control plane Machine has a node reference. However, when ControlPlaneRef is not nil, we cannot determine whether a control plane Machine exists; we can only rely on the ControlPlaneInitializedCondition of the control plane object. I therefore think the description of the Cluster's ControlPlaneInitializedCondition needs to be changed.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged): Fixes #4936

13164815445 avatar Aug 02 '22 03:08 13164815445

CLA Signed

The committers listed above are authorized under a signed CLA.

  • :white_check_mark: login: 13164815445 / name: luyu (42d1bb7307719043cd2f8229b4482272a3bef98b)

Welcome @13164815445!

It looks like this is your first PR to kubernetes-sigs/cluster-api 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/cluster-api has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. :smiley:

k8s-ci-robot avatar Aug 02 '22 03:08 k8s-ci-robot

Hi @13164815445. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Aug 02 '22 03:08 k8s-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

Once this PR has been reviewed and has the lgtm label, please assign neolit123 for approval by writing /assign @neolit123 in a comment. For more information see: The Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.

Approvers can cancel approval by writing /approve cancel in a comment.

k8s-ci-robot avatar Aug 02 '22 03:08 k8s-ci-robot

/easycla

13164815445 avatar Aug 02 '22 03:08 13164815445

/retest

killianmuldoon avatar Aug 03 '22 12:08 killianmuldoon

I'm re-running the tests in case there's a flake, but I think the implementation here doesn't match the path laid out in #4936. @13164815445 there's a guide to running the end to end tests locally in the Cluster API book if you want to drill down on what the root cause of the test failure is.

killianmuldoon avatar Aug 03 '22 12:08 killianmuldoon

I think we cannot just remove the code that sets the condition because, as Vince mentioned, the two codepaths are mutually exclusive.

https://github.com/kubernetes-sigs/cluster-api/blob/7f879be68d15737e335b6cb39d380d1d163e06e6/controllers/cluster_controller.go#L477-L482

Will get executed when not using a ControlPlane provider

https://github.com/kubernetes-sigs/cluster-api/blob/bfc6f80add5c21b8dc2b704951f42bc14708ebc4/controllers/cluster_controller_phases.go#L238-L247

Otherwise.

So we still need both codepaths if I got it right.

It would be great to reproduce the issue, check which of the two codepaths is executed in the case described in the issue, and work out what could be done to mark the condition true only when it is okay to do so.
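To make the two codepaths concrete, here is a minimal, self-contained sketch of the branching described above. It is not the Cluster API implementation: the `Machine` and `Cluster` structs, their fields, and the `controlPlaneInitialized` function are hypothetical stand-ins invented for illustration, and the real controller works with conditions on the Cluster object rather than returning a boolean.

```go
package main

import "fmt"

// Machine is a hypothetical, simplified stand-in for a Cluster API Machine.
type Machine struct {
	ControlPlane bool // the machine belongs to the control plane
	HasNodeRef   bool // the machine has a node reference
}

// Cluster is a hypothetical, simplified stand-in for a Cluster API Cluster.
type Cluster struct {
	HasControlPlaneRef  bool      // models "ControlPlaneRef is not nil"
	ProviderInitialized bool      // models the control plane provider reporting itself initialized
	Machines            []Machine // machines that belong to the cluster
}

// controlPlaneInitialized returns the modelled condition value and the name
// of the codepath that produced it. Exactly one codepath runs per cluster.
func controlPlaneInitialized(c Cluster) (bool, string) {
	if c.HasControlPlaneRef {
		// A control plane provider is in use: trust what the provider reports
		// and do not look at the machines at all.
		return c.ProviderInitialized, "provider"
	}
	// No control plane provider: look for a control plane machine with a node reference.
	for _, m := range c.Machines {
		if m.ControlPlane && m.HasNodeRef {
			return true, "machines"
		}
	}
	return false, "machines"
}

func main() {
	withProvider := Cluster{HasControlPlaneRef: true, ProviderInitialized: true}
	withoutProvider := Cluster{Machines: []Machine{{ControlPlane: true, HasNodeRef: true}}}
	for _, c := range []Cluster{withProvider, withoutProvider} {
		ok, path := controlPlaneInitialized(c)
		fmt.Printf("initialized=%v via %s codepath\n", ok, path)
	}
}
```

In this model, removing either branch would leave one kind of cluster without any way to have the condition set, which is why both codepaths are still needed.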

chrischdi avatar Aug 03 '22 13:08 chrischdi

This is my mistake: I didn't take into account the difference between setting and not setting ControlPlaneRef.

I want to add a check for control plane Machines in the reconcileControlPlane method, but I am not sure whether a control plane Machine exists in the scenario where ControlPlaneRef is not nil.

13164815445 avatar Aug 04 '22 09:08 13164815445

I think you are right, we still need both codepaths.

If ControlPlaneInitializedCondition only means that the control plane is available, then the current behavior is correct, and I am wondering whether I just need to change the comment.

If a Machine must be guaranteed to be available, then we must ensure both that ControlPlaneRef is not nil and that the control plane Machine exists.
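The difference between the two readings can be sketched as two predicates over the same cluster. This is only an illustration under stated assumptions: the types and the functions `availableOnly`, `machineGuaranteed`, and `hasControlPlaneNode` are hypothetical names, not Cluster API code, and the sketch assumes the control plane Machines can be listed at all, which (as noted earlier in this thread) is uncertain when ControlPlaneRef is set.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the Cluster API types.
type Machine struct {
	ControlPlane bool // the machine belongs to the control plane
	HasNodeRef   bool // the machine has a node reference
}

type Cluster struct {
	HasControlPlaneRef  bool // models "ControlPlaneRef is not nil"
	ProviderInitialized bool // models the control plane provider reporting itself initialized
	Machines            []Machine
}

// hasControlPlaneNode reports whether any control plane machine has a node reference.
func hasControlPlaneNode(c Cluster) bool {
	for _, m := range c.Machines {
		if m.ControlPlane && m.HasNodeRef {
			return true
		}
	}
	return false
}

// availableOnly models the first reading: the condition only means that the
// control plane is available, so the provider's report is enough.
func availableOnly(c Cluster) bool {
	if c.HasControlPlaneRef {
		return c.ProviderInitialized
	}
	return hasControlPlaneNode(c)
}

// machineGuaranteed models the second reading: even with a provider, a
// control plane machine with a node reference must also exist.
func machineGuaranteed(c Cluster) bool {
	if c.HasControlPlaneRef {
		return c.ProviderInitialized && hasControlPlaneNode(c)
	}
	return hasControlPlaneNode(c)
}

func main() {
	// A provider reports initialized, but no control plane machine is visible.
	c := Cluster{HasControlPlaneRef: true, ProviderInitialized: true}
	fmt.Println("available-only reading:", availableOnly(c))
	fmt.Println("machine-guaranteed reading:", machineGuaranteed(c))
}
```

The two predicates disagree only for clusters that use a control plane provider without visible control plane Machines, which is the case the description of the condition would need to address.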

13164815445 avatar Aug 04 '22 10:08 13164815445

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Nov 20 '22 15:11 k8s-triage-robot

/lifecycle rotten

k8s-triage-robot avatar Dec 20 '22 16:12 k8s-triage-robot

@13164815445 do you think you'll have time to come back to this PR? If not we can close and let someone else pick it up.

killianmuldoon avatar Dec 20 '22 16:12 killianmuldoon

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Reopen this PR with /reopen
  • Mark this PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

k8s-triage-robot avatar Jan 19 '23 17:01 k8s-triage-robot

@k8s-triage-robot: Closed this PR.

k8s-ci-robot avatar Jan 19 '23 17:01 k8s-ci-robot