cluster-api

:bug: create bootstrap token if not found in refresh process

Open archerwu9425 opened this issue 1 year ago • 6 comments

What this PR does: When the refreshBootstrapTokenIfNeeded function cannot find the bootstrap token, it should create a new one instead of only returning an error.

Why we need it:

  1. The bootstrap token is created by the bootstrap controller but is later deleted by the workload cluster.
  2. The MachinePool is scaled up and down by the cluster autoscaler.
  3. While the cluster has the paused: true field set, reconciliation stops. During this period the bootstrap token may be deleted, but the KubeadmConfig is not updated.
  4. Instances created in the machine pool during this period cannot join the cluster, even after the paused field is removed and reconciliation resumes.

Which issue(s) this PR fixes: Fixes #11034

archerwu9425 avatar Aug 12 '24 10:08 archerwu9425

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

Once this PR has been reviewed and has the lgtm label, please assign fabriziopandini for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment. Approvers can cancel approval by writing /approve cancel in a comment.

k8s-ci-robot avatar Aug 12 '24 10:08 k8s-ci-robot

Hi @archerwu9425. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot avatar Aug 12 '24 10:08 k8s-ci-robot

/area bootstrap

archerwu9425 avatar Aug 12 '24 10:08 archerwu9425

seems like this creates a "secret" (bootstrap-token) on demand. i.e. when the user is trying to join node with some invalid BST, the BST will be created for them.

that doesn't seem like the correct thing to do, but interesting to hear more opinions.

If the token stored in the KubeadmConfig cannot be found in the remote cluster, the controller creates a new one and updates the token in the KubeadmConfig and the launch template. This is still handled by the bootstrap controller and is used for machine pools. It is also the same logic used to rotate the machine pool bootstrap token; the only difference is whether the machine pool has a nodeRef. The relevant code blocks:

https://github.com/kubernetes-sigs/cluster-api/blob/main/bootstrap/kubeadm/internal/controllers/kubeadmconfig_controller.go#L276-L286

https://github.com/kubernetes-sigs/cluster-api/blob/main/bootstrap/kubeadm/internal/controllers/kubeadmconfig_controller.go#L379-L391
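
As a rough illustration of the distinction described above, the sketch below picks a token-handling path for a machine pool based only on whether it has a nodeRef. The mapping (no nodeRef leads to refresh, nodeRef leads to rotate) is a reading of this comment rather than a copy of the linked code, and `tokenPath` and `pathForMachinePool` are hypothetical names.

```go
package main

import "fmt"

// tokenPath names the two token-handling paths mentioned in the comment.
type tokenPath string

const (
	// pathRefresh: assumed to apply while the machine pool has no nodeRef.
	pathRefresh tokenPath = "refresh"
	// pathRotate: assumed to apply once the machine pool has a nodeRef.
	pathRotate tokenPath = "rotate"
)

// pathForMachinePool is a hypothetical helper that selects the path from the
// single difference the comment names: whether a nodeRef is present.
func pathForMachinePool(hasNodeRef bool) tokenPath {
	if hasNodeRef {
		return pathRotate
	}
	return pathRefresh
}

func main() {
	fmt.Println(pathForMachinePool(false))
	fmt.Println(pathForMachinePool(true))
}
```

Under the change proposed in this PR, both paths would end with a valid token written back to the KubeadmConfig when the old one is missing.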

archerwu9425 avatar Aug 12 '24 14:08 archerwu9425

@neolit123 @ncdc @greut Could you please help review? Thanks

archerwu9425 avatar Aug 15 '24 01:08 archerwu9425

seems like this creates a "secret" (bootstrap-token) on demand. i.e. when the user is trying to join node with some invalid BST, the BST will be created for them. that doesn't seem like the correct thing to do, but interesting to hear more opinions.

my comment is here. waiting for comments from more maintainers.

neolit123 avatar Aug 15 '24 07:08 neolit123

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Nov 13 '24 07:11 k8s-triage-robot

/ok-to-test

fabriziopandini avatar Dec 03 '24 09:12 fabriziopandini

@AndiDog Great to know. Since you already have all the tests ready, please go ahead with your PR. It would be nice to have the issue fixed soon.

Considering the comment above /close

fabriziopandini avatar Dec 06 '24 10:12 fabriziopandini

@fabriziopandini: Closed this PR.

In response to this:

@AndiDog Great to know. Since you already have all the tests ready, please go ahead with your PR. It would be nice to have the issue fixed soon.

Considering the comment above /close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot avatar Dec 06 '24 10:12 k8s-ci-robot