
BootstrapConfig should be optional for MachinePool based ClusterClass

Open shyamradhakrishnan opened this issue 1 year ago • 19 comments

What steps did you take and what happened?

In managed clusters that use ClusterClass, we cannot create MachinePools because a BootstrapConfig is required. In non-ClusterClass MachinePools, BootstrapConfig is optional. For example, see https://github.com/oracle/cluster-api-provider-oci/blob/main/templates/cluster-template-managed.yaml#L50

What did you expect to happen?

BootstrapConfig should be optional for ClusterClass-based MachinePools. Specifically, https://doc.crds.dev/github.com/kubernetes-sigs/cluster-api/cluster.x-k8s.io/ClusterClass/[email protected]#spec-workers-machinePools-template-bootstrap-ref should be optional.
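
For illustration, a minimal sketch of the affected ClusterClass worker MachinePool class; all names and the KubeadmConfigTemplate kind are placeholders, not taken from this issue, but template.bootstrap.ref is the field this issue asks to make optional:

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: ClusterClass
metadata:
  name: example-clusterclass            # placeholder name
spec:
  workers:
    machinePools:
      - class: default-worker
        template:
          bootstrap:
            ref:                        # currently required; this issue asks to make it optional
              apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
              kind: KubeadmConfigTemplate        # placeholder; managed pools have no meaningful equivalent
              name: example-worker-bootstrap
          infrastructure:
            ref:
              apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
              kind: ExampleMachinePoolTemplate   # placeholder provider-specific kind
              name: example-worker-infra
```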

Cluster API version

v1.7.0

Kubernetes version

No response

Anything else you would like to add?

No response

Label(s) to be applied

/kind bug

shyamradhakrishnan avatar Jul 26 '24 03:07 shyamradhakrishnan

We have to compare with Machines, but I wonder how "optional" the bootstrap config really is there. E.g. is it possible to just not set the ref, or do folks also have to set an empty dataSecretName field (which seems more like a workaround than the bootstrap config properly being optional, as it should be)?

But agree with the general idea of this issue.

Would be great if other folks interested in MachinePools would also chime in

/area machinepool /triage accepted

sbueringer avatar Jul 26 '24 08:07 sbueringer

As far as I remember, either you define a bootstrap config ref, which generates a bootstrap data secret, or you provide a bootstrap data secret directly.

Allowing empty data secrets seems more like a bug than a feature, but I have to dig into it a little bit more.

fabriziopandini avatar Jul 31 '24 14:07 fabriziopandini
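
As a reference for the two shapes described above, here is a minimal sketch of the bootstrap block in a (non-ClusterClass) MachinePool's spec.template.spec, assuming the v1beta1 field names; both subfields are optional in the schema, which is what makes the empty-secret case possible:

```yaml
# Variant 1: reference a bootstrap config; its controller generates the data secret.
bootstrap:
  configRef:
    apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
    kind: KubeadmConfig                  # placeholder bootstrap kind
    name: example-bootstrap

# Variant 2: point directly at a pre-created bootstrap data secret.
bootstrap:
  dataSecretName: example-bootstrap-data
```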

@fabriziopandini as long as we have the ability to ignore the bootstrap config, we should be okay. For example, please see https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/main/templates/cluster-template-rosa-machinepool.yaml#L65 ; all providers do the same for managed machine pools, as they don't require bootstrap.

shyamradhakrishnan avatar Jul 31 '24 14:07 shyamradhakrishnan
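
A sketch of the pattern those managed machine pool templates follow, assuming the empty dataSecretName workaround discussed above (all names and the infrastructure kind are placeholders):

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachinePool
metadata:
  name: example-managed-pool             # placeholder
spec:
  clusterName: example-cluster
  replicas: 3
  template:
    spec:
      clusterName: example-cluster
      bootstrap:
        dataSecretName: ""               # empty: the managed service bootstraps its own nodes
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: ExampleManagedMachinePool  # placeholder provider-specific managed pool kind
        name: example-managed-pool
```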

@mboersma to chime in. If we want first-class support for the use case where MachinePools are used without a bootstrap config, e.g. for managed machine pools, we should probably think about a better API UX (as well as the impacts on MPM).

fabriziopandini avatar Aug 02 '24 12:08 fabriziopandini

Hi @fabriziopandini @mboersma, we would like to use ClusterClass with managed machine pools and this is blocking us. Do you have any updates?

zihuaweng avatar Aug 21 '24 18:08 zihuaweng

No updates. I can only reiterate/expand on previous comments from an API design standpoint:

  • The fact that MP accepts empty bootstrap secret names seems more like a bug than an API design choice (we should probably think about a better API UX).
  • Also, making the field optional in CC types is a breaking change. That must be deferred to the next API version (note: it is tricky for down-conversions...).
  • I'm not sure how MachinePools without bootstrap secrets can/will be translated into MPM, but I'll defer to @mboersma and @willie-yao to comment on the impacts for MPM.

cc @JoelSpeed for an opinion about API design

fabriziopandini avatar Aug 27 '24 09:08 fabriziopandini

@fabriziopandini Just looked it up, and I saw that bootstrap.ref is indeed required for MachinePools and MachineDeployments. I assume for MachineDeployments it should definitely be optional? Is this something we should include in the v1beta2 umbrella issue?

sbueringer avatar Aug 27 '24 10:08 sbueringer

I would expect the API to allow you to either specify a bootstrap template or the data secret directly, and require that at least one of those be set.

Technically, making a required field optional can be done without changing the API version; the main thing is to make sure consumers of the API understand the change. Would we say we control the consumers? The controllers? Or is there a wide enough blast radius (many providers) that we would rather be more cautious?

Looking at the MachineSpec used in MD and MP, as far as I can tell the bootstrap field itself is required, but there are no required fields within it. That seems like an omission; a minProperties: 1 would have fixed it, I think.

The data secret name probably also wants some validation: a minLength, a maxLength, and a regex, since we know what the valid name format of a Secret in Kubernetes is.

In the ClusterClass MachinePoolTemplate, the shape of the API is different and is just a local object reference, which is curious. Why don't we allow the option of a template or a data secret there? That one does seem a little odd; I don't have the context.

JoelSpeed avatar Aug 27 '24 12:08 JoelSpeed
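
For illustration, a sketch of what the validation suggested above could look like at the CRD schema level; this is not the shipped schema, just an example of minProperties on the bootstrap object plus DNS-1123-style constraints on the secret name:

```yaml
# Illustrative OpenAPI v3 fragment for the bootstrap field, not the actual CRD schema.
bootstrap:
  type: object
  minProperties: 1             # require at least one of configRef / dataSecretName
  properties:
    configRef:
      type: object             # object reference schema elided here
    dataSecretName:
      type: string
      minLength: 1
      maxLength: 253           # Secret names must be valid DNS-1123 subdomains
      pattern: '^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$'
```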

@sbueringer, let's keep this discussion focused on MachinePool; validating the API for MD is tracked in the umbrella issue linked above.

@JoelSpeed thanks for all the hints!

I would expect the API to allow you to either specify a bootstrap template or the data secret directly, and require that at least one of those be set. The data secret name probably also wants some validation...

Agreed, this is the same point I'm making about the MP API, which today lacks validation for the data secret name. This allowed what IMO is a bug/bad API design that was exploited to address a legit use case.

But now that we are aware of this, and we are looking to graduate our APIs, I think that this should be properly addressed considering the whole story (e.g. the proposal about MPM).

In the ClusterClass MachinePoolTemplate, the shape of the API is different and is just a local object reference, which is curious. Why don't we allow the option of a template or a data secret there? That one does seem a little odd; I don't have the context.

Based on my understanding, this is because it does not make sense to define a data secret with cloud-init/Ignition files that applies to all the clusters/MPs originated by a CC. But this is also where we need folks actually using and maintaining this feature to step in.

fabriziopandini avatar Aug 29 '24 11:08 fabriziopandini

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Nov 27 '24 11:11 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Dec 27 '24 12:12 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar Apr 24 '25 14:04 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot avatar Apr 24 '25 14:04 k8s-ci-robot

cc @mboersma @richardcase I think this is related to the dataSecretName min length problem

sbueringer avatar May 12 '25 05:05 sbueringer

/reopen /remove-lifecycle rotten

sbueringer avatar Aug 11 '25 08:08 sbueringer

@sbueringer: Reopened this issue.

In response to this:

/reopen /remove-lifecycle rotten

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot avatar Aug 11 '25 08:08 k8s-ci-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Nov 09 '25 09:11 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Dec 09 '25 10:12 k8s-triage-robot

/remove-lifecycle rotten

richardcase avatar Dec 16 '25 14:12 richardcase