Implement support for kubeadm v1beta4 API
What would you like to be added (User Story)?
- As a user, I want to be able to create clusters with kubeadm 1.31 (which will most probably use the v1beta4 API).
- As a user, I want to be able to use the latest and greatest features introduced by the kubeadm v1beta4 API.
Detailed Description
Changes introduced by kubeadm v1beta4 that we might add to CABPK without breaking changes (caveat: these changes apply only to clusters with K8s >= 1.31; for older clusters they are a no-op)
- ClusterConfiguration.Proxy.Disabled (note: this might correlate with the controlplane.cluster.x-k8s.io/skip-kube-proxy annotation)
- ClusterConfiguration.DNS.Disabled (note: this might correlate with the controlplane.cluster.x-k8s.io/skip-coredns annotation)
- ClusterConfiguration.EncryptionAlgorithm (note, exposing this flag might imply other changes in Cluster API certificate management)
- ClusterConfiguration.CertificateValidityPeriod (note, exposing this flag might imply other changes in Cluster API certificate management)
- ClusterConfiguration.CACertificateValidityPeriod (note, exposing this flag might imply other changes in Cluster API certificate management)
- ClusterConfiguration.*.ExtraEnvs
- Init/JoinConfiguration.NodeRegistrationOptions.ImagePullSerial
- Init/JoinConfiguration.Timeouts. Note:
  - ClusterConfiguration.TimeoutForControlPlane is now Init/JoinConfiguration.Timeouts.ControlPlaneComponentHealthCheck
  - JoinConfiguration.Discovery.Timeout is now JoinConfiguration.Timeouts.TLSBootstrap
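The two field moves noted for Init/JoinConfiguration.Timeouts can be pictured with a small Go sketch. Everything in it is an assumption made for illustration: `legacyTimeouts`, `timeouts`, `migrateTimeouts` and `durationPtr` are hypothetical names, not CABPK or kubeadm types, and plain `*time.Duration` stands in for whatever duration type the real API structs use.

```go
package main

import (
	"fmt"
	"time"
)

// legacyTimeouts groups the two older settings named in the list above.
// Hypothetical type for illustration only.
type legacyTimeouts struct {
	TimeoutForControlPlane *time.Duration // was under ClusterConfiguration
	DiscoveryTimeout       *time.Duration // was JoinConfiguration.Discovery.Timeout
}

// timeouts is a trimmed-down stand-in for the v1beta4 Timeouts struct,
// keeping only the two fields discussed in this issue.
type timeouts struct {
	ControlPlaneComponentHealthCheck *time.Duration
	TLSBootstrap                     *time.Duration
}

// migrateTimeouts maps the legacy fields onto their new locations.
// It returns nil when neither legacy value is set, so that no empty
// timeouts block is generated.
func migrateTimeouts(in legacyTimeouts) *timeouts {
	if in.TimeoutForControlPlane == nil && in.DiscoveryTimeout == nil {
		return nil
	}
	return &timeouts{
		ControlPlaneComponentHealthCheck: in.TimeoutForControlPlane,
		TLSBootstrap:                     in.DiscoveryTimeout,
	}
}

// durationPtr is a small helper to take the address of a duration literal.
func durationPtr(d time.Duration) *time.Duration { return &d }

func main() {
	out := migrateTimeouts(legacyTimeouts{
		TimeoutForControlPlane: durationPtr(10 * time.Minute),
	})
	fmt.Println("controlPlaneComponentHealthCheck:", *out.ControlPlaneComponentHealthCheck)
	fmt.Println("tlsBootstrap set:", out.TLSBootstrap != nil)
}
```

Returning nil for the "nothing set" case is a design choice in this sketch, so that kubeadm defaults would apply when the user did not configure either value.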
Changes introduced by kubeadm v1beta4 that require CABPK breaking changes to be implemented
- ClusterConfiguration.*.ExtraArgs, which now allows setting multiple values for the same key
- Init/JoinConfiguration.NodeRegistrationOptions.KubeletExtraArgs, which now allows setting multiple values for the same key
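To see why the ExtraArgs change is breaking, here is a minimal Go sketch of moving from a map of extra args to a list of name/value pairs. The `Arg` struct mirrors the shape of the kubeadm v1beta4 arg type but is redeclared locally, and `mapToArgs` is a hypothetical helper, not a function from CABPK or kubeadm.

```go
package main

import (
	"fmt"
	"sort"
)

// Arg is a name/value pair, redeclared here so the sketch is self-contained.
type Arg struct {
	Name  string
	Value string
}

// mapToArgs (hypothetical helper) converts map-style extra args into an
// ordered list. Keys are sorted so the result is deterministic.
func mapToArgs(in map[string]string) []Arg {
	names := make([]string, 0, len(in))
	for name := range in {
		names = append(names, name)
	}
	sort.Strings(names)

	out := make([]Arg, 0, len(in))
	for _, name := range names {
		out = append(out, Arg{Name: name, Value: in[name]})
	}
	return out
}

func main() {
	// A map can hold only one value per flag name.
	legacy := map[string]string{"v": "4", "audit-log-maxage": "30"}
	args := mapToArgs(legacy)

	// A list can repeat a flag, which the map form cannot express.
	args = append(args,
		Arg{Name: "service-account-issuer", Value: "https://issuer-a.example"},
		Arg{Name: "service-account-issuer", Value: "https://issuer-b.example"},
	)

	for _, a := range args {
		fmt.Printf("--%s=%s\n", a.Name, a.Value)
	}
}
```

The map-to-list direction is lossless, but the reverse is not once a flag is repeated, which is one way to read why the source treats this as a breaking API change.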
Changes introduced by kubeadm v1beta4 that are not relevant to CABPK
- Init/JoinConfiguration.DryRun (dry run makes sense only when using kubeadm from the CLI in interactive mode)
- ResetConfiguration, UpgradeConfiguration (we are not using these commands in CABPK)
Anything else you would like to add?
Ref: https://github.com/kubernetes/kubernetes/pull/125029
Action Plan
Mandatory tasks to support Kubernetes v1.31:
- [x] Implement conversions from CAPI v1beta1 types to kubeadm v1beta4 https://github.com/kubernetes-sigs/cluster-api/pull/10709
  - Special handling should be implemented for ClusterConfiguration.TimeoutForControlPlane and JoinConfiguration.Discovery.Timeout
Optional non breaking changes to be implemented ASAP:
- [x] Before adding new fields, check potential impacts on things like https://github.com/kubernetes-sigs/cluster-api/blob/57dc2317bea6dea7cbc82535f8180afa518b7fcd/controlplane/kubeadm/internal/filters.go#L220, as well as on ClusterClass and topology reconciliation https://github.com/kubernetes-sigs/cluster-api/pull/10846
- [x] Add ClusterConfiguration.*.ExtraEnvs https://github.com/kubernetes-sigs/cluster-api/pull/10846
- [x] Add Init/JoinConfiguration.NodeRegistrationOptions.ImagePullSerial https://github.com/kubernetes-sigs/cluster-api/pull/10846
- [ ] Add Init/JoinConfiguration.Timeouts
  - Important: Timeouts.ControlPlaneComponentHealthCheck and Timeouts.TLSBootstrap must not be added now, to ensure a clean migration of ClusterConfiguration.TimeoutForControlPlane and JoinConfiguration.Discovery.Timeout when we introduce CAPI v1beta2 types
Changes deferred to when we review certificate management / renewal
- [ ] Add ClusterConfiguration.CertificateValidityPeriod and ClusterConfiguration.CACertificateValidityPeriod
Changes deferred to when we review kubeadm/KCP addon management
- [ ] Add ClusterConfiguration.Proxy.Disabled and ClusterConfiguration.DNS.Disabled
Changes deferred to when we implement https://github.com/kubernetes-sigs/cluster-api/issues/10077
- [ ] Add ClusterConfiguration.EncryptionAlgorithm
Changes deferred to when we implement CAPI v1beta2 types
- [ ] Refactor ClusterConfiguration.*.ExtraArgs and Init/JoinConfiguration.NodeRegistrationOptions.KubeletExtraArgs
- [ ] Add Timeouts.ControlPlaneComponentHealthCheck and Timeouts.TLSBootstrap, and remove ClusterConfiguration.TimeoutForControlPlane and JoinConfiguration.Discovery.Timeout
Label(s) to be applied
/kind feature
/priority important-soon
Note: this priority assumes we can continue to work with the v1beta3 API; if that is not the case, it must be bumped to critical-urgent.
/reopen
@sbueringer: Reopened this issue.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/lifecycle frozen
> Add Init/JoinConfiguration.Timeouts
Is it possible to work on the above task now?
Our cluster has a large etcd database, which causes the learner to fail to be promoted to follower, so we have disabled the kubeadm EtcdLearnerMode feature. However, with the feature now GA, this option was removed in v1.33. Therefore, we would like to use timeouts.etcdAPICall.
- https://github.com/kubernetes/kubernetes/blob/v1.32.5/cmd/kubeadm/app/util/etcd/etcd.go#L574
@fabriziopandini is currently taking a closer look at this issue and will implement at least some parts of it. (I'll let him respond on whether what you are asking for is covered there.)
https://github.com/kubernetes-sigs/cluster-api/pull/12282 implements changes for extra args, timeouts and image pull policies
Other changes with broader impacts on other CAPI features are tracked in separate issues:
- Certificate management / renewal: https://github.com/kubernetes-sigs/cluster-api/issues/12289
- Proxy and DNS installation and management: https://github.com/kubernetes-sigs/cluster-api/issues/12288
- Support for more EncryptionAlgorithm values for certificates: https://github.com/kubernetes-sigs/cluster-api/issues/10077