Upgrade to Karpenter 0.32 and v1beta1
/kind feature
1. Describe IN DETAIL the feature/behavior/change you would like to see.
Karpenter v0.32 was released with a v1beta1 API that has significant changes from the v1alpha APIs.
There is a migration guide that covers the new CRDs.
2. Feel free to provide a design supporting your feature request.
It looks like all of the CRD API fields used in kops' template have a 1:1 translation to new fields. At the very least we'll need to enable pruning to clean up the old custom resources. I'm not sure whether pruning will work for both the custom resources and their CRDs.
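For illustration, a rough sketch of the kind of 1:1 translation involved, based on the migration guide (all names and values below are placeholders, not the actual kops template):

```yaml
# v1alpha5 Provisioner roughly as rendered today (placeholder values)
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: nodes-example
spec:
  requirements:
    - key: kubernetes.io/arch
      operator: In
      values: ["amd64"]
  providerRef:
    name: nodes-example          # AWSNodeTemplate
---
# Rough v1beta1 equivalent: the same fields move under spec.template.spec
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: nodes-example
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
      nodeClassRef:
        name: nodes-example      # EC2NodeClass replaces AWSNodeTemplate
```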
The `aws.enableENILimitedPodDensity` setting that we currently set has been removed:
> The `aws.enablePodENI` was dropped since Karpenter will now always assume that the `vpc.amazonaws.com/pod-eni` resource exists. The `aws.enableENILimitedPodDensity` was dropped since you can now override the `--max-pods` value for kubelet in the `spec.kubelet.maxPods` for NodeClaims or NodeClaimTemplates
It's not clear what that means exactly, or whether kops becomes responsible for tracking the max pods value for each instance type.
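If kops does end up owning that value, a hedged sketch of where the override would live in the v1beta1 API (the number is a placeholder; deriving a per-instance-type value is exactly the open question):

```yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: nodes-example
spec:
  template:
    spec:
      kubelet:
        maxPods: 110   # placeholder; an ENI-limited value would have to be computed per instance type
```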
Pruning the old CRDs will be a challenge because normally we skip CRD pruning:
https://github.com/kubernetes/kops/blob/master/upup/pkg/fi/cloudup/bootstrapchannelbuilder/pruning.go
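For reference, a sketch of what the cleanup would need to cover, using the CRD names from the Karpenter docs (whether the pruning code linked above can handle the CRDs themselves and not just the custom resources is the open question):

```yaml
# v1alpha CRDs (and their custom resources) to prune after migration
- provisioners.karpenter.sh
- machines.karpenter.sh
- awsnodetemplates.karpenter.k8s.aws
# v1beta1 CRDs shipped with v0.32+
- nodepools.karpenter.sh
- nodeclaims.karpenter.sh
- ec2nodeclasses.karpenter.k8s.aws
```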
/kind office-hours
Decision from office hours:
- Upgrade to v0.31.3 (the last pre-beta version that can be rolled back to if the beta upgrade fails), cherry-pick to kops 1.28
- Upgrade to the latest Karpenter and include both alpha and beta CRDs. Mention in the kops 1.29 release notes that Karpenter users should first upgrade to the kops 1.28 release that includes v0.31.3 before upgrading to kops 1.29
We can optionally migrate our manifest template to use the new custom resources or delay this until a later kops release. Karpenter claims it supports both alpha and beta APIs for now but will drop alpha at some point in the future.
Kops depends on using externally-managed LaunchTemplates in Karpenter's AWSNodeTemplate (v1alpha1) or EC2NodeClass (v1beta1):
https://github.com/kubernetes/kops/blob/62e2d5ac7a979c365796f52801a61034fe1e9cbf/upup/models/cloudup/resources/addons/karpenter.sh/k8s-1.19.yaml.template#L1799-L1807
This allows kops to manage the LaunchTemplates based on instance group definitions and provide those to Karpenter.
Support for externally-managed LaunchTemplates has been removed so we'll need to decide how to proceed.
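For context, this is roughly how the template wires it up today: the v1alpha1 API exposes a field for an externally-managed launch template that kops creates per instance group, and v1beta1's EC2NodeClass has no equivalent. A minimal sketch, with the field name and value assumed from the v1alpha1 API (see the linked template for the exact shape):

```yaml
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: nodes-example
spec:
  # assumed field for the externally-managed launch template; placeholder name below
  launchTemplate: nodes-example.cluster.example.com
```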
@olemarkus you had strong opinions about this originally, any ideas?
Has it actually been removed now? Sad since none of the limitations mentioned in the RFC applies to clusters maintained by installers such as kOps.
In order to support Karpenter-managed launch templates, we need to inject the kOps user data into the Karpenter CRs and then leave all of the node lifecycle up to Karpenter. I am guessing, for example, that rolling updates would need to ignore Karpenter IGs and rely on Karpenter's mechanisms instead.
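For illustration only, a very rough sketch of what a kops-rendered EC2NodeClass carrying injected user data might look like (field names from the v1beta1 docs; every value is a placeholder, and the selector and instance profile handling are assumptions):

```yaml
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: nodes-example
spec:
  amiFamily: Custom
  amiSelectorTerms:
    - id: ami-0123456789abcdef0              # placeholder; the AMI kops resolves for the instance group
  instanceProfile: nodes-example-profile     # placeholder
  subnetSelectorTerms:
    - tags:
        KubernetesCluster: cluster.example.com   # placeholder tag
  securityGroupSelectorTerms:
    - tags:
        KubernetesCluster: cluster.example.com   # placeholder tag
  userData: |
    # kops nodeup bootstrap config would be injected here (placeholder)
```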
> Has it actually been removed now? Sad since none of the limitations mentioned in the RFC applies to clusters maintained by installers such as kOps.
The docs mention that only v0.32.X supports both apiVersions and CRDs:
> Having different Kind names for v1alpha5 and v1beta1 allows them to coexist for the same Karpenter controller for v0.32.x.
All alpha references have been removed in v0.33.0: https://github.com/kubernetes-sigs/karpenter/pull/840
> In order to support Karpenter-managed launch templates, we need to inject the kOps user data into the Karpenter CRs and then leave all of the node lifecycle up to Karpenter. I am guessing, for example, that rolling updates would need to ignore Karpenter IGs and rely on Karpenter's mechanisms instead.
That makes sense 👍🏻
Any news or an ETA on when this will be supported?
Do we have a decision from the kops maintainers on whether it will eventually support Karpenter v0.33.0+? Or is it still under discussion whether it's even possible / worth it to handle all the changes needed after Karpenter dropped the unmanaged launch templates option?
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
You can:
- Mark this issue as fresh with `/remove-lifecycle stale`
- Close this issue with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
You can:
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Close this issue with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
/remove-lifecycle rotten