Add Karpenter support
User Story
I would like to use Karpenter as an autoscaler.
Detailed Description
Karpenter provides a nice way of configuring autoscaling options. Right now it only supports AWS, but support for more providers can and will be added later through its provider mechanism.
CAPI can support Karpenter being installed as an autoscaler.
https://karpenter.sh/
Anything else you would like to add:
Some discussion has already started about Karpenter supporting installation into CAPI/CAPA clusters here: https://github.com/aws/karpenter-core/issues/747
Another issue was opened in CAPA, but it was suggested there that this should be implemented in CAPI, since it's similar to the cluster-autoscaler: https://github.com/kubernetes-sigs/cluster-api-provider-aws/issues/3564
/kind feature
@Skarlso: This issue is currently awaiting triage.
If CAPI contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.
The triage/accepted label can be added by org members by writing /triage accepted in a comment.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
cc @randomvariable @elmiko (just fyi)
/triage needs-information
While I'm generally +1 on supporting tools in the ecosystem, I think we should qualify a little better what the implications of this request are for Cluster API; TBD whether this requires a proposal.
👍 For sure. I'm not very well versed in CAPI, so I'm unsure what the implications are and what they look like. I need some help in determining that. :)
that's fair. I think we can start collecting what karpenter needs, then someone can step in to help in defining how this can be implemented in CAPI
Awesome. Sounds good. I can add what little information I know. :) The attached Karpenter ticket details some of the things that Karpenter itself is working on to make this a viable thing. It might even be that we have to wait for them a bit to come up with something.
thanks for the ping @sbueringer , definitely sounds like a good use case to me. from my understanding of karpenter i imagine there would be work to be done around synchronizing the instances created by it with the records that cluster-api keeps. perhaps this might be best used with MachinePools?
as i understand it, karpenter wants to talk directly to the AWS infrastructure. i had looked into a cluster-api provider for karpenter but had trouble integrating with their efforts at the time, i would imagine it's stabilized some since then. i have a feeling that deploying karpenter in a cluster-api cluster will look slightly different than with the cluster-autoscaler mainly due to the differences with how they both talk to the infrastructure.
with all that said, i have a feeling that deploying karpenter inside an AWS cluster would work, but i'm not sure what effect it would have on the MachineDeployments/Sets as those would get out of sync with its decision making. additionally, i believe karpenter wants to have access to a variety of instance sizes for it to create, this is where linking to a MachinePool style resource might make more sense.
happy to see this brought up =)
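To make the synchronization concern above concrete, here is a minimal sketch of how nodes created outside Cluster API would show up as untracked while the recorded replica counts stay unchanged. Everything in it is an assumption made for illustration: the `MachineDeployment` and `Node` classes are simplified stand-ins, not the real Cluster API or Kubernetes types, and `replica_drift` is a hypothetical helper, not part of either project.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MachineDeployment:
    # Hypothetical, simplified stand-in for a Cluster API MachineDeployment.
    name: str
    replicas: int  # desired node count that Cluster API records


@dataclass
class Node:
    # Hypothetical, simplified stand-in for a Kubernetes node.
    name: str
    owner: Optional[str]  # name of the owning MachineDeployment, or None


def replica_drift(deployments, nodes):
    """Return (untracked node names, per-deployment observed-minus-desired counts)."""
    untracked = [n.name for n in nodes if n.owner is None]
    drift = {}
    for md in deployments:
        observed = sum(1 for n in nodes if n.owner == md.name)
        drift[md.name] = observed - md.replicas
    return untracked, drift


deployments = [MachineDeployment(name="workers", replicas=2)]
nodes = [
    Node(name="node-a", owner="workers"),
    Node(name="node-b", owner="workers"),
    # Created directly against the cloud API, so no Cluster API record exists.
    Node(name="node-c", owner=None),
]

untracked, drift = replica_drift(deployments, nodes)
print(untracked)
print(drift)
```

In this toy model the deployment reports no drift even though the cluster holds an extra node, which is the kind of mismatch described above.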
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/lifecycle rotten
/remove-lifecycle rotten
/lifecycle stale
/lifecycle frozen
we talked about this issue at the community meeting, i am going to see if there is interest in forming a feature group to investigate capi/karpenter integration. ideally, i will put a poll up in our slack channel sometime next week to see if we have enough interest to meet up.
/priority backlog
This issue is currently awaiting triage.
CAPI contributors will take a look as soon as possible, apply one of the triage/* labels and provide further guidance.
just to update here, the karpenter feature group has been meeting for several months now and we have agreed on a design and are working towards a proof-of-concept to demonstrate for the wider cluster-api community.
@elmiko It would be great to share the design with the broader community to get a first quick round of feedback (giving periodic updates is also part of the working groups' mission and important for their success).
@fabriziopandini absolutely. i think it's taken some time to land on what will work for us, i am preparing a slidedeck to present at an upcoming capi meeting to help further the discussion.
For reference, there is a feature group meeting for this: https://hackmd.io/@elmiko/ryR2VXR0n
Hi there, as cluster admin I'm interested to see this working. If any help is needed here, I would be happy to participate.
@elmiko hey! is this project still alive?
@dntosas yes it is!
we have contributed the initial version to the cncf, you can find it here https://github.com/kubernetes-sigs/karpenter-provider-cluster-api
and we have a feature group meeting every other wednesday, see more here https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/community/20231018-karpenter-integration.md
perhaps we should close this issue, but i don't think we have fully landed on our final architecture. that said, the project works and we are now iterating on different architecture designs and addressing bugs.
Oh that is absolutely fantastic. :)
This issue has not been updated in over 1 year, and should be re-triaged.
You can:
- Confirm that this issue is still relevant with /triage accepted (org members only)
- Close this issue with /close
For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/
/remove-triage accepted
perhaps we should update the docs, but unless there is a further request here i think we should close this issue.
https://github.com/kubernetes-sigs/karpenter-provider-cluster-api has just released version 0.2.0 and is generally stable for testing by cluster-api users. it is still considered in an alpha state, but we welcome all contributions. =)
Thanks very much @elmiko for driving this! I encourage everyone to take a look at https://github.com/kubernetes-sigs/karpenter-provider-cluster-api, test it, and provide feedback!
/close
@fabriziopandini: Closing this issue.