cluster-api

Add Karpenter support

Open Skarlso opened this issue 3 years ago • 20 comments

User Story

I would like to use Karpenter as auto-scaler.

Detailed Description

Karpenter provides a nice way of configuring autoscaling options. Right now, it only supports AWS, but more providers can and will be added later through its provider mechanism.

CAPI can support Karpenter being installed as an autoscaler.

https://karpenter.sh/

Anything else you would like to add:

Some talk has already been initiated about Karpenter supporting installation into CAPI/CAPA clusters here: https://github.com/aws/karpenter-core/issues/747

Another issue was opened in CAPA, but it was suggested that this should be implemented in CAPI, since it's like the cluster-autoscaler: https://github.com/kubernetes-sigs/cluster-api-provider-aws/issues/3564

/kind feature

Skarlso avatar Sep 08 '22 19:09 Skarlso

@Skarlso: This issue is currently awaiting triage.

If CAPI contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Sep 08 '22 19:09 k8s-ci-robot

cc @randomvariable @elmiko (just fyi)

sbueringer avatar Sep 09 '22 05:09 sbueringer

/triage needs-information

while I'm generally +1 on supporting tools in the ecosystem, I think we should qualify a little better what the implications of this request are for Cluster API; TBD whether this requires a proposal

fabriziopandini avatar Sep 09 '22 09:09 fabriziopandini

👍 For sure. I'm not very well versed in CAPI, so I'm unsure what the implications are and what they look like. I need some help in determining that. :)

Skarlso avatar Sep 09 '22 09:09 Skarlso

that's fair. I think we can start collecting what karpenter needs, then someone can step in to help in defining how this can be implemented in CAPI

fabriziopandini avatar Sep 09 '22 09:09 fabriziopandini

Awesome. Sounds good. I can add what little information I know. :) The attached Karpenter ticket details some of the things that Karpenter itself is working on to make this a viable thing. It might even be that we have to wait for them a bit to come up with something.

Skarlso avatar Sep 09 '22 09:09 Skarlso

thanks for the ping @sbueringer , definitely sounds like a good use case to me. from my understanding of karpenter i imagine there would be work to be done around synchronizing the instances created by it with the records that cluster-api keeps. perhaps this might be best used with MachinePools?

as i understand it, karpenter wants to talk directly to the AWS infrastructure. i had looked into a cluster-api provider for karpenter but had trouble integrating with their efforts at the time, i would imagine it's stabilized some since then. i have a feeling that deploying karpenter in a cluster-api cluster will look slightly different than with the cluster-autoscaler mainly due to the differences with how they both talk to the infrastructure.

with all that said, i have a feeling that deploying karpenter inside an AWS cluster would work, but i'm not sure what effect it would have on the MachineDeployments/Sets as those would get out of sync with its decision making. additionally, i believe karpenter wants to have access to a variety of instance sizes for it to create, this is where linking to a MachinePool style resource might make more sense.
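
To make the synchronization concern above concrete, here is a minimal sketch of how nodes created outside of Cluster API could be detected. It assumes that Machines and Nodes are correlated by provider ID, which is how Cluster API links the two today. The `Machine` and `Node` classes, the `find_drift` helper, and the sample IDs are hypothetical stand-ins for illustration only; real objects carry many more fields, and this is not how any Karpenter integration is actually implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    """Hypothetical stand-in for a Cluster API Machine record."""
    name: str
    provider_id: str


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a Kubernetes Node."""
    name: str
    provider_id: str


def find_drift(machines, nodes):
    """Compare Machines and Nodes by provider ID.

    Returns a tuple of two sorted lists:
    - names of Nodes with no matching Machine (created outside Cluster API)
    - names of Machines with no matching Node
    """
    machine_ids = {m.provider_id for m in machines}
    node_ids = {n.provider_id for n in nodes}
    untracked_nodes = sorted(n.name for n in nodes if n.provider_id not in machine_ids)
    machines_without_node = sorted(m.name for m in machines if m.provider_id not in node_ids)
    return untracked_nodes, machines_without_node


# Sample data (made up): one Machine known to Cluster API, two Nodes in the cluster.
machines = [Machine("md-0-abc", "aws:///us-east-1a/i-001")]
nodes = [
    Node("ip-10-0-0-1", "aws:///us-east-1a/i-001"),
    Node("ip-10-0-0-2", "aws:///us-east-1a/i-002"),  # created directly, no Machine
]

untracked_nodes, machines_without_node = find_drift(machines, nodes)
print("nodes without a Machine:", untracked_nodes)
print("Machines without a Node:", machines_without_node)
```

In this sketch the second node would show up as untracked, which is the kind of record an integration would have to reconcile, for example by creating a matching Machine or by mapping it onto a MachinePool-style resource.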

happy to see this brought up =)

elmiko avatar Sep 09 '22 21:09 elmiko

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Feb 28 '23 22:02 k8s-triage-robot

/lifecycle rotten

k8s-triage-robot avatar Mar 30 '23 22:03 k8s-triage-robot

/remove-lifecycle rotten

Skarlso avatar Mar 31 '23 17:03 Skarlso

/lifecycle stale

k8s-triage-robot avatar Jun 29 '23 18:06 k8s-triage-robot

/lifecycle frozen

vincepri avatar Aug 23 '23 17:08 vincepri

we talked about this issue at the community meeting, i am going to see if there is interest in forming a feature group to investigate capi/karpenter integration. ideally, i will put a poll up in our slack channel sometime next week to see if we have enough interest to meet up.

elmiko avatar Aug 23 '23 17:08 elmiko

/priority backlog

fabriziopandini avatar Apr 12 '24 13:04 fabriziopandini

This issue is currently awaiting triage.

CAPI contributors will take a look as soon as possible, apply one of the triage/* labels and provide further guidance.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Apr 12 '24 15:04 k8s-ci-robot

just to update here, the karpenter feature group has been meeting for several months now and we have agreed on a design and are working towards a proof-of-concept to demonstrate for the wider cluster-api community.

elmiko avatar Apr 12 '24 17:04 elmiko

@elmiko It would be great to share the design with the broader community to get a first quick round of feedback (giving periodic updates is also part of the mission of working groups and important for their success)

fabriziopandini avatar Apr 18 '24 15:04 fabriziopandini

@fabriziopandini absolutely. i think it's taken some time to land on what will work for us, i am preparing a slidedeck to present at an upcoming capi meeting to help further the discussion.

elmiko avatar Apr 18 '24 17:04 elmiko

For reference, there is a feature group meeting for this: https://hackmd.io/@elmiko/ryR2VXR0n

dtzar avatar May 09 '24 19:05 dtzar

Hi there, as cluster admin I'm interested to see this working. If any help is needed here, I would be happy to participate.

memorais avatar Sep 19 '24 18:09 memorais

@elmiko hey! is this project still alive?

dntosas avatar Nov 16 '24 09:11 dntosas

@dntosas yes it is!

we have contributed the initial version to the cncf, you can find it here https://github.com/kubernetes-sigs/karpenter-provider-cluster-api

and we have a feature group meeting every other wednesday, see more here https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/community/20231018-karpenter-integration.md

perhaps we should close this issue, but i don't think we have fully landed on our final architecture. that said, the project works and we are now iterating on different architecture designs and addressing bugs.

elmiko avatar Nov 19 '24 18:11 elmiko

Oh that is absolutely fantastic. :)

Skarlso avatar Nov 19 '24 18:11 Skarlso

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

  • Confirm that this issue is still relevant with /triage accepted (org members only)
  • Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

k8s-triage-robot avatar Nov 19 '25 19:11 k8s-triage-robot

perhaps we should update the docs, but unless there is a further request here i think we should close this issue.

https://github.com/kubernetes-sigs/karpenter-provider-cluster-api has just released version 0.2.0 and is generally stable for testing by cluster-api users. it is still considered in an alpha state, but we welcome all contributions. =)

elmiko avatar Nov 19 '25 22:11 elmiko

Thanks very much @elmiko for driving this! I encourage everyone to take a look at https://github.com/kubernetes-sigs/karpenter-provider-cluster-api, test it, and provide feedback!

/close

fabriziopandini avatar Nov 21 '25 20:11 fabriziopandini

@fabriziopandini: Closing this issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot avatar Nov 21 '25 20:11 k8s-ci-robot