Unable to configure disruption controls for karpenter

Open clayrisser opened this issue 1 year ago • 7 comments

I can't figure out how to set the disruption fields consolidationPolicy and expireAfter on my Karpenter node pools in kOps. Where do I configure them?

The Karpenter docs discuss this here: https://karpenter.sh/v0.32/concepts/nodepools/#specdisruption

I'm not even able to see a CRD for karpenter NodePools, so I'm guessing kops has another way of managing the disruption controls?

  disruption:
    consolidationPolicy: WhenUnderutilized
    expireAfter: 720h # 30 * 24h = 720h
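
For context, in the v0.32 docs linked above this block sits directly under the NodePool's spec. The following is a minimal sketch of a complete manifest, assuming a Karpenter release that serves the karpenter.sh/v1beta1 API. The two `default` names and the single requirement are placeholders, and nothing in this thread confirms that a kOps-managed Karpenter accepts it.

```yaml
# Sketch only: assumes Karpenter >= v0.32 (karpenter.sh/v1beta1).
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default              # hypothetical NodePool name
spec:
  template:
    spec:
      nodeClassRef:
        name: default        # hypothetical EC2NodeClass name
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
  disruption:
    consolidationPolicy: WhenUnderutilized
    expireAfter: 720h        # 30 * 24h = 720h
```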

clayrisser, Feb 02 '24 10:02

From what I can tell, kOps currently installs Karpenter 0.31.3 by default, which, according to the docs, predates the NodePool concept (I hope I'm not wrong there). Ref: https://github.com/kubernetes/kops/blob/d489024714013523bb1df74a58eaa9b99f6805b2/pkg/model/components/karpenter.go#L38-L40. This leads me to believe that NodePool disruption controls aren't supported in kOps right now, so we might need to put in some effort to add them.
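
For comparison, Karpenter releases before v0.32 use the Provisioner API (karpenter.sh/v1alpha5), where the closest counterparts to the requested fields are `consolidation.enabled` and `ttlSecondsUntilExpired`. The sketch below assumes a hand-written Provisioner; the names are hypothetical. This thread does not establish whether kOps exposes these fields or overwrites manual changes to the Provisioners it manages.

```yaml
# Sketch only: pre-v0.32 Provisioner API (karpenter.sh/v1alpha5).
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default                    # hypothetical Provisioner name
spec:
  providerRef:
    name: default                  # hypothetical AWSNodeTemplate name
  consolidation:
    enabled: true                  # rough counterpart of consolidationPolicy
  ttlSecondsUntilExpired: 2592000  # 720h * 3600 s/h, counterpart of expireAfter
```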

I don't mind taking a stab at this one, wdyt @hakman @rifelpet @olemarkus ?

moshevayner, Feb 06 '24 04:02

> I don't mind taking a stab at this one, wdyt @hakman @rifelpet @olemarkus ?

My impression is that, if we want to move Karpenter support to a newer version, we would need to move from providing the LaunchTemplates to doing everything via Karpenter objects.

https://github.com/kubernetes/kops/blob/d489024714013523bb1df74a58eaa9b99f6805b2/upup/models/cloudup/resources/addons/karpenter.sh/k8s-1.19.yaml.template#L1796-L1874
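
To illustrate what "doing everything via Karpenter objects" could look like: in the v1beta1 API, node settings that a launch template would otherwise carry are declared in an EC2NodeClass, which a NodePool references by name. This is a rough sketch under that assumption. The role name and the discovery tag values are hypothetical, and this thread does not settle what kOps would actually generate.

```yaml
# Sketch only: assumes Karpenter >= v0.32 (karpenter.k8s.aws/v1beta1).
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default                       # referenced by a NodePool's nodeClassRef
spec:
  amiFamily: AL2
  role: KarpenterNodeRole-example     # hypothetical IAM role name
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: example-cluster   # hypothetical tag value
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: example-cluster   # hypothetical tag value
```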

hakman, Feb 06 '24 04:02

> My impression is that, if we want to move Karpenter support to a newer version, we would need to move from providing the LaunchTemplates to doing everything via Karpenter objects.

Yeah, that makes sense to me. So would that (theoretically) be a similar process to any other cloudup add-on, such as aws-cni, where we update the template (and potentially supporting resources such as template functions) to match the vendor chart?

moshevayner, Feb 06 '24 04:02

Yes. The good part is that we have a Karpenter e2e test, so it should be easy to test via a WIP PR.

hakman, Feb 06 '24 05:02

Sounds good! I'll give that a try. Thanks!

/assign

moshevayner, Feb 06 '24 05:02

From my understanding this is unlikely to be possible, but it doesn't hurt to ask: is there any workaround for getting upstream Karpenter to manage InstanceGroups in the current kOps release?

teocns, Mar 16 '24 01:03

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot, Jun 14 '24 02:06

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot, Jul 14 '24 02:07

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot, Aug 13 '24 03:08

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot, Aug 13 '24 03:08