
Publishing failed pod schedule events can lead to etcd overflow

Open bacek opened this issue 9 months ago • 8 comments

We have multiple NodePools in our system for better resource isolation and differing requirements, and we do processing with a large number of Kubernetes Jobs. Scheduling multiple Pods triggers scale-up of different NodePools. Karpenter iterates over all existing NodePools to find the correct one, but for each incompatible NodePool it emits an error message:

https://github.com/kubernetes-sigs/karpenter/blob/7bf31e553f390111058d16b6cd5745ed144d3de8/pkg/controllers/provisioning/scheduling/scheduler.go#L405

This message is in turn emitted as a k8s Event in:

https://github.com/kubernetes-sigs/karpenter/blob/7bf31e553f390111058d16b6cd5745ed144d3de8/pkg/events/recorder.go#L86

With multiple NodePools, each message will be on the order of 10-20 kilobytes, and with ~10k pods this will overflow the etcd database.
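
To make the growth concrete, here is a minimal, hypothetical Go sketch (buildSchedulingError and the reason format are illustrative stand-ins, not Karpenter's actual code) showing how a single event message that concatenates one reason per incompatible NodePool grows linearly with the NodePool count:

package main

import (
	"fmt"
	"strings"
)

// buildSchedulingError is a hypothetical stand-in for the message assembled
// when a pod fits no NodePool: one reason string per incompatible NodePool,
// all joined into a single error message.
func buildSchedulingError(nodePools []string) string {
	reasons := make([]string, 0, len(nodePools))
	for _, np := range nodePools {
		reasons = append(reasons, fmt.Sprintf(
			"incompatible with nodepool %q, no instance type satisfied resources and requirements {...}", np))
	}
	return "Failed to schedule pod, " + strings.Join(reasons, "; ")
}

func main() {
	pools := make([]string, 40)
	for i := range pools {
		pools[i] = fmt.Sprintf("team-%02d-pool", i)
	}
	msg := buildSchedulingError(pools)
	// Each NodePool adds a fixed-size chunk, so the message size is linear
	// in the NodePool count; with real requirement dumps the per-pool chunks
	// are far larger than this sketch's, reaching the tens of KB described above.
	fmt.Printf("nodepools=%d message_bytes=%d\n", len(pools), len(msg))
}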

bacek avatar Mar 13 '25 03:03 bacek

Correction: in our case events were ~76KB.

bacek avatar Mar 13 '25 03:03 bacek

And having ~10k pods will overflow etcd database

How does 76KB overflow the DB? That seems like a pretty small number to me. Also, containerd and kube-scheduler generally emit more events than Karpenter does, so I'm curious whether you are seeing the same issue with those components.

jonathan-innis avatar Mar 13 '25 16:03 jonathan-innis

And having ~10k pods will overflow etcd database

How does 76KB overflow the DB? That seems like a pretty small number to me. Also, containerd and kube-scheduler generally emit more events than Karpenter does, so I'm curious whether you are seeing the same issue with those components.

We don't have enough compute to provision for all pods, so we have some long-running jobs waiting while we utilize all available nodes.

At the time of the etcd overflow, the state looked like this:

kubectl get --raw=/metrics | grep apiserver_storage_objects | awk '$2>100' | sort -g -k 2
# HELP apiserver_storage_objects [STABLE] Number of stored objects at the time of last check split by kind. In case of a fetching error, the value will be -1.
# TYPE apiserver_storage_objects gauge
apiserver_storage_objects{resource="virtualservices.networking.istio.io"} 110
apiserver_storage_objects{resource="roles.rbac.authorization.k8s.io"} 112
apiserver_storage_objects{resource="deployments.apps"} 126
apiserver_storage_objects{resource="rolebindings.rbac.authorization.k8s.io"} 126
apiserver_storage_objects{resource="controllerrevisions.apps"} 130
apiserver_storage_objects{resource="clusterrolebindings.rbac.authorization.k8s.io"} 148
apiserver_storage_objects{resource="clusterroles.rbac.authorization.k8s.io"} 174
apiserver_storage_objects{resource="endpoints"} 188
apiserver_storage_objects{resource="endpointslices.discovery.k8s.io"} 202
apiserver_storage_objects{resource="serviceaccounts"} 205
apiserver_storage_objects{resource="services"} 229
apiserver_storage_objects{resource="configmaps"} 269
apiserver_storage_objects{resource="nodeclaims.karpenter.sh"} 278
apiserver_storage_objects{resource="cninodes.vpcresources.k8s.aws"} 280
apiserver_storage_objects{resource="csinodes.storage.k8s.io"} 280
apiserver_storage_objects{resource="nodes"} 280
apiserver_storage_objects{resource="leases.coordination.k8s.io"} 337
apiserver_storage_objects{resource="certificatesigningrequests.certificates.k8s.io"} 343
apiserver_storage_objects{resource="secrets"} 391
apiserver_storage_objects{resource="replicasets.apps"} 547
apiserver_storage_objects{resource="jobs.batch"} 1744
apiserver_storage_objects{resource="pods"} 4596
apiserver_storage_objects{resource="events"} 955190

So ~4.5k pods generated almost 1M events, which at ~76KB per event translates to roughly 76GB worth of events.
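
For anyone who wants to verify these numbers on their own cluster, here is a minimal client-go sketch (the kubeconfig path and the 500-item page size are assumptions, and JSON size only approximates what etcd stores) that pages through all events and sums their serialized sizes:

package main

import (
	"context"
	"encoding/json"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the default kubeconfig; adjust for in-cluster use.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	var totalBytes, count int
	opts := metav1.ListOptions{Limit: 500} // page through events to avoid huge responses
	for {
		events, err := clientset.CoreV1().Events("").List(context.TODO(), opts)
		if err != nil {
			panic(err)
		}
		for _, e := range events.Items {
			b, _ := json.Marshal(e) // JSON size approximates storage size
			totalBytes += len(b)
			count++
		}
		if events.Continue == "" {
			break
		}
		opts.Continue = events.Continue
	}
	if count > 0 {
		fmt.Printf("events=%d total_bytes=%d avg_bytes=%d\n", count, totalBytes, totalBytes/count)
	}
}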

bacek avatar Mar 16 '25 23:03 bacek

/triage needs-investigation
/priority needs-more-evidence

jmdeal avatar Mar 17 '25 16:03 jmdeal

@jmdeal: The label(s) priority/needs-more-evidence cannot be applied, because the repository doesn't have them.

In response to this:

/triage needs-investigation
/priority needs-more-evidence

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot avatar Mar 17 '25 16:03 k8s-ci-robot

Is this on a self-managed cluster?

rschalo avatar Mar 26 '25 02:03 rschalo

Is this on a self-managed cluster?

No, it's EKS.

bacek avatar Mar 26 '25 02:03 bacek

Is there any news about this? Our solution has been to build a fork that limits the message size (and therefore makes the message useless). We keep reaching the etcd limit, and this Karpenter message is the main culprit.
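
For reference, a fork like that can be quite small. Here is a hedged sketch (the 1KB cap and the type names are hypothetical, and this wraps the generic client-go recorder rather than Karpenter's internal one) of a recorder that truncates messages before they reach the API server:

package events

import (
	"fmt"

	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/client-go/tools/record"
)

const maxMessageBytes = 1024 // hypothetical cap; tune for your cluster

// TruncatingRecorder wraps a record.EventRecorder and caps event message
// length. Note that AnnotatedEventf still falls through to the wrapped
// recorder uncapped; a complete fork would override it too.
type TruncatingRecorder struct {
	record.EventRecorder
}

func (r TruncatingRecorder) Event(object runtime.Object, eventtype, reason, message string) {
	r.EventRecorder.Event(object, eventtype, reason, truncate(message))
}

func (r TruncatingRecorder) Eventf(object runtime.Object, eventtype, reason, messageFmt string, args ...interface{}) {
	r.Event(object, eventtype, reason, fmt.Sprintf(messageFmt, args...))
}

func truncate(msg string) string {
	if len(msg) <= maxMessageBytes {
		return msg
	}
	return msg[:maxMessageBytes] + " ...(truncated)"
}

Wrapping the recorder this way keeps the change to a single file, at the cost of losing the truncated per-NodePool detail in the event.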

JonCholas avatar Apr 02 '25 23:04 JonCholas

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jul 02 '25 00:07 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Aug 01 '25 00:08 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar Aug 31 '25 00:08 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot avatar Aug 31 '25 00:08 k8s-ci-robot