Karpenter going into CrashLoopBackOff after EKS upgrade from 1.32 to 1.33
Hi Team,
We observed an issue where Karpenter 1.8.1 pods started going into CrashLoopBackOff. The pod describe events show only the liveness and readiness probes failing.
Changes: EKS was upgraded from 1.32 to 1.33, and the Karpenter 1.8.1 pods went into a crash loop.
No logs showed any useful information.
Hi @sivamalla42, I can help dig into this.
Could you please provide a few more details to help diagnose the crash?
- Exact Versions
- Pod Details & Events: looking for probe failures or volume mount errors in the events.
- Previous Logs: since the current logs might be empty if the pod crashes immediately, checking the previous instance is critical. (A sketch of commands to collect all of the above follows below.)
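For reference, something like the following should capture all of that (a sketch, assuming the release is installed in the `karpenter` namespace and the container uses the chart-default name `controller`; replace `<karpenter-pod-name>` with the pod that is currently crashlooping):

```sh
# Chart and app versions
helm list -n karpenter

# Client and cluster (EKS) versions
kubectl version

# Pod status and recent events (probe failures, volume mount errors, restarts)
kubectl get pods -n karpenter -o wide
kubectl describe pod -n karpenter <karpenter-pod-name>

# Logs from the previous (crashed) container instance
kubectl logs -n karpenter <karpenter-pod-name> -c controller --previous
```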
Hi @AbhinavPInamdar,
Thanks for your response. Please find the details below.
Exact Versions:
```
$ helm list -n karpenter
NAME           NAMESPACE  REVISION  UPDATED                               STATUS    CHART                APP VERSION
karpenter      karpenter  4         2025-12-15 17:09:55.809541 +0530 IST  failed    karpenter-1.8.3      1.8.3
karpenter-crd  karpenter  3         2025-12-15 16:29:24.42723 +0530 IST   deployed  karpenter-crd-1.8.3  1.8.3
```
EKS version: v1.33.5-eks-ecaa3a6
Pod Details & Events: looking for probe failures or volume mount errors in the events.
Events:
```
Type     Reason     Age                       From     Message
Warning  Unhealthy  29m (x66 over 3h16m)      kubelet  (combined from similar events): Readiness probe failed: Get "http://XXXXX:8081/readyz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Warning  Unhealthy  19m (x129 over 3h29m)     kubelet  Liveness probe failed: Get "http://XXXX:8081/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Normal   Killing    9m27s (x47 over 3h28m)    kubelet  Container controller failed liveness probe, will be restarted
Warning  Unhealthy  8m46s (x115 over 3h29m)   kubelet  Readiness probe failed: Get "http://XXXXXX:8081/readyz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Warning  BackOff    4m25s (x518 over 3h16m)   kubelet  Back-off restarting failed container controller in pod karpenter-f874cbf47-kgwzx_karpenter(fc8121e5-6b94-4270-b756-e8fc02716d85)
Normal   Pulled     2m15s (x49 over 3h30m)    kubelet  Container image "public.ecr.aws/karpenter/controller:1.8.3@sha256:2814fc3dfd440118fae8c75426d469f7015d77353c8b4b1c2fe6398838fd6627" already present on machine
Normal   Started    7s (x50 over 3h30m)       kubelet  Started container controller
```
Previous Logs: since the current logs might be empty if it crashes immediately, checking the previous instance is critical. Unfortunately, no logs are available yet.
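If it helps, we can also pull the previous container's logs and its last termination state directly; a sketch using the pod name from the events above (the termination state should show an exit code or reason even when nothing was logged):

```sh
# Logs from the previous (killed) controller container, if any were written
kubectl logs -n karpenter karpenter-f874cbf47-kgwzx -c controller --previous

# Last termination state (exit code, reason, timestamps) for the controller container
kubectl get pod -n karpenter karpenter-f874cbf47-kgwzx \
  -o jsonpath='{.status.containerStatuses[?(@.name=="controller")].lastState.terminated}'
```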
To add some background: Karpenter was installed successfully with 1.8.1 on EKS 1.32 and everything was working as expected, but once EKS 1.32 was upgraded to 1.33, we observed the pods going into CrashLoopBackOff.
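Since the events only show probe timeouts on port 8081 rather than an explicit crash, one diagnostic we are considering is temporarily increasing the probe timeouts to see whether the controller is simply slow to respond after the upgrade. A rough sketch only; the Deployment name `karpenter` and container name `controller` are assumed from the chart defaults, and Helm will revert any manual patch on the next upgrade:

```sh
# Diagnostic only: temporarily raise liveness/readiness probe timeouts on the
# (assumed) "controller" container of the (assumed) "karpenter" Deployment.
kubectl patch deployment karpenter -n karpenter --type=strategic -p \
  '{"spec":{"template":{"spec":{"containers":[{"name":"controller","livenessProbe":{"timeoutSeconds":30},"readinessProbe":{"timeoutSeconds":30}}]}}}}'
```

If the pods then become Ready, that would point to the controller being slow or resource-starved on 1.33 rather than crashing outright.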