karpenter-provider-aws
Stuck with infinite loop of "waiting on cluster sync" and "abnormal time between runs"
Description
Observed Behavior: Our Karpenter pod suddenly stopped provisioning nodes, and the only log output shows it stuck in an infinite loop. After I restart the pod, it processes normally for some time and then enters the same loop again.
{"level":"DEBUG","time":"2024-08-04T19:32:56.983Z","logger":"controller","message":"waiting on cluster sync","commit":"490ef94","controller":"disruption"}
{"level":"DEBUG","time":"2024-08-04T19:32:56.983Z","logger":"controller","message":"abnormal time between runs of *disruption.Expiration = 42h11m58.850461658s","commit":"490ef94","controller":"disruption"}
{"level":"DEBUG","time":"2024-08-04T19:32:57.983Z","logger":"controller","message":"abnormal time between runs of *disruption.Expiration = 42h11m59.850931899s","commit":"490ef94","controller":"disruption"}
{"level":"DEBUG","time":"2024-08-04T19:32:58.092Z","logger":"controller","message":"waiting on cluster sync","commit":"490ef94","controller":"disruption"}
{"level":"DEBUG","time":"2024-08-04T19:32:58.092Z","logger":"controller","message":"abnormal time between runs of *disruption.Expiration = 42h11m59.959476145s","commit":"490ef94","controller":"disruption"}
{"level":"DEBUG","time":"2024-08-04T19:32:59.092Z","logger":"controller","message":"abnormal time between runs of *disruption.Expiration = 42h12m0.960108358s","commit":"490ef94","controller":"disruption"}
{"level":"DEBUG","time":"2024-08-04T19:32:59.160Z","logger":"controller","message":"waiting on cluster sync","commit":"490ef94","controller":"disruption"}
{"level":"DEBUG","time":"2024-08-04T19:32:59.160Z","logger":"controller","message":"abnormal time between runs of *disruption.Expiration = 42h12m1.027746864s","commit":"490ef94","controller":"disruption"}
Any ideas about the potential cause?
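To make the loop easier to quantify, here is a minimal sketch that tallies the JSON log lines by controller and message. It assumes the logs are one JSON object per line with the `controller` and `message` fields shown above; the function name `summarize` is hypothetical and not part of Karpenter. It only counts repeated messages and does not diagnose the cause.

```python
import json
from collections import Counter


def summarize(lines):
    """Count (controller, message) pairs in JSON-formatted log lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if not isinstance(entry, dict):
            continue
        message = entry.get("message", "")
        # Drop the changing duration after " = " so repeats group together.
        kind = message.split(" = ", 1)[0]
        counts[(entry.get("controller", "unknown"), kind)] += 1
    return counts


# Three lines copied from the log excerpt above.
sample = [
    '{"level":"DEBUG","time":"2024-08-04T19:32:56.983Z","logger":"controller","message":"waiting on cluster sync","commit":"490ef94","controller":"disruption"}',
    '{"level":"DEBUG","time":"2024-08-04T19:32:56.983Z","logger":"controller","message":"abnormal time between runs of *disruption.Expiration = 42h11m58.850461658s","commit":"490ef94","controller":"disruption"}',
    '{"level":"DEBUG","time":"2024-08-04T19:32:57.983Z","logger":"controller","message":"abnormal time between runs of *disruption.Expiration = 42h11m59.850931899s","commit":"490ef94","controller":"disruption"}',
]

for (controller, kind), n in summarize(sample).most_common():
    print(f"{n:5d}  {controller}  {kind}")
```

Running this over a longer capture of the pod's logs would show whether any controller other than `disruption` is still logging while the loop is active.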
Expected Behavior:
Reproduction Steps (Please include YAML):
Versions:
- Chart Version: 0.37.0
- Kubernetes Version (`kubectl version`): 1.29
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment