karpenter-provider-aws

Stuck with infinite loop of "waiting on cluster sync" and "abnormal time between runs"

Open WxFang opened this issue 6 months ago • 6 comments

Description

Observed Behavior: Our Karpenter pod suddenly stopped provisioning nodes, and its only log output shows it stuck in an infinite loop. After I restart the pod, it processes for some time and then enters the infinite loop again.

{"level":"DEBUG","time":"2024-08-04T19:32:56.983Z","logger":"controller","message":"waiting on cluster sync","commit":"490ef94","controller":"disruption"}
{"level":"DEBUG","time":"2024-08-04T19:32:56.983Z","logger":"controller","message":"abnormal time between runs of *disruption.Expiration = 42h11m58.850461658s","commit":"490ef94","controller":"disruption"}
{"level":"DEBUG","time":"2024-08-04T19:32:57.983Z","logger":"controller","message":"abnormal time between runs of *disruption.Expiration = 42h11m59.850931899s","commit":"490ef94","controller":"disruption"}
{"level":"DEBUG","time":"2024-08-04T19:32:58.092Z","logger":"controller","message":"waiting on cluster sync","commit":"490ef94","controller":"disruption"}
{"level":"DEBUG","time":"2024-08-04T19:32:58.092Z","logger":"controller","message":"abnormal time between runs of *disruption.Expiration = 42h11m59.959476145s","commit":"490ef94","controller":"disruption"}
{"level":"DEBUG","time":"2024-08-04T19:32:59.092Z","logger":"controller","message":"abnormal time between runs of *disruption.Expiration = 42h12m0.960108358s","commit":"490ef94","controller":"disruption"}
{"level":"DEBUG","time":"2024-08-04T19:32:59.160Z","logger":"controller","message":"waiting on cluster sync","commit":"490ef94","controller":"disruption"}
{"level":"DEBUG","time":"2024-08-04T19:32:59.160Z","logger":"controller","message":"abnormal time between runs of *disruption.Expiration = 42h12m1.027746864s","commit":"490ef94","controller":"disruption"}

Any ideas about the potential cause?
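
For triage, log lines like the ones above can be summarized with a small script. This is only a sketch, not part of Karpenter: it assumes the JSON log format and the two message strings shown in this issue, and the names `parse_go_duration` and `summarize` are hypothetical helpers. It counts the "waiting on cluster sync" messages and records the largest "abnormal time between runs" gap, in seconds, for each reported method.

```python
import json
import re
import sys

# Message format assumed from the log lines in this issue.
ABNORMAL_RE = re.compile(r"abnormal time between runs of (\S+) = (\S+)")

# Seconds per unit for Go-style duration strings such as "42h11m58.85s".
UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0,
                "ms": 1e-3, "us": 1e-6, "µs": 1e-6, "ns": 1e-9}
PART = r"(\d+(?:\.\d+)?)(h|ms|us|µs|ns|m|s)"
PART_RE = re.compile(PART)
FULL_RE = re.compile(r"(?:" + PART + r")+")


def parse_go_duration(text):
    """Convert a Go-style duration string to seconds (float)."""
    if not FULL_RE.fullmatch(text):
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(float(num) * UNIT_SECONDS[unit]
               for num, unit in PART_RE.findall(text))


def summarize(lines):
    """Return (sync-wait count, largest reported gap in seconds per method)."""
    waiting = 0
    max_gap = {}
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip blank or non-JSON lines
        if not isinstance(entry, dict):
            continue
        message = str(entry.get("message", ""))
        if message == "waiting on cluster sync":
            waiting += 1
            continue
        match = ABNORMAL_RE.search(message)
        if match:
            name = match.group(1)
            gap = parse_go_duration(match.group(2))
            max_gap[name] = max(gap, max_gap.get(name, 0.0))
    return waiting, max_gap


if __name__ == "__main__":
    # Reads log lines from standard input and prints the summary.
    waiting_count, gaps = summarize(sys.stdin)
    print(f"waiting on cluster sync: {waiting_count}")
    for method, seconds in gaps.items():
        print(f"{method}: max gap {seconds / 3600:.2f}h")
```

Piping the controller's log output into the script's standard input gives a quick view of how long the reported gap has grown. That value can then be compared against the time the pod was last restarted.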

Expected Behavior:

Reproduction Steps (Please include YAML):

Versions:

  • Chart Version: 0.37.0
  • Kubernetes Version (kubectl version): 1.29

WxFang · Aug 04 '24 20:08