source-controller
reconciliation for git source stuck with growing `workqueue_unfinished_work_seconds`
Hi, on my Flux installation on AWS EKS, running these versions:
flux: v2.1.2
helm-controller: v0.36.2
image-automation-controller: v0.27.0
image-reflector-controller: v0.23.0
kustomize-controller: v1.1.1
notification-controller: v1.1.0
source-controller: v1.1.2
periodically I see source-controller getting stuck and not starting or completing any reconciliation of Git sources, while Helm sources seem to run fine.
I can't find anything in the source-controller log during, or just before, the period when `workqueue_unfinished_work_seconds` starts to grow.
Any pointers on where to look to find a solution? Or on what information might help to dig deeper?
Do you see anything in the controller logs? Please run the controller with `--log-level=debug`. Please also check its resource consumption metrics. A description of the Git repository would help as well.
I came across this issue while researching the "unfinished" metric. Although I am not using Flux, my controller had a thread stuck because of https://github.com/kubernetes-sigs/controller-runtime/issues/2231 . I did not end up needing it, but https://github.com/felixge/fgprof was recommended to me for detecting this kind of scenario.