flux2
[kustomize-controller] is not restarting when source-controller and notification-controller are not reachable.
Describe the bug
The kustomize-controller was not able to reach the source-controller or the notification-controller in its namespace.
I deployed a debug pod to check whether the components were really unreachable, but they were reachable: from the debug pod I was able to curl both the notification-controller and the source-controller services. After I manually deleted the kustomize-controller pod, it came back to life and worked as normal. It smells like a caching problem or something similar. I also restarted all network components (calico-node and kube-proxy) on the node running the kustomize-controller, but that did not help.
Steps to reproduce
Sadly N/A
Expected behavior
kustomize-controller should restart and recover when the other components are not reachable, or it should not report a running/ready state.
Screenshots and recordings
No response
OS / Distro
Ubuntu 20.04
Flux version
0.17.2
Flux check
N/A
Git provider
N/A
Container Registry provider
N/A
Additional context
{"level":"error","ts":"2021-10-03T22:11:57.060Z","logger":"controller.kustomization","msg":"Reconciliation failed after 8m35.066664728s, next try in 5m0s","reconciler group":"kustomize.toolkit.fluxcd.io","reconciler kind":"Kustomization","name":"customer-components-default","namespace":"schiff-system","revision":"main/e22d0d6fe4ba8d9fe10a268b717ef86aed72e255","error":"failed to download artifact, error: GET http://source-controller.flux-system.svc.cluster.local./gitrepository/schiff-system/customer-components/e22d0d6fe4ba8d9fe10a268b717ef86aed72e255.tar.gz giving up after 10 attempt(s): Get \"http://source-controller.flux-system.svc.cluster.local./gitrepository/schiff-system/customer-components/e22d0d6fe4ba8d9fe10a268b717ef86aed72e255.tar.gz\": dial tcp 10.108.144.85:80: i/o timeout"}
{"level":"error","ts":"2021-10-03T22:12:13.392Z","logger":"controller.kustomization","msg":"Reconciliation failed after 8m35.087729155s, next try in 5m0s","reconciler group":"kustomize.toolkit.fluxcd.io","reconciler kind":"Kustomization","name":"components-default","namespace":"schiff-system","revision":"main/dc0e796582e5f33a6cc1b748e38238889941a8fc","error":"failed to download artifact, error: GET http://source-controller.flux-system.svc.cluster.local./gitrepository/schiff-system/components/dc0e796582e5f33a6cc1b748e38238889941a8fc.tar.gz giving up after 10 attempt(s): Get \"http://source-controller.flux-system.svc.cluster.local./gitrepository/schiff-system/components/dc0e796582e5f33a6cc1b748e38238889941a8fc.tar.gz\": dial tcp 10.108.144.85:80: i/o timeout"}
{"level":"error","ts":"2021-10-03T22:12:41.312Z","logger":"controller.kustomization","msg":"unable to send event","reconciler group":"kustomize.toolkit.fluxcd.io","reconciler kind":"Kustomization","name":"components-providers-vsphere-ref","namespace":"schiff-system","error":"POST http://notification-controller/ giving up after 5 attempt(s): Post \"http://notification-controller/\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"}
{"level":"error","ts":"2021-10-03T22:12:41.315Z","logger":"controller.kustomization","msg":"unable to send event","reconciler group":"kustomize.toolkit.fluxcd.io","reconciler kind":"Kustomization","name":"components-customers-wlan-ref","namespace":"schiff-system","error":"POST http://notification-controller/ giving up after 5 attempt(s): Post \"http://notification-controller/\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"}
{"level":"error","ts":"2021-10-03T22:12:42.067Z","logger":"controller.kustomization","msg":"unable to send event","reconciler group":"kustomize.toolkit.fluxcd.io","reconciler kind":"Kustomization","name":"customer-components-default","namespace":"schiff-system","error":"POST http://notification-controller/ giving up after 5 attempt(s): Post \"http://notification-controller/\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"}
{"level":"error","ts":"2021-10-03T22:12:58.399Z","logger":"controller.kustomization","msg":"unable to send event","reconciler group":"kustomize.toolkit.fluxcd.io","reconciler kind":"Kustomization","name":"components-default","namespace":"schiff-system","error":"POST http://notification-controller/ giving up after 5 attempt(s): Post \"http://notification-controller/\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"}
Code of Conduct
- [X] I agree to follow this project's Code of Conduct