[kustomize-controller] is not restarting when source-controller and notification-controller are not reachable.

Open Cellebyte opened this issue 2 years ago • 2 comments

Describe the bug

The kustomize controller was not able to reach the source-controller and the notification-controller in its namespace.

I deployed a debug pod to check whether the components were really unreachable, but they were not: from the debug pod I was able to curl both the notification-controller and the source-controller services. After I manually deleted the kustomize-controller pod, it came back to life and worked as normal. It smells like a caching problem or something similar. Restarting all network components (calico-node and kube-proxy) on the node running the kustomize-controller did not help.

Steps to reproduce

Sadly N/A

Expected behavior

kustomize-controller should try to restart and recover when the other components are not reachable, or at least should not report a running/ready state.

Screenshots and recordings

No response

OS / Distro

Ubuntu 20.04

Flux version

0.17.2

Flux check

N/A

Git provider

N/A

Container Registry provider

N/A

Additional context

{"level":"error","ts":"2021-10-03T22:11:57.060Z","logger":"controller.kustomization","msg":"Reconciliation failed after 8m35.066664728s, next try in 5m0s","reconciler group":"kustomize.toolkit.fluxcd.io","reconciler kind":"Kustomization","name":"customer-components-default","namespace":"schiff-system","revision":"main/e22d0d6fe4ba8d9fe10a268b717ef86aed72e255","error":"failed to download artifact, error: GET http://source-controller.flux-system.svc.cluster.local./gitrepository/schiff-system/customer-components/e22d0d6fe4ba8d9fe10a268b717ef86aed72e255.tar.gz giving up after 10 attempt(s): Get \"http://source-controller.flux-system.svc.cluster.local./gitrepository/schiff-system/customer-components/e22d0d6fe4ba8d9fe10a268b717ef86aed72e255.tar.gz\": dial tcp 10.108.144.85:80: i/o timeout"}
{"level":"error","ts":"2021-10-03T22:12:13.392Z","logger":"controller.kustomization","msg":"Reconciliation failed after 8m35.087729155s, next try in 5m0s","reconciler group":"kustomize.toolkit.fluxcd.io","reconciler kind":"Kustomization","name":"components-default","namespace":"schiff-system","revision":"main/dc0e796582e5f33a6cc1b748e38238889941a8fc","error":"failed to download artifact, error: GET http://source-controller.flux-system.svc.cluster.local./gitrepository/schiff-system/components/dc0e796582e5f33a6cc1b748e38238889941a8fc.tar.gz giving up after 10 attempt(s): Get \"http://source-controller.flux-system.svc.cluster.local./gitrepository/schiff-system/components/dc0e796582e5f33a6cc1b748e38238889941a8fc.tar.gz\": dial tcp 10.108.144.85:80: i/o timeout"}
{"level":"error","ts":"2021-10-03T22:12:41.312Z","logger":"controller.kustomization","msg":"unable to send event","reconciler group":"kustomize.toolkit.fluxcd.io","reconciler kind":"Kustomization","name":"components-providers-vsphere-ref","namespace":"schiff-system","error":"POST http://notification-controller/ giving up after 5 attempt(s): Post \"http://notification-controller/\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"}
{"level":"error","ts":"2021-10-03T22:12:41.315Z","logger":"controller.kustomization","msg":"unable to send event","reconciler group":"kustomize.toolkit.fluxcd.io","reconciler kind":"Kustomization","name":"components-customers-wlan-ref","namespace":"schiff-system","error":"POST http://notification-controller/ giving up after 5 attempt(s): Post \"http://notification-controller/\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"}
{"level":"error","ts":"2021-10-03T22:12:42.067Z","logger":"controller.kustomization","msg":"unable to send event","reconciler group":"kustomize.toolkit.fluxcd.io","reconciler kind":"Kustomization","name":"customer-components-default","namespace":"schiff-system","error":"POST http://notification-controller/ giving up after 5 attempt(s): Post \"http://notification-controller/\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"}
{"level":"error","ts":"2021-10-03T22:12:58.399Z","logger":"controller.kustomization","msg":"unable to send event","reconciler group":"kustomize.toolkit.fluxcd.io","reconciler kind":"Kustomization","name":"components-default","namespace":"schiff-system","error":"POST http://notification-controller/ giving up after 5 attempt(s): Post \"http://notification-controller/\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"}

Code of Conduct

  • [X] I agree to follow this project's Code of Conduct

Cellebyte, Oct 05 '21 08:10