Reloader
Reloader frequent restarts due to failed liveness probes
I'm facing an issue with frequent restarts on v0.0.95.
The liveness endpoint (/metrics) sometimes takes 1-5 seconds to respond. Of course I can just increase the probe timeout (1 second by default), but that only hides the problem. I believe there is some inefficiency in the code that affects even /metrics responses.
There are quite a lot of secrets and configmaps in the cluster, so that might put some strain on it, but there are no CPU or memory limits, so it should just take as much as it needs and keep working. I think /metrics should have its own thread, or better yet, there should be dedicated /readiness and /liveness endpoints that actually check and report the status of the service. Otherwise it's unreliable to run in production, especially considering there is no HA: if the pod is restarted, I believe it will lose any info about the resources and may miss triggering reloads.
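Something like the rough sketch below is what I mean by dedicated endpoints (this is just an illustration, not Reloader's actual code; the /live and /ready paths, the port, and the readiness flag are placeholders): serve the probes from their own lightweight handlers and goroutine so that a slow /metrics scrape can never fail the liveness check.

```go
package main

import (
	"net/http"
	"sync/atomic"
	"time"
)

// ready would be flipped once the controller's caches have synced
// (illustrative only; Reloader may track readiness differently).
var ready atomic.Bool

func main() {
	// Hypothetical health server, separate from the /metrics handler,
	// so heavy metrics collection can never delay probe responses.
	health := http.NewServeMux()
	health.HandleFunc("/live", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK) // process is up and serving HTTP
	})
	health.HandleFunc("/ready", func(w http.ResponseWriter, r *http.Request) {
		if ready.Load() {
			w.WriteHeader(http.StatusOK)
			return
		}
		w.WriteHeader(http.StatusServiceUnavailable)
	})

	// Run the health endpoints on their own port and goroutine.
	go func() {
		srv := &http.Server{Addr: ":8080", Handler: health, ReadHeaderTimeout: 5 * time.Second}
		_ = srv.ListenAndServe()
	}()

	// ... existing work would go here: watch ConfigMaps/Secrets, serve /metrics, etc.
	// Simulate the caches finishing their initial sync shortly after startup.
	time.Sleep(2 * time.Second)
	ready.Store(true)

	select {} // placeholder for the controller's run loop
}
```

With something along these lines, the liveness probe stays cheap and constant-time no matter how long /metrics takes to render.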
> There are quite a lot of secrets and configmaps in the cluster
How many exactly?
> How many exactly?
979 (configmaps + secrets)
Although I want to note that only some of the workloads have the reloader annotation, and each of those references only one secret.
@faizanahmad055 can you take a look?
@rasheedamir sure will take a look.