Controller restart loop on first install of the Helm chart

Open svonliebenstein opened this issue 3 years ago • 6 comments

Which component: Controller / Helm Chart Deployment.yaml

Is your feature request related to a problem? Please describe.

On our latest deployments to several private AKS 1.21 clusters, every initial deployment fails. The cause is the way the liveness probe is set up. The liveness probe checks the /healthz endpoint, which is only available once the HTTP server has started. Before starting the HTTP server, the controller checks whether a private key secret already exists. On a first install it never does, so the controller generates a private key.

For some reason, checking for the secret and generating a private key takes longer than the default liveness probe configuration allows. Because the liveness probe triggers restarts before the key is created, the controller never gets to start and stay alive.

Describe the solution you'd like

A startup probe could mitigate this issue. Alternatively, the chart could expose more control over the liveness probe configuration itself; it currently uses the Kubernetes defaults with no option to change the values.
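To make the timing argument concrete, here is a rough sketch of how much time each kind of probe leaves for startup. It assumes the standard Kubernetes probe defaults (initialDelaySeconds 0, periodSeconds 10, failureThreshold 3) and approximates the budget as initial delay plus failureThreshold times periodSeconds; the real kubelet timing also involves jitter and the probe timeout. The `Probe` class, the `survives_startup` helper, and the 45-second key-generation time are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Probe:
    """Subset of Kubernetes probe settings, with the Kubernetes defaults."""
    initial_delay_seconds: int = 0
    period_seconds: int = 10
    failure_threshold: int = 3

    def budget_seconds(self) -> int:
        # Approximate time before the probe has failed `failure_threshold`
        # times in a row (ignores kubelet jitter and the probe timeout).
        return self.initial_delay_seconds + self.failure_threshold * self.period_seconds


def survives_startup(startup_seconds: int, liveness: Probe,
                     startup_probe: Optional[Probe] = None) -> bool:
    """Return True if the container is not restarted before it finishes starting.

    While a startup probe is configured and has not yet succeeded, the
    liveness probe is not run, so only the startup probe's budget applies.
    """
    active = startup_probe if startup_probe is not None else liveness
    return startup_seconds <= active.budget_seconds()


# Hypothetical figure: key generation takes 45 s on a CPU-starved node.
key_generation_seconds = 45

# Default liveness probe only: roughly 30 s of budget, so the pod restarts.
print(survives_startup(key_generation_seconds, Probe()))

# Startup probe with failureThreshold 30: roughly 300 s of budget.
print(survives_startup(key_generation_seconds, Probe(), Probe(failure_threshold=30)))
```

Under these assumptions, raising initialDelaySeconds on the liveness probe and adding a startup probe both widen the same budget; the startup probe does so without delaying liveness checks once the controller is running.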

Describe alternatives you've considered

Additional context

svonliebenstein avatar Dec 22 '21 13:12 svonliebenstein

Hi @svonliebenstein

Would it help if we parametrized the probes? That is, we could expose parameters in the chart that let you modify the initialDelaySeconds for the liveness and readiness probes.

A startup probe can be used for mitigating this issue.

We can add support for this, although disabled by default, since older K8s distros don't support this kind of probe.

juan131 avatar Dec 24 '21 14:12 juan131

Hi, sorry for the late reply. I think both options would be nice for our use case.

svonliebenstein avatar Dec 29 '21 08:12 svonliebenstein

So... I found the root cause of our issue, @juan131. Our CPU limit was set too low, which caused the slow key generation.

As this fixes our problem, I'll close this issue.

svonliebenstein avatar Dec 30 '21 08:12 svonliebenstein

Great, @svonliebenstein! I'm glad you were able to solve it!

juan131 avatar Jan 13 '22 07:01 juan131

I think this should be reopened. I have the CPU limit at 2000m, and the controller still has a hard time starting up within the readiness probe time limit.

I propose that some default resource values be provided in the values.yaml file, commented out. This seems to be the norm in other helm files.

ngbrown avatar Mar 03 '22 05:03 ngbrown

Hi @ngbrown

You can easily customize the resource requests and limits by setting the parameters below:

  • https://github.com/bitnami-labs/sealed-secrets/blob/main/helm/sealed-secrets/values.yaml#L72
  • https://github.com/bitnami-labs/sealed-secrets/blob/main/helm/sealed-secrets/values.yaml#L73

I also created a PR to make the probes (readiness/liveness/startup) customizable, see:

  • https://github.com/bitnami-labs/sealed-secrets/pull/764

juan131 avatar Mar 03 '22 09:03 juan131