[grafana] Set GOMAXPROCS and GOMEMLIMIT environment variables based on container resources
Set GOMAXPROCS and GOMEMLIMIT environment variables based on container resources. This should reduce potential CPU throttling and OOMKills on containers.
resourceFieldRef is a Kubernetes directive designed specifically for passing resource-related values into containers. It rounds the CPU value up to the nearest whole number (e.g. 250m becomes 1) and passes memory as a plain byte count, so 64Mi results in the environment variable being set to 67108864. This by design makes the values directly usable as GOMAXPROCS and GOMEMLIMIT.
An example is documented in the Kubernetes documentation itself: https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/#use-container-fields-as-values-for-environment-variables.
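For reference, the injected entries in the container spec end up looking roughly like this (a minimal sketch; the exact output of the chart template may differ slightly):

```yaml
env:
  - name: GOMAXPROCS
    valueFrom:
      resourceFieldRef:
        resource: limits.cpu      # divisor defaults to "1", so e.g. 250m is rounded up to 1
  - name: GOMEMLIMIT
    valueFrom:
      resourceFieldRef:
        resource: limits.memory   # exposed as a plain byte count, e.g. 64Mi -> 67108864
```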
Inspired by https://github.com/traefik/traefik-helm-chart/pull/1029.
Implementation notes:
- This was only added to the "main" Grafana container, given it's only relevant for Go-based applications
- This idea can probably be applied to other charts within this repository; I currently have no intention of following up on this myself, but anyone coming across this, feel free to do so!
I would not recommend setting GOMAXPROCS like that. It's counterproductive.
GOMAXPROCS sets the maximum number of CPUs that can be executing simultaneously.
However, if you give Grafana cpu=2 on a 16-core machine, the Grafana pod still has all 16 cores available, but the CFS quota only lets it use 1/8 of each CPU core.
With GOMAXPROCS set to 2, Grafana may only use 2 of those cores and would no longer benefit from all assigned resources.
See also a seven-year-old discussion in prometheus-operator: https://github.com/prometheus-operator/prometheus-operator/issues/501
I'm all fine with leaving GOMAXPROCS out, even though I still think it's beneficial from what I've seen and read. As soon as you actually use Kubernetes limits, you will experience some degree of throttling in practice, and setting GOMAXPROCS accordingly somewhat mitigates this. https://github.com/uber-go/automaxprocs is a widely used library that does a very similar thing, and it was also added as an opt-in to the Prometheus operator.
However, I do agree that the results and opinions I've seen are mixed, and this is potentially not a better-for-everyone solution. If removing this part (or making it opt-in/opt-out rather than mandatory) is considered preferable, just let me know.
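If that route is taken, an opt-in could look roughly like this in values.yaml (the key names below are purely illustrative and not existing chart options):

```yaml
# values.yaml (hypothetical option names, shown only as a sketch)
goRuntimeEnv:
  gomaxprocs: false   # leave GOMAXPROCS to the Go runtime (or automaxprocs)
  gomemlimit: true    # derive GOMEMLIMIT from the container memory resources
```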
please bump the chart version, ty!
@zalegrala @Sheikh-Abubaker please take a look as well.
Kubernetes 1.33 introduced an in-place pod resizing feature which can make the limits dynamic, meaning the value sourced from the env var at pod init might not reflect the actual running limit. You could end up with a GOMEMLIMIT higher (or lower) than the actual container/cgroup limit. Just something to be aware of, and possibly to make users aware of.
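For instance, with an in-place resize policy along the lines of the sketch below (field names from the in-place resize feature, shown purely as an illustration), the memory limit can change without the container restarting, so the GOMEMLIMIT captured from the env var at startup would go stale:

```yaml
containers:
  - name: grafana
    resizePolicy:
      - resourceName: memory
        restartPolicy: NotRequired   # limit may change in place; env vars are not re-evaluated
      - resourceName: cpu
        restartPolicy: NotRequired
```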
hi @jnoordsij
thank you for this really nice addition!
I'd just like to suggest a small change:
Since GOMEMLIMIT is just a "soft limit", the container's total memory usage can still exceed that value.
So, by setting it to the value of limits.memory, OOMs are still likely to happen. By using the value of requests.memory instead (see the sketch below), that risk could probably be reduced a lot, as long as you give some additional headroom with a higher limits.memory value.
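Concretely, that suggestion would amount to something like this (sketch only; the actual chart template may differ):

```yaml
- name: GOMEMLIMIT
  valueFrom:
    resourceFieldRef:
      resource: requests.memory   # instead of limits.memory, leaving headroom up to the limit
```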
This PR has been merged, so further suggestions to improve on this are slightly late. You are of course free to open a follow-up PR to improve upon the introduced behavior if you feel it is warranted. I've seen various suggestions for this elsewhere (see e.g. https://github.com/cert-manager/cert-manager/pull/6977#issuecomment-2180602404); maybe this can help you in arguing for your case.
As to whether or not using requests instead of limits should be considered an improvement, one may definitely argue either way. In particular because there's plenty of reason to argue these values should always be the same anyway (see e.g. https://home.robusta.dev/blog/kubernetes-memory-limit).
From my point of view, handling this inside a Kubernetes template or Helm chart is just a dirty hack. The program itself (for example Grafana) should be aware of container limits and configure itself accordingly, similar to how Prometheus does it. This should also solve the problem with in-place pod resizing.
The hacks at cert-manager are terrible; this should be handled by the program itself.
Vote here:
- https://github.com/grafana/grafana/issues/110733