opentelemetry-demo
`flagd` and `flagd-ui` memory limits too low
Bug Report
Which version of the demo are you using? v2.0.2
Symptom
Installed the demo in k3s on a Multipass instance on Apple Silicon. The VM is configured with 4 CPUs, 8 GB RAM, and a 32 GB disk.
flagd is OOMKilled. I have tried both the Helm chart and the Kubernetes manifest.
What is the expected behavior?
flagd to start
What is the actual behavior?
flagd fails to start.
Reproduce
Use either the Helm chart or Kubernetes manifest from this repo.
Additional Context
By increasing the memory limit to 300Mi in the Kubernetes manifest for both flagd and flagd-ui, the service starts with no issues.
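For illustration, this is roughly what such a change looks like in a Deployment manifest; the surrounding fields and container name below are a sketch, not copied from the repo's manifest:

```yaml
# Sketch only: raise the flagd container's memory limit to 300Mi so the
# process is not OOMKilled. The layout is the standard Deployment shape,
# not the exact structure of the demo's kubernetes manifest.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flagd
spec:
  template:
    spec:
      containers:
        - name: flagd
          resources:
            limits:
              memory: "300Mi"
```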
Hitting this too. Thanks for the investigation and fix!
```
flagd-5ddb9d9b66-p5gkj 1/2 OOMKilled 4 (2m8s ago) 22h
```
I was too keen. Still OOMKilled / CrashLoopBackOff.
I also needed to increase flagd-ui via Helm. In case it helps someone:
values.yaml (extract)
```yaml
components:
  flagd:
    useDefault:
      env: true
    resources:
      limits:
        memory: "300Mi"
    # flagd-ui as a sidecar container in the same pod so the flag json file can be shared
    sidecarContainers:
      - name: flagd-ui
        useDefault:
          env: true
        resources:
          limits:
            memory: "300Mi"
```
Hey all, thanks for reporting this! We already have a PR on the Helm repo to update this; hopefully we can get it merged soon.
I already shared this via a different channel, but I think it is worth leaving a comment here as well.
It looks like the problems are not caused by an insufficient memory limit, but by the fact that some language runtimes, such as Go (https://kupczynski.info/posts/go-container-aware/, https://www.ardanlabs.com/blog/2024/02/kubernetes-memory-limits-go.html) and some Java VMs (https://cloud.theodo.com/en/blog/jvm-oom, https://factorhouse.io/blog/articles/corretto-memory-issues/), are not aware of or do not respect these resource limits. In my experience so far, if you set GOMEMLIMIT to 80% of the memory limit for all Go applications, it just works fine.
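As an illustration of that suggestion (not the repo's actual configuration), wiring GOMEMLIMIT to roughly 80% of a 300Mi limit could look like this in a container spec; the container name and exact values are assumptions for the example:

```yaml
# Sketch: give the Go runtime a soft memory limit at ~80% of the container's
# hard limit, so the garbage collector reclaims memory before the kernel
# OOM-kills the container.
containers:
  - name: flagd              # illustrative container name
    resources:
      limits:
        memory: "300Mi"      # hard limit enforced via cgroups
    env:
      - name: GOMEMLIMIT
        value: "240MiB"      # ~80% of 300Mi; the Go runtime accepts MiB/GiB suffixes
```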
@julianocosta89, has GOMEMLIMIT also been set in the k8s setup? I only see a PR for Docker.
You are right, @pellared! Reopening it.
Thanks for calling it out.
I hope https://github.com/open-telemetry/opentelemetry-demo/pull/2564 does the trick. I tested it locally and everything is running as expected:
```
accounting-7579cb5968-2mhzp 1/1 Running 0 9h
ad-95cf4fdb9-psr47 1/1 Running 0 11h
cart-b66f96d74-7qrx4 1/1 Running 0 9h
checkout-5447765f4d-slmgf 1/1 Running 0 9h
currency-56cc958ccc-hfmh9 1/1 Running 0 10h
email-5b5b6c76dd-4x4b9 1/1 Running 0 11h
flagd-fbfc695db-tfxtp 2/2 Running 0 9h
fraud-detection-5fcbfcd6d7-7bccp 1/1 Running 0 9h
frontend-c85dd77b4-9mj57 1/1 Running 0 11h
frontend-proxy-7444c899fc-4d5j8 1/1 Running 0 11h
grafana-7ff76587d8-nlp7k 1/1 Running 0 11h
image-provider-6d6bbddfff-pw65n 1/1 Running 0 11h
jaeger-76b4cc75dd-25hwn 1/1 Running 0 11h
kafka-5b9c6db458-xxmjl 1/1 Running 0 9h
load-generator-6bf6549cdf-ttjt6 1/1 Running 0 11h
opensearch-0 1/1 Running 0 9h
otel-collector-7fb566966b-p298p 1/1 Running 0 11h
payment-5f44569979-b7cc8 1/1 Running 0 11h
product-catalog-6dc749d657-kxjq7 1/1 Running 0 11h
prometheus-59fc4944c7-fllvw 1/1 Running 5 (19m ago) 11h
quote-96c6d5d74-stkhn 1/1 Running 0 11h
recommendation-595c5c7df7-q5cf2 1/1 Running 0 11h
shipping-7c6bc949f7-6w2m9 1/1 Running 0 11h
valkey-cart-b9fcd6689-jttc2 1/1 Running 0 11h
```
Hello all, with the latest release this should be solved. Please update and redeploy the Helm chart / k8s manifests.
Feel free to reopen the issue in case the error persists 😊