
`flagd` and `flagd-ui` memory limits too low

rcastley opened this issue 8 months ago · 3 comments

Bug Report

Which version of the demo are you using? v2.0.2

Symptom

Installed the demo in k3s on a Multipass instance on Apple Silicon. The VM is configured with 4 CPUs, 8 GB RAM, and a 32 GB disk.

flagd is OOMKilled. I have tried both the Helm chart and the Kubernetes manifest.

What is the expected behavior?

flagd to start

What is the actual behavior?

flagd fails to start.

Reproduce

Use either the Helm chart or Kubernetes manifest from this repo.

Additional Context

By increasing the memory limit to 300Mi in the Kubernetes manifest for both flagd and flagd-ui, both services start with no issues.
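
For anyone editing the raw Kubernetes manifest directly, this is roughly the fragment involved. A minimal sketch only: the surrounding Deployment fields, image tags, and other container settings are omitted, and everything except the two memory values is an assumption about the manifest layout.

deployment (extract, flagd pod with flagd-ui sidecar)

spec:
  template:
    spec:
      containers:
        - name: flagd
          resources:
            limits:
              memory: "300Mi"  # raised; the shipped default was getting OOMKilled
        - name: flagd-ui
          resources:
            limits:
              memory: "300Mi"  # same fix for the sidecar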

rcastley avatar Mar 19 '25 13:03 rcastley

Hitting this too. Thanks for the investigation and fix!

flagd-5ddb9d9b66-p5gkj 1/2 OOMKilled 4 (2m8s ago) 22h

I was too keen. Still OOMKilled / CrashLoopBackOff.

The flagd-ui limit also needs to be increased via Helm. In case it helps someone:

values.yaml (extract)

components:
  flagd:
    useDefault:
      env: true
    resources:
      limits:
        memory: "300Mi"
    # flagd-ui as a sidecar container in the same pod so the flag JSON file can be shared
    sidecarContainers:
      - name: flagd-ui
        useDefault:
          env: true
        resources:
          limits:
            memory: "300Mi"

agardnerIT avatar Mar 21 '25 04:03 agardnerIT

Hey all, thanks for reporting this! We already have a PR on the Helm repo to update this; hopefully we can have that merged soon.

julianocosta89 avatar Mar 24 '25 07:03 julianocosta89

I shared this via a different channel as well, but I think it is worth leaving a comment here too.

It looks like the problems are not caused by an insufficient memory limit, but by the fact that some language runtimes, such as Go (https://kupczynski.info/posts/go-container-aware/, https://www.ardanlabs.com/blog/2024/02/kubernetes-memory-limits-go.html) and some Java VMs (https://cloud.theodo.com/en/blog/jvm-oom, https://factorhouse.io/blog/articles/corretto-memory-issues/), are not aware of or do not respect these resource limits. In my experience so far, if you set GOMEMLIMIT to 80% of the memory limit for all Go applications, it will just work fine.
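
To make that concrete, here is a sketch of what this could look like in the demo's Helm values, following the components.flagd layout quoted earlier in this thread. The env list shape is an assumption based on that extract, 240MiB is simply 80% of the 300Mi limit, and GOMEMLIMIT is Go's soft memory limit (Go 1.19+):

values.yaml (extract)

components:
  flagd:
    resources:
      limits:
        memory: "300Mi"
    env:
      - name: GOMEMLIMIT
        value: "240MiB"  # ~80% of the container memory limit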

pellared avatar Apr 11 '25 07:04 pellared

@julianocosta89, has GOMEMLIMIT also been set for the k8s setup? I only see a PR for Docker.

pellared avatar Jun 26 '25 11:06 pellared

You are right, @pellared! Reopening it.

Thanks for calling it out.

julianocosta89 avatar Jun 26 '25 11:06 julianocosta89

I hope https://github.com/open-telemetry/opentelemetry-demo/pull/2564 does the trick. I tested it locally and everything is running as expected:

accounting-7579cb5968-2mhzp        1/1     Running   0             9h
ad-95cf4fdb9-psr47                 1/1     Running   0             11h
cart-b66f96d74-7qrx4               1/1     Running   0             9h
checkout-5447765f4d-slmgf          1/1     Running   0             9h
currency-56cc958ccc-hfmh9          1/1     Running   0             10h
email-5b5b6c76dd-4x4b9             1/1     Running   0             11h
flagd-fbfc695db-tfxtp              2/2     Running   0             9h
fraud-detection-5fcbfcd6d7-7bccp   1/1     Running   0             9h
frontend-c85dd77b4-9mj57           1/1     Running   0             11h
frontend-proxy-7444c899fc-4d5j8    1/1     Running   0             11h
grafana-7ff76587d8-nlp7k           1/1     Running   0             11h
image-provider-6d6bbddfff-pw65n    1/1     Running   0             11h
jaeger-76b4cc75dd-25hwn            1/1     Running   0             11h
kafka-5b9c6db458-xxmjl             1/1     Running   0             9h
load-generator-6bf6549cdf-ttjt6    1/1     Running   0             11h
opensearch-0                       1/1     Running   0             9h
otel-collector-7fb566966b-p298p    1/1     Running   0             11h
payment-5f44569979-b7cc8           1/1     Running   0             11h
product-catalog-6dc749d657-kxjq7   1/1     Running   0             11h
prometheus-59fc4944c7-fllvw        1/1     Running   5 (19m ago)   11h
quote-96c6d5d74-stkhn              1/1     Running   0             11h
recommendation-595c5c7df7-q5cf2    1/1     Running   0             11h
shipping-7c6bc949f7-6w2m9          1/1     Running   0             11h
valkey-cart-b9fcd6689-jttc2        1/1     Running   0             11h

svrnm avatar Sep 19 '25 05:09 svrnm

Hello all, with the latest release this should be solved. Please update and redeploy the Helm chart/k8s manifests.

Feel free to reopen the issue in case the error persists 😊

julianocosta89 avatar Oct 22 '25 07:10 julianocosta89