kustomize-controller gets OOMKilled every hour
Background:
The kustomize-controller pod is getting OOMKilled roughly every hour. It reaches around ~7.65G of memory and gets OOMKilled, as the memory limit is 8G.
- Image: ghcr.artifactory.gcp.anz/fluxcd/kustomize-controller:v1.2.2
- There are 184 Kustomizations in total
- Concurrency is set to 20
These are the flags enabled:
containers:
  - args:
      - --events-addr=http://notification-controller.flux-system.svc.cluster.local./
      - --watch-all-namespaces=true
      - --log-level=info
      - --log-encoding=json
      - --enable-leader-election
      - --concurrent=20
      - --kube-api-qps=500
      - --kube-api-burst=1000
      - --requeue-dependency=15s
      - --no-remote-bases=true
      - --feature-gates=DisableStatusPollerCache=true
Requests & Limits:
resources:
  limits:
    memory: 8Gi
  requests:
    cpu: "1"
    memory: 8Gi
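For anyone wanting to reproduce this setup: flags like the ones above are typically applied through a patch in the flux-system kustomization.yaml. A minimal sketch, assuming the standard bootstrap layout (file names are the usual defaults, and only two of the flags are shown as examples):

# Sketch: adding kustomize-controller flags via a JSON6902 patch in
# flux-system/kustomization.yaml (layout and file names assumed, not from this thread).
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - gotk-components.yaml
  - gotk-sync.yaml
patches:
  - patch: |
      - op: add
        path: /spec/template/spec/containers/0/args/-
        value: --concurrent=20
      - op: add
        path: /spec/template/spec/containers/0/args/-
        value: --feature-gates=DisableStatusPollerCache=true
    target:
      kind: Deployment
      name: kustomize-controller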
What's been tried so far:
- Added the flag --feature-gates=DisableStatusPollerCache=true to the kustomize-controller deployment, as mentioned in this issue. This didn't make a difference; the pod still gets OOMKilled within an hour.
- Reduced the concurrency to 5. At this setting the pod seems stable and memory consumption is around ~2.5G.
- Did a heap dump; the inuse_space is around ~22.64MB, which is very low. Couldn't find anything useful there, but here's the link to the flamegraph. Also, here's the heap dump: heap.out.zip (a way to re-inspect it with go tool pprof is sketched after the du output below).
- Checked if we have a large repository that's loading unnecessary files, as mentioned in this issue.
This is from the source-controller:
~ $ du -sh /data/*
6.1M    /data/gitrepository
824.0K  /data/helmchart
5.8M    /data/helmrepository
16.0K   /data/lost+found
48.0K   /data/ocirepository
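For reference, the heap dump mentioned above can be re-inspected with the standard Go pprof tooling; a minimal sketch, assuming the file is the heap.out from the attachment:

# Show the top in-use heap allocations from the captured dump (heap.out from the
# attachment above). Note that the Go heap profile does not cover memory the kernel
# charges to the container outside the heap, e.g. a tmpfs-backed /tmp.
go tool pprof -inuse_space -top heap.out

# Or browse the same profile interactively, including a flamegraph view.
go tool pprof -http=:8080 heap.out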
We want to understand what is causing the memory spikes and OOM kills.
Are you using a RAM disk for the /tmp volume, as shown here: https://fluxcd.io/flux/installation/configuration/vertical-scaling/#enable-in-memory-kustomize-builds?
Can you look at /tmp in the kustomize-controller pod and see how large it is?
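For reference, the in-memory build setup from that guide amounts to backing the controller's /tmp volume with a tmpfs emptyDir, roughly like this (a sketch; the volume name temp and the patch target are assumed from the stock gotk-components manifests):

# Sketch of the in-memory /tmp setup from the vertical-scaling guide,
# applied from flux-system/kustomization.yaml.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - gotk-components.yaml
  - gotk-sync.yaml
patches:
  - patch: |
      - op: replace
        path: /spec/template/spec/volumes
        value:
          - name: temp
            emptyDir:
              medium: Memory
    target:
      kind: Deployment
      name: kustomize-controller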
> Are you using a RAM disk?
We used in-memory kustomize builds, but that was a problem: it kept exceeding the memory limits of the nodes. We also tried ephemeral SSDs, but they got corrupted when the kustomize-controller restarted. So currently /tmp is backed by a disk.
The size of /tmp is 12.7G:
$ du -sh tmp
12.7G tmp
OK, so it looks like all these problems are due to FS operations. The /tmp directory should be empty almost all the time. Is there anything inside the repo that could cause this, such as recursive symlinks? Looking at the memory profile, the issue seems related to Go untar and file read operations, all of which come from the Go stdlib.
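To check both of these, something like the following could be used (a sketch; the flux-system namespace and the presence of a shell inside the image are assumptions based on the prompts shown above):

# List what the controller has left behind in /tmp; each entry should normally be a
# short-lived build directory that is cleaned up after reconciliation.
kubectl exec -n flux-system deploy/kustomize-controller -- sh -c 'ls -la /tmp && du -sh /tmp/*'

# From a local clone of the Git repository, list all symlinks that a kustomize
# build would follow; recursive or self-referencing links would be suspect here.
find . -type l -exec ls -l {} +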