[Bug] Operator controller manager Pod OOMKilled and crash looping

Open andrewazores opened this issue 1 year ago • 3 comments

Current Behavior

  containers:
    - resources:
        limits:
          cpu: '1'
          memory: 256Mi
        requests:
          cpu: 100m
          memory: 64Mi
....
lastState:
  terminated:
    exitCode: 137
    reason: OOMKilled
    startedAt: '2024-01-26T08:36:59Z'
    finishedAt: '2024-01-26T08:37:24Z'

operator-pod.log

Are the manager's resource limits hardcoded? Should the defaults be increased, and is there a way to make them configurable for the user?

https://github.com/cryostatio/cryostat-operator/blob/6ac7ec6aac49f3933b6314227e19b9d64a457b07/config/manager/manager.yaml#L59

Expected Behavior

No response

Steps To Reproduce

No response

Environment

No response

Anything else?

No response

andrewazores avatar Jan 26 '24 13:01 andrewazores

https://sdk.operatorframework.io/docs/best-practices/managing-resources/#how-to-change-the-operatormanager-resources-values-when-under-olm-management

andrewazores avatar Jan 26 '24 13:01 andrewazores

Users can/should do this via the OLM Subscription object as documented above if they need to adjust the values rather than rely on our defaults.
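For illustration, a Subscription override might look roughly like the sketch below. The `spec.config.resources` field is the OLM mechanism described in the linked documentation; the Subscription name, namespace, channel, and catalog source are assumptions that depend on how the operator was installed, and the 512Mi limit is only an example value, not a recommendation.

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: cryostat-operator                  # assumed Subscription name
  namespace: openshift-operators           # assumed install namespace
spec:
  name: cryostat-operator                  # assumed package name
  channel: stable                          # assumed channel
  source: community-operators              # assumed catalog source
  sourceNamespace: openshift-marketplace   # assumed catalog namespace
  config:
    # OLM applies these values to the containers of the operator Deployment,
    # overriding the defaults shipped in the bundle.
    resources:
      limits:
        cpu: '1'
        memory: 512Mi                      # example value, raised from the 256Mi default
      requests:
        cpu: 100m
        memory: 64Mi
```

Because the override lives on the Subscription rather than in the bundle, it should persist across operator upgrades managed by OLM.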

Question still remains whether we should adjust the defaults.

andrewazores avatar Jan 26 '24 18:01 andrewazores

I think the Subscription is the only way to do it at the bundle level. It's probably worth increasing the limit. Judging from the log, the operator didn't get very far before it was killed. Maybe synchronizing the client cache on a cluster with many objects did it.

ebaron avatar Feb 12 '24 22:02 ebaron