scylla-operator
Init container gets OOM killed on new cluster POD startup
What happened?
1. Created a local cluster with k3d (k3s).
2. Deployed the operator.
3. Created the ScyllaCluster; the init container was OOM killed.
4. Edited the StatefulSet and raised the init container's memory limit to 150Mi, after which the cluster was created.
Unfortunately the init container limits seem to be hard-coded so there is no way to influence the allocation.
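The manual StatefulSet edit described above could be scripted roughly as follows. This is only a sketch of the workaround, not a fix: the helper names `build_init_memory_patch` and `patch_init_memory` are made up for illustration, and the init container index (`0`) is an assumption that should be checked against the actual pod template. The operator may also reconcile the StatefulSet and undo the change.

```shell
#!/bin/sh
# Build a JSON Patch that replaces the memory limit of one init container.
# $1 = index of the init container in the pod template (assumption: 0)
# $2 = new memory limit, e.g. 150Mi
build_init_memory_patch() {
  printf '[{"op":"replace","path":"/spec/template/spec/initContainers/%s/resources/limits/memory","value":"%s"}]\n' "$1" "$2"
}

# Apply the patch to a StatefulSet. Nothing is sent to the cluster
# until this function is called explicitly.
# $1 = namespace, $2 = StatefulSet name,
# $3 = init container index (default 0), $4 = memory limit (default 150Mi)
patch_init_memory() {
  kubectl -n "$1" patch statefulset "$2" --type=json \
    -p "$(build_init_memory_patch "${3:-0}" "${4:-150Mi}")"
}

# Print the patch for inspection; this is safe to run without a cluster.
build_init_memory_patch 0 150Mi
```

With the manifest from the reproduction steps, a call might look like `patch_init_memory temporal temporal-cluster-manager-dc-zone1`, assuming the StatefulSet is named `<cluster>-<datacenter>-<rack>`; `kubectl -n temporal get statefulsets` shows the real names.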
What did you expect to happen?
No OOM kill, or at least a way to change the init container limits.
How can we reproduce it (as minimally and precisely as possible)?
k3d cluster create --config $(pwd)/config.yaml
config.yaml
apiVersion: k3d.io/v1alpha4
kind: Simple
metadata:
  name: edge-composition
servers: 1
agents: 3
image: rancher/k3s:v1.22.11-k3s1
kubeAPI: # same as `--api-port myhost.my.domain:6445` (where the name would resolve to 127.0.0.1)
  host: "localhost"
  hostIP: "127.0.0.1"
  hostPort: "6447"
# expose ingress controller on local host port 8080
ports:
  - port: 9980:80 # same as `--port '8080:80@loadbalancer'`
    nodeFilters:
      - loadbalancer
  - port: 9943:443
    nodeFilters:
      - loadbalancer
options:
  k3s:
    extraArgs:
      - arg: --no-deploy=traefik # do not deploy traefik ingress, we will use a different one
        nodeFilters:
          - server:*
    nodeLabels:
      - label: topology.kubernetes.io/zone=3 # same as `--k3s-node-label 'foo=bar@agent:1'` -> this results in a Kubernetes node label
        nodeFilters:
          - agent:2
      - label: topology.kubernetes.io/zone=2 # same as `--k3s-node-label 'foo=bar@agent:1'` -> this results in a Kubernetes node label
        nodeFilters:
          - agent:0
      - label: topology.kubernetes.io/zone=1 # same as `--k3s-node-label 'foo=bar@agent:1'` -> this results in a Kubernetes node label
        nodeFilters:
          - agent:1
kubectl apply -f https://raw.githubusercontent.com/scylladb/scylla-operator/master/deploy/operator.yaml
kubectl wait --for condition=established crd/scyllaclusters.scylla.scylladb.com
kubectl wait --for condition=established crd/nodeconfigs.scylla.scylladb.com
kubectl wait --for condition=established crd/scyllaoperatorconfigs.scylla.scylladb.com
kubectl -n scylla-operator rollout status deployment.apps/scylla-operator
kubectl -n scylla-operator rollout status deployment.apps/webhook-server
Create cluster
apiVersion: scylla.scylladb.com/v1
kind: ScyllaCluster
metadata:
  name: temporal-cluster
  namespace: temporal
spec:
  version: 5.2.7
  agentVersion: 3.1.2
  repository: docker.io/scylladb/scylla
  agentRepository: docker.io/scylladb/scylla-manager-agent
  developerMode: true
  cpuset: true
  datacenter:
    name: manager-dc
    racks:
      - agentResources:
          requests:
            cpu: 50m
            memory: 80M
        members: 1
        name: zone1
        resources:
          limits:
            cpu: 1
            memory: 200Mi
          requests:
            cpu: 1
            memory: 200Mi
        storage:
          capacity: 1Gi
          # storageClassName: scylla-manager
        placement:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
                - matchExpressions:
                    - key: topology.kubernetes.io/zone
                      operator: In
                      values:
                        - "1"
      - agentResources:
          requests:
            cpu: 50m
            memory: 80M
        members: 1
        name: zone2
        resources:
          limits:
            cpu: 1
            memory: 200Mi
          requests:
            cpu: 1
            memory: 200Mi
        storage:
          capacity: 1Gi
          # storageClassName: scylla-manager
        placement:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
                - matchExpressions:
                    - key: topology.kubernetes.io/zone
                      operator: In
                      values:
                        - "2"
      - agentResources:
          requests:
            cpu: 50m
            memory: 80M
        members: 1
        name: zone3
        resources:
          limits:
            cpu: 1
            memory: 200Mi
          requests:
            cpu: 1
            memory: 200Mi
        storage:
          capacity: 1Gi
          # storageClassName: scylla-manager
        placement:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
                - matchExpressions:
                    - key: topology.kubernetes.io/zone
                      operator: In
                      values:
                        - "3"
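To confirm that a pod's init container was in fact OOM killed after applying the manifest, something like the following could be used. It is a sketch only: the helper names `init_container_reasons` and `only_oom_killed` are hypothetical, and it reads the last termination reason from the pod status, which is populated once the container has been restarted at least once.

```shell
#!/bin/sh
# Print "name<TAB>reason" for every init container of a pod, where
# reason is the last recorded termination reason (empty if none).
# $1 = namespace, $2 = pod name
init_container_reasons() {
  kubectl -n "$1" get pod "$2" \
    -o jsonpath='{range .status.initContainerStatuses[*]}{.name}{"\t"}{.lastState.terminated.reason}{"\n"}{end}'
}

# Read "name<TAB>reason" lines on stdin and print only the names of
# containers whose last termination reason was OOMKilled.
only_oom_killed() {
  awk -F '\t' '$2 == "OOMKilled" { print $1 }'
}
```

For example, `init_container_reasons temporal temporal-cluster-manager-dc-zone1-0 | only_oom_killed` would list the affected init containers, assuming the pod is named after its StatefulSet with an ordinal suffix; `kubectl -n temporal get pods` shows the real names.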
Scylla Operator version
v1.12.0-alpha.0-102-geb68db4; also reproducible on v1.11.0
Kubernetes platform name and version
Reproduced on 1.21 and on 1.25.
```console
$ kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short. Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.3", GitCommit:"9e644106593f3f4aa98f8a84b23db5fa378900bd", GitTreeState:"clean", BuildDate:"2023-03-15T13:40:17Z", GoVersion:"go1.19.7", Compiler:"gc", Platform:"darwin/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.15+k3s1", GitCommit:"d19260dc59280c5f5a3c6596c653e7cfdbb5f3c8", GitTreeState:"clean", BuildDate:"2023-10-30T21:44:53Z", GoVersion:"go1.20.10", Compiler:"gc", Platform:"linux/amd64"}
```
Kubernetes platform info:
Please attach the must-gather archive.
scylla-operator-must-gather-hdqcl4psgfqd.zip
Anything else we need to know?
No response