K8SSAND-954 ⁃ Unable to create CassandraDatacenter when containers.securityContext.readOnlyRootFilesystem is set to true
What happened? I tried to create a CassandraDatacenter with containers.securityContext.readOnlyRootFilesystem: true, but the pod is always in CrashLoopBackOff status.
The pods run normally if I change it to containers.securityContext.readOnlyRootFilesystem: false.
The YAML:
# Sized to work on 3 k8s workers nodes with 1 core / 4 GB RAM
# See neighboring example-cassdc-full.yaml for docs for each parameter
apiVersion: cassandra.datastax.com/v1beta1
kind: CassandraDatacenter
metadata:
  name: dc21
spec:
  nodeAffinityLabels:
    beta.kubernetes.io/arch: amd64
  clusterName: cluster2
  serverType: dse
  serverVersion: "6.8.14"
  systemLoggerImage:
  serverImage:
  configBuilderImage:
  managementApiAuth:
    insecure: {}
  size: 1
  resources:
    requests:
      cpu: 1
      memory: 4Gi
    limits:
      cpu: 1
      memory: 4Gi
  storageConfig:
    cassandraDataVolumeClaimSpec:
      storageClassName: nfs-client
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi
  dockerImageRunsAsCassandra: false
  podTemplateSpec:
    spec:
      initContainers:
        - name: server-config-init
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
            privileged: false
            readOnlyRootFilesystem: true
            runAsNonRoot: true
      containers:
        - name: "cassandra"
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
            privileged: false
            readOnlyRootFilesystem: true
            runAsNonRoot: true
      hostIPC: false
      hostNetwork: false
      hostPID: false
      securityContext:
        runAsNonRoot: true
  config:
    jvm-server-options:
      initial_heap_size: "800M"
      max_heap_size: "800M"
      additional-jvm-opts:
        # As the database comes up for the first time, set system keyspaces to RF=3
        - "-Ddse.system_distributed_replication_dc_names=dc21"
        - "-Ddse.system_distributed_replication_per_dc=3"
The pod status
MacBook-Pro-3:db zhiminsun$ oc get pod
NAME                          READY   STATUS             RESTARTS   AGE
cluster2-dc21-default-sts-0   1/2     CrashLoopBackOff   213        17h
The pod Events error
Events:
Warning BackOff 62s (x7 over 103s) kubelet, worker2.zhim.cp.fyre.ibm.com Back-off restarting failed container
Did you expect to see something different? I expected the pod to run normally with containers.securityContext.readOnlyRootFilesystem: true.
┆Issue is synchronized with this Jira Task by Unito ┆Reviewer: Michael Burman ┆friendlyId: K8SSAND-954 ┆priority: Medium
Hi @zhimsun
What version of cass-operator are you using?
The pods are running normally if I change the containers.securityContext.readOnlyRootFilesystem: false
Did you make this change for all containers? If not, which one(s)?
I am trying to test and reproduce this with CodeReady Containers, but cass-operator is crashing. It looks like the crash happens during initialization. I'll try some more.
I tested against my local kind cluster with a slightly modified manifest. Here is mine:
# Sized to work on 3 k8s workers nodes with 1 core / 4 GB RAM
# See neighboring example-cassdc-full.yaml for docs for each parameter
apiVersion: cassandra.datastax.com/v1beta1
kind: CassandraDatacenter
metadata:
  name: dc21
spec:
  # nodeAffinityLabels:
  #   beta.kubernetes.io/arch: amd64
  clusterName: cluster2
  serverType: dse
  serverVersion: "6.8.14"
  systemLoggerImage:
  serverImage:
  configBuilderImage:
  managementApiAuth:
    insecure: {}
  size: 1
  # resources:
  #   requests:
  #     cpu: 1
  #     memory: 4Gi
  #   limits:
  #     cpu: 1
  #     memory: 4Gi
  storageConfig:
    cassandraDataVolumeClaimSpec:
      storageClassName: standard
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi
  dockerImageRunsAsCassandra: false
  podTemplateSpec:
    spec:
      initContainers:
        - name: server-config-init
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
            privileged: false
            readOnlyRootFilesystem: true
            runAsNonRoot: true
      containers:
        - name: "cassandra"
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
            privileged: false
            readOnlyRootFilesystem: true
            runAsNonRoot: true
      hostIPC: false
      hostNetwork: false
      hostPID: false
      securityContext:
        runAsNonRoot: true
        runAsUser: 65533
        runAsGroup: 65533
        fsGroup: 65533
  config:
    jvm-server-options:
      initial_heap_size: "800M"
      max_heap_size: "800M"
      additional-jvm-opts:
        # As the database comes up for the first time, set system keyspaces to RF=3
        - "-Ddse.system_distributed_replication_dc_names=dc21"
        - "-Ddse.system_distributed_replication_per_dc=3"
I had to update securityContext. Without setting the user and group the pod was failing to initialize with this error:
state:
  waiting:
    message: 'container has runAsNonRoot and image has non-numeric user (cassandra),
      cannot verify user is non-root (pod: "cluster2-dc21-default-sts-0_cass-operator(7a5fc807-2b54-4751-9c08-497470fa0ef1)",
      container: server-config-init)'
    reason: CreateContainerConfigError
I deleted my CassandraDatacenter, changed the securityContext, and now I do end up with a CrashLoopBackOff caused by the cassandra container. Here is the error in the logs:
ln: failed to create symbolic link '/opt/dse/resources/spark/conf/hive-site.xml': Read-only file system
I need to pull in someone who is more familiar with DSE for some help.
cc @bradfordcp
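One possible direction, untested in this thread: with readOnlyRootFilesystem: true a container can still write to mounted volumes, so mounting an emptyDir at the directory named in the ln error might get past it. The fragment below is only a sketch under assumptions. The volume name spark-conf is hypothetical, it assumes cass-operator passes the extra volumes and volumeMounts in podTemplateSpec through to the pod, and it assumes DSE can start with that directory initially empty, since an emptyDir mount hides any files the image ships there. Other paths may need the same treatment.

```yaml
# Sketch only: fragment of a CassandraDatacenter spec, not a verified fix.
podTemplateSpec:
  spec:
    containers:
      - name: "cassandra"
        securityContext:
          readOnlyRootFilesystem: true
        volumeMounts:
          # Hypothetical volume name; the path is taken from the ln error above.
          - name: spark-conf
            mountPath: /opt/dse/resources/spark/conf
    volumes:
      # emptyDir starts empty and is writable even with a read-only root filesystem.
      - name: spark-conf
        emptyDir: {}
```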
@jsanda my cass-operator version is v1.7.1. I only have one container, cassandra.
For the initContainers, I can set readOnlyRootFilesystem: true
podTemplateSpec:
  spec:
    initContainers:
      - name: server-config-init
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
              - ALL
          privileged: false
          readOnlyRootFilesystem: true
          runAsNonRoot: true
but for the containers I cannot set readOnlyRootFilesystem: true
      containers:
        - name: "cassandra"
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
            privileged: false
            readOnlyRootFilesystem: true
            runAsNonRoot: true
@zhimsun can you share the logs from the cassandra container?
@jsanda The cassandra container was not created:
oc exec -it cluster2-dc21-default-sts-0 -n zen bash
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
Defaulting container name to cassandra.
Use 'oc describe pod/cluster2-dc21-default-sts-0 -n zen' to see all of the containers in this pod.
error: unable to upgrade connection: container not found ("cassandra")
You can reproduce this on your cluster. Fill in the systemLoggerImage, serverImage, and configBuilderImage values:
# Sized to work on 3 k8s workers nodes with 1 core / 4 GB RAM
# See neighboring example-cassdc-full.yaml for docs for each parameter
apiVersion: cassandra.datastax.com/v1beta1
kind: CassandraDatacenter
metadata:
  name: dc21
spec:
  nodeAffinityLabels:
    beta.kubernetes.io/arch: amd64
  clusterName: cluster2
  serverType: dse
  serverVersion: "6.8.14"
  systemLoggerImage: <image>
  serverImage: <image>
  configBuilderImage: <image>
  managementApiAuth:
    insecure: {}
  size: 1
  resources:
    requests:
      cpu: 1
      memory: 4Gi
    limits:
      cpu: 1
      memory: 4Gi
  storageConfig:
    cassandraDataVolumeClaimSpec:
      storageClassName: nfs-client
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi
  dockerImageRunsAsCassandra: false
  podTemplateSpec:
    spec:
      initContainers:
        - name: server-config-init
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
            privileged: false
            readOnlyRootFilesystem: true
            runAsNonRoot: true
      containers:
        - name: "cassandra"
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
            privileged: false
            readOnlyRootFilesystem: true
            runAsNonRoot: true
      hostIPC: false
      hostNetwork: false
      hostPID: false
      securityContext:
        runAsNonRoot: true
  config:
    jvm-server-options:
      initial_heap_size: "800M"
      max_heap_size: "800M"
      additional-jvm-opts:
        # As the database comes up for the first time, set system keyspaces to RF=3
        - "-Ddse.system_distributed_replication_dc_names=dc21"
        - "-Ddse.system_distributed_replication_per_dc=3"