K8SSAND-1582 ⁃ If superuserSecretRef is changed, the operator should not revert it and instead report validation error
What happened? I deployed this K8ssandraCluster:
apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
name: test
spec:
cassandra:
superuserSecretRef:
name: test-superuser
serverVersion: "4.0.3"
storageConfig:
cassandraDataVolumeClaimSpec:
storageClassName: standard
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 5Gi
config:
jvmOptions:
heapSize: 1Gi
networking:
hostNetwork: true
datacenters:
- metadata:
name: dc1
size: 1
After the cluster became ready, I updated it with:
superuserSecretRef:
name: cassandra-superuser
This error was reported in the k8ssandra-operator log:
ERROR controller.k8ssandracluster SuperuserSecretName is immutable, reverting to existing value in CassandraDatacenter {"reconciler group": "k8ssandra.io", "reconciler kind": "K8ssandraCluster", "name": "test", "namespace": "k8ssandra-operator", "K8ssandraCluster": "k8ssandra-operator/test", "CassandraDatacenter": "k8ssandra-operator/dc1", "K8SContext": "kind-k8ssandra-0", "error": "tried to update superuserSecretName in K8ssandraCluster"}
github.com/k8ssandra/k8ssandra-operator/controllers/k8ssandra.(*K8ssandraClusterReconciler).reconcile
/workspace/controllers/k8ssandra/k8ssandracluster_controller.go:133
github.com/k8ssandra/k8ssandra-operator/controllers/k8ssandra.(*K8ssandraClusterReconciler).Reconcile
/workspace/controllers/k8ssandra/k8ssandracluster_controller.go:87
superuserSecretRef was indeed reverted back to the original value. That's not the only thing that happened. The CassandraDatacenter spec for dc1 was updated as follows:
apiVersion: cassandra.datastax.com/v1beta1
kind: CassandraDatacenter
metadata:
annotations:
k8ssandra.io/resource-hash: UAXz+S439NLRAsMQQsoMdCIdApONHtwCGGQv2pUKh+M=
creationTimestamp: "2022-06-16T22:19:24Z"
finalizers:
- finalizer.cassandra.datastax.com
generation: 2
labels:
app.kubernetes.io/component: cassandra
app.kubernetes.io/created-by: k8ssandracluster-controller
app.kubernetes.io/name: k8ssandra-operator
app.kubernetes.io/part-of: k8ssandra
k8ssandra.io/cluster-name: test
k8ssandra.io/cluster-namespace: k8ssandra-operator
name: dc1
namespace: k8ssandra-operator
resourceVersion: "53816"
uid: 504ab8a6-8d8d-4b16-bd31-2ad0e63b9687
spec:
additionalServiceConfig:
additionalSeedService: {}
allpodsService: {}
dcService: {}
nodePortService: {}
seedService: {}
clusterName: test
config:
cassandra-env-sh:
additional-jvm-opts:
- -Dcassandra.system_distributed_replication=dc1:1
- -Dcom.sun.management.jmxremote.authenticate=true
cassandra-yaml:
authenticator: PasswordAuthenticator
authorizer: CassandraAuthorizer
num_tokens: 16
role_manager: CassandraRoleManager
jvm-server-options:
initial_heap_size: 1073741824
max_heap_size: 1073741824
jvm11-server-options:
garbage_collector: G1GC
configBuilderResources: {}
managementApiAuth: {}
networking:
hostNetwork: true
podTemplateSpec:
metadata: {}
spec:
containers:
- name: cassandra
resources: {}
initContainers:
- args:
- /bin/sh
- -c
- echo "$SUPERUSER_JMX_USERNAME $SUPERUSER_JMX_PASSWORD" >> /config/jmxremote.password
env:
- name: SUPERUSER_JMX_USERNAME
valueFrom:
secretKeyRef:
key: username
name: cassandra-superuser
- name: SUPERUSER_JMX_PASSWORD
valueFrom:
secretKeyRef:
key: password
name: cassandra-superuser
image: docker.io/library/busybox:1.34.1
imagePullPolicy: IfNotPresent
name: jmx-credentials
resources: {}
volumeMounts:
- mountPath: /config
name: server-config
resources: {}
serverType: cassandra
serverVersion: 4.0.3
size: 1
storageConfig:
cassandraDataVolumeClaimSpec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 5Gi
storageClassName: standard
superuserSecretName: test-superuser
systemLoggerResources: {}
Note the reference to the cassandra-superuser in the jmx-credentials init container.
Did you expect to see something different? Long term we need to let the superuser secret get updated. I opened https://github.com/k8ssandra/cass-operator/issues/350 to address it in cass-operator, and I will open a separate ticket to address it in k8ssandra-operator. I do not want to address it in this issue because updating the superuser secret requires some up front design and supporting changes in cass-operator.
We need to move forward with a short term fix for this issue.
There should be a status condition on the K8ssandraCluster to indicate whether or not it valid. As early as possible in the reconciliation process validation checks should be performed. If superuserSecretRef is updated, the valid condition should be set to false with a reason/message saying it cannot be updated. Note that I have suggested add the same validation check and condition for #575.
This check should also be added in the validating webhook.
How to reproduce it (as minimally and precisely as possible):
Deploy the above manifest. After the cluster becomes ready change superuserSecretRef to point a different secret. It doesn't have to exist.
Environment
-
K8ssandra Operator version:
v1.1.1
┆Issue is synchronized with this Jira Story by Unito
@jsanda
FYI - also getting messages like this in the k8ss operator for the new cluster which was created by providing the superuser secret.
cassandra:
serverVersion: "4.0.3"
superuserSecretRef:
name: control-plane-superuser
Message is quite frequent in the logs where we haven't changed any configs since cluster was stood up
1.6571984846312509e+09 ERROR controller.k8ssandracluster SuperuserSecretName is immutable, reverting to existing value in CassandraDatacenter {"reconciler group": "k8ssandra.io", "reconciler kind": "K8ssandraCluster", "name": "dev-control-plane", "namespace": "temporal-state", "K8ssandraCluster": "temporal-state/dev-control-plane", "CassandraDatacenter": "temporal-state/secondary", "K8SContext": "", "error": "tried to update superuserSecretName in K8ssandraCluster"}
github.com/k8ssandra/k8ssandra-operator/controllers/k8ssandra.(*K8ssandraClusterReconciler).reconcile
/workspace/controllers/k8ssandra/k8ssandracluster_controller.go:133
github.com/k8ssandra/k8ssandra-operator/controllers/k8ssandra.(*K8ssandraClusterReconciler).Reconcile
/workspace/controllers/k8ssandra/k8ssandracluster_controller.go:87
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:114
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:311
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227
I'll answer my own question... we have 2 C* clusters in the same k8s namespace across 2 k8s clusters but we have named the datacenters the same way - primary and secondary.
This resulted in a configuration overlap as cassandra DC name is the same per namespace hence operator was trying to mess with the same DC name across 2 clusters...
Not sure if you need a separate issue for that but this may cause some grief
Not sure if you need a separate issue for that but this may cause some grief
Is this #615?
Yes, missing to use unique DC per cluster caused the conflicting updates which resulted in the error message so this is indeed 615. Once DC was made unique there is no more issue like that, although there is another issue with the seed nodes not being applied when additional seeds is provided which I will ask separately.