Operator is not resilient to missing spec.cassandra.serverVersion
What happened?
The operator is not resilient to cases where the spec.cassandra.serverVersion is nil.
Did you expect to see something different?
Even malformed CRs should not cause crashes. This could lead to a DOS issue in multi-tenant clusters.
How to reproduce it (as minimally and precisely as possible):
When creating a K8ssandraCluster, it appears that I can crash the operator by submitting the following CR:
apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
name: test
namespace: k8ssandra-operator
spec:
cassandra:
datacenters:
- metadata:
name: dc1
size: 1
storageConfig:
cassandraDataVolumeClaimSpec:
storageClassName: standard
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 5Gi
config:
jvmOptions:
heapSize: 384Mi
┆Issue is synchronized with this Jira Story by Unito
I suggest we should vet this at the webhook layer and then again via the controllers.
Can't we make ServerVersion mandatory to make this simpler, by using // +kubebuilder:validation:Required?
That does validation at the OpenAPI layer right? That's probably even better than the webhook, agreed.
Or just have the regexp there also? This is what cass-operator does:
// +kubebuilder:validation:Pattern=(6\.8\.\d+)|(3\.11\.\d+)|(4\.\d+\.\d+)
ServerVersion string `json:"serverVersion"`
We have it already: https://github.com/k8ssandra/k8ssandra-operator/blob/main/apis/k8ssandra/v1alpha1/k8ssandracluster_types.go#L310
Right, but there's "omitempty", which allows there to be a nil.
good point 👍 that's what we need to fix then.