dask-kubernetes

KubeCluster Input sanitising

Open jacobtomlinson opened this issue 1 year ago • 0 comments

In dask_kubernetes.operator.kubecluster we have make_cluster_spec, make_scheduler_spec and make_worker_spec. These are called by dask_kubernetes.operator.KubeCluster when creating a cluster, or they can be invoked directly so that the output can be modified and then passed to KubeCluster.

Today these functions do very little input sanitisation, which means they can generate invalid manifests.

E.g. in #665 it was raised that the cluster name can be set to a value that is invalid in Kubernetes, such as foo_bar, which contains an underscore, a character that is not allowed.

$ python -c 'from dask_kubernetes.operator import make_cluster_spec; import yaml; print(yaml.dump(make_cluster_spec(name="foo_bar")))' | kubectl apply --dry-run="server" -f -
The DaskCluster "foo_bar" is invalid: metadata.name: Invalid value: "foo_bar": a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')

We should do more input validation on all the arguments that can be passed to those functions to ensure they will be valid input for the Kubernetes API.
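
As a rough illustration of what such a check could look like for the name argument, here is a minimal sketch that reuses the regex quoted in the Kubernetes error message above. The helper name validate_cluster_name is hypothetical and is not part of dask_kubernetes. The 253 character limit is the general Kubernetes limit for DNS subdomain names; resources derived from the cluster name may need a stricter limit, which this sketch does not attempt to cover.

```python
import re

# Regex taken from the Kubernetes error message for RFC 1123 subdomains.
RFC1123_SUBDOMAIN = re.compile(
    r"[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*"
)

# Kubernetes limits DNS subdomain names to 253 characters.
MAX_SUBDOMAIN_LENGTH = 253


def validate_cluster_name(name):
    """Hypothetical helper: return name unchanged if it is a valid
    RFC 1123 subdomain, otherwise raise ValueError."""
    if not isinstance(name, str) or not name:
        raise ValueError("Cluster name must be a non-empty string")
    if len(name) > MAX_SUBDOMAIN_LENGTH:
        raise ValueError(
            f"Cluster name is {len(name)} characters long, "
            f"the maximum is {MAX_SUBDOMAIN_LENGTH}"
        )
    # fullmatch ensures the whole name conforms, not just a prefix.
    if RFC1123_SUBDOMAIN.fullmatch(name) is None:
        raise ValueError(
            f"Invalid cluster name {name!r}: it must consist of lower case "
            "alphanumeric characters, '-' or '.', and must start and end "
            "with an alphanumeric character"
        )
    return name
```

A check like this could run at the top of make_cluster_spec so that a name such as foo_bar fails early with a Python exception rather than later at the Kubernetes API. Similar checks would be needed for the other arguments.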

jacobtomlinson, Mar 22 '23 14:03