dask-gateway
Using a Mapping config coupled to a KubeClusterConfig's environment and passing a non-string value causes a type error on startup
Just noting a UX issue. If you use the Mapping option provided in https://github.com/dask/dask-gateway/pull/290 to set environment variables, you should ensure that the values are strings.
If they are not, Kubernetes rejects the pod with an error (logged in the controller pod), and from the user's point of view things just hang indefinitely.
```
[I 2021-04-15 02:07:01.388 KubeController] Creating scheduler pod for cluster staging.671dcd6379c844f5af9375c25ebf22d7
[W 2021-04-15 02:07:01.391 KubeController] Error while reconciling cluster staging.671dcd6379c844f5af9375c25ebf22d7
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/dask_gateway_server/backends/kubernetes/controller.py", line 586, in reconciler_loop
    requeue = await self.reconcile_cluster(name)
  File "/usr/local/lib/python3.8/site-packages/dask_gateway_server/backends/kubernetes/controller.py", line 607, in reconcile_cluster
    status_update, requeue = await self.handle_cluster(cluster)
  File "/usr/local/lib/python3.8/site-packages/dask_gateway_server/backends/kubernetes/controller.py", line 643, in handle_cluster
    return await self.handle_pending_cluster(cluster)
  File "/usr/local/lib/python3.8/site-packages/dask_gateway_server/backends/kubernetes/controller.py", line 661, in handle_pending_cluster
    sched_pod_name, sched_pod = await self.create_scheduler_pod_if_not_exists(
  File "/usr/local/lib/python3.8/site-packages/dask_gateway_server/backends/kubernetes/controller.py", line 942, in create_scheduler_pod_if_not_exists
    pod = await self.core_client.create_namespaced_pod(namespace, pod)
  File "/usr/local/lib/python3.8/site-packages/dask_gateway_server/backends/kubernetes/utils.py", line 59, in func
    return await method(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/kubernetes_asyncio/client/api_client.py", line 180, in __call_api
    response_data = await self.request(
  File "/usr/local/lib/python3.8/site-packages/kubernetes_asyncio/client/rest.py", line 229, in POST
    return (await self.request("POST", url,
  File "/usr/local/lib/python3.8/site-packages/kubernetes_asyncio/client/rest.py", line 186, in request
    raise ApiException(http_resp=r)
kubernetes_asyncio.client.exceptions.ApiException: (400)
Reason: Bad Request
HTTP response headers: <CIMultiDictProxy('Audit-Id': '631c873d-e7d5-4da8-b487-19c954288538', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'Date': 'Thu, 15 Apr 2021 02:07:01 GMT', 'Content-Length': '510')>
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Pod in version \"v1\" cannot be handled as a Pod: v1.Pod.Spec: v1.PodSpec.Containers: []v1.Container: v1.Container.Env: []v1.EnvVar: v1.EnvVar.v1.EnvVar.Value: ReadString: expects \" or n, but found 1, error found in #10 byte of ...|\"value\": 1}, {\"name\"|..., bigger context ...|_DISTRIBUTED__WORKERS__RESOURCES__GPU\", \"value\": 1}, {\"name\": \"NVIDIA_DRIVER_CAPABILITIES\", \"value\":|...","reason":"BadRequest","code":400}
```
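Until the input is validated upstream, one possible workaround is to coerce the values to strings in your own options handler before they reach the environment config. The following is a minimal sketch of that idea; `stringify_environment` is a hypothetical helper, not part of dask-gateway, and `MY_ENV` is only an illustrative name.

```python
def stringify_environment(env):
    """Return a copy of ``env`` with every value converted to ``str``.

    Hypothetical helper (not part of dask-gateway). Kubernetes requires
    the ``value`` of each EnvVar to be a string, so integers parsed from
    YAML such as ``1`` must become ``"1"``.
    """
    result = {}
    for name, value in env.items():
        if value is None:
            # An empty YAML value parses to None; fail loudly rather than
            # silently sending the literal string "None".
            raise ValueError(f"environment variable {name!r} has no value")
        result[str(name)] = str(value)
    return result


# Value as it might arrive from a Mapping option after YAML parsing.
parsed = {"MY_ENV": 123, "NVIDIA_DRIVER_CAPABILITIES": "compute,utility"}
environment = stringify_environment(parsed)
```

Note that booleans are converted with `str()` as well, so `True` becomes `"True"`, which may not be the spelling the target program expects.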
The schema added in #429 will provide clear feedback if the Helm chart is passed an environment variable that isn't a string.
```
helm template dg resources/helm/dask-gateway --set gateway.backend.environment.MY_ENV=123
Error: values don't meet the specifications of the schema(s) in the following chart(s):
dask-gateway:
- gateway.backend.environment.MY_ENV: Invalid type. Expected: string, given: integer
```
The Mapping widget can't know how its value will be used, but whatever it ends up configuring could validate the parsed YAML it receives from the widget.
In this case, the error arises because a Mapping option is coupled to the environment config, which requires string values. One clear action point is to make that configuration validate its input.
https://github.com/dask/dask-gateway/blob/HEAD/dask-gateway-server/dask_gateway_server/backends/base.py#L252-L257
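As a rough illustration of what validating that input could look like, here is a minimal sketch of a check that fails early with a readable message instead of letting Kubernetes reject the pod later. `validate_environment` is a hypothetical function; where it would be hooked in (for example a traitlets validation hook on the `environment` trait) is an assumption, not something the current code does.

```python
def validate_environment(env):
    """Raise TypeError if any environment value is not a string.

    Hypothetical check (not part of dask-gateway). It reports every
    offending key at once so the user can fix them in one pass.
    """
    bad = {
        str(name): type(value).__name__
        for name, value in env.items()
        if not isinstance(value, str)
    }
    if bad:
        details = ", ".join(f"{name} ({kind})" for name, kind in sorted(bad.items()))
        raise TypeError(
            f"environment values must be strings; got non-string values for: {details}"
        )
    return env
```

Raising at configuration time would surface the problem to the user directly, rather than only in the controller pod's log.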
Another, more advanced and generic idea would be to let the Mapping itself allow a schema to be attached.
Action point ideas
- [ ] Let KubeClusterConfig or perhaps ClusterConfig's environment traitlet validate its input
- [ ] Let the Mapping config class accept a JSONSchema to validate the YAML it receives
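To make the second idea more concrete, below is a minimal sketch of a mapping option that carries a schema and checks its value against it. `SchemaCheckedMapping` is a hypothetical stand-in, not dask-gateway's `Mapping` class, and it only understands a tiny subset of JSON Schema (the `type` of `additionalProperties`); a real implementation would more likely delegate to a full JSON Schema validator.

```python
# Python types corresponding to the JSON Schema type names handled here.
_JSON_TYPES = {
    "string": str,
    "boolean": bool,
    "integer": int,
    "number": (int, float),
}


def _matches(value, type_name):
    # bool is a subclass of int in Python, but JSON Schema treats
    # booleans and numbers as distinct types.
    if type_name in ("integer", "number") and isinstance(value, bool):
        return False
    return isinstance(value, _JSON_TYPES[type_name])


class SchemaCheckedMapping:
    """Hypothetical mapping option with an attached schema."""

    def __init__(self, schema):
        self.schema = schema

    def validate(self, value):
        if not isinstance(value, dict):
            raise TypeError(f"expected a mapping, given {type(value).__name__}")
        expected = self.schema.get("additionalProperties", {}).get("type")
        if expected is None:
            # No per-value constraint in the schema: accept as-is.
            return value
        for key, item in value.items():
            if not _matches(item, expected):
                raise TypeError(
                    f"{key}: expected {expected}, given {type(item).__name__}"
                )
        return value


# Schema stating that every value in the mapping must be a string.
env_option = SchemaCheckedMapping(
    {"type": "object", "additionalProperties": {"type": "string"}}
)
checked = env_option.validate({"MY_ENV": "123"})
```

Attaching the schema to the option keeps the widget generic while letting each use site declare what it needs.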