dask-kubernetes
Readiness/Liveness probes do not accept integer port
Describe the issue:
Although the cluster specification suggests that the probe port is int_or_type, using an integer port in a probe raises an error. Here is an example based on the documentation, where the named port http-dashboard is given as 8786 instead. Basically:
readinessProbe:
  httpGet:
    port: http-dashboard
    path: /health
  initialDelaySeconds: 5
  periodSeconds: 10
livenessProbe:
  httpGet:
    port: http-dashboard
    path: /health
  initialDelaySeconds: 15
  periodSeconds: 20
is replaced with this:
readinessProbe:
  httpGet:
    port: 8786
    path: /health
  initialDelaySeconds: 5
  periodSeconds: 10
livenessProbe:
  httpGet:
    port: 8786
    path: /health
  initialDelaySeconds: 15
  periodSeconds: 20
If you check the type definition of the probes, e.g. the Python client definition at https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/V1HTTPGetAction.md, you will notice that the port is of type object and accepts either a string or an integer. See also the Kubernetes docs: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-a-liveness-http-request
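To make the expected behaviour concrete, here is a minimal sketch of the int-or-string semantics those definitions describe. It is only an illustration, not the Kubernetes validation code: `is_int_or_string_port` is a hypothetical helper, and the 1 to 65535 bound is the usual TCP port range, assumed here.

```python
def is_int_or_string_port(value):
    """Return True if value is acceptable as an int-or-string port.

    Hypothetical helper for illustration only: integers are treated as
    port numbers, strings as references to a named container port.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool):
        return False
    if isinstance(value, int):
        # Assumed range: the usual TCP port range.
        return 1 <= value <= 65535
    if isinstance(value, str):
        # Any non-empty string is treated as a port name in this sketch.
        return len(value) > 0
    return False


print(is_int_or_string_port("http-dashboard"))  # named port
print(is_int_or_string_port(8786))              # numeric port
```

Under these semantics both forms of the probe port shown above would be accepted, which is what the linked definitions suggest.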
Full example:
apiVersion: kubernetes.dask.org/v1
kind: DaskJob
metadata:
  name: simple-job
  namespace: default
spec:
  job:
    spec:
      containers:
        - name: job
          image: "ghcr.io/dask/dask:latest"
          imagePullPolicy: "IfNotPresent"
          args:
            - python
            - -c
            - "from dask.distributed import Client; client = Client(); # Do some work..."
  cluster:
    spec:
      worker:
        replicas: 2
        spec:
          containers:
            - name: worker
              image: "ghcr.io/dask/dask:latest"
              imagePullPolicy: "IfNotPresent"
              args:
                - dask-worker
                - --name
                - $(DASK_WORKER_NAME)
                - --dashboard
                - --dashboard-address
                - "8788"
              ports:
                - name: http-dashboard
                  containerPort: 8788
                  protocol: TCP
              env:
                - name: WORKER_ENV
                  value: hello-world # We dont test the value, just the name
      scheduler:
        spec:
          containers:
            - name: scheduler
              image: "ghcr.io/dask/dask:latest"
              imagePullPolicy: "IfNotPresent"
              args:
                - dask-scheduler
              ports:
                - name: tcp-comm
                  containerPort: 8786
                  protocol: TCP
                - name: http-dashboard
                  containerPort: 8787
                  protocol: TCP
              readinessProbe:
                httpGet:
                  port: 8786
                  path: /health
                initialDelaySeconds: 5
                periodSeconds: 10
              livenessProbe:
                httpGet:
                  port: 8786
                  path: /health
                initialDelaySeconds: 15
                periodSeconds: 20
              env:
                - name: SCHEDULER_ENV
                  value: hello-world
        service:
          type: ClusterIP
          selector:
            dask.org/cluster-name: simple-job
            dask.org/component: scheduler
          ports:
            - name: tcp-comm
              protocol: TCP
              port: 8786
              targetPort: "tcp-comm"
            - name: http-dashboard
              protocol: TCP
              port: 8787
              targetPort: "http-dashboard"
Anything else we need to know?:
The error during the submission:
spec.cluster.spec.scheduler.spec.containers[0].readinessProbe.httpGet.port: Invalid value: "integer": spec.cluster.spec.scheduler.spec.containers[0].readinessProbe.httpGet.port in body must be of type string: "integer"
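Since the error says the field must be a string, one possible stopgap is to refer to the port by its name instead of its number, as the documentation example does. The following is a minimal sketch under the assumption that the manifest has already been loaded into Python dicts; `name_probe_ports` is a hypothetical helper and not part of dask-kubernetes.

```python
def name_probe_ports(container):
    """Replace integer httpGet probe ports with the matching port name.

    Hypothetical helper, not part of dask-kubernetes. `container` is a
    dict shaped like one entry of `spec.containers` in the manifest.
    Ports without a matching named containerPort are left unchanged.
    """
    # Map each declared containerPort number to its name.
    names = {
        p["containerPort"]: p["name"]
        for p in container.get("ports", [])
        if "name" in p
    }
    for probe_key in ("readinessProbe", "livenessProbe"):
        http_get = container.get(probe_key, {}).get("httpGet")
        if http_get is None:
            continue
        port = http_get.get("port")
        if isinstance(port, int) and port in names:
            http_get["port"] = names[port]
    return container


# Scheduler container from the example above, reduced to the relevant keys.
scheduler = {
    "name": "scheduler",
    "ports": [
        {"name": "tcp-comm", "containerPort": 8786, "protocol": "TCP"},
        {"name": "http-dashboard", "containerPort": 8787, "protocol": "TCP"},
    ],
    "readinessProbe": {"httpGet": {"port": 8786, "path": "/health"}},
    "livenessProbe": {"httpGet": {"port": 8786, "path": "/health"}},
}

name_probe_ports(scheduler)
print(scheduler["readinessProbe"]["httpGet"]["port"])  # prints "tcp-comm"
```

This only sidesteps the validation error; it does not change the fact that integer ports are rejected.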
Environment:
- Dask version: latest
- Dask Kubernetes operator