dask-kubernetes

Native Kubernetes integration for Dask

Results 114 dask-kubernetes issues

The Dask scheduler can take a while to retire workers using `/api/v1/retire_workers`. Since this operation is synchronous, from httpx's perspective the server is hanging up the connection, thus producing a timeout...

bug
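The failure mode described above can be illustrated without a cluster. The following is a minimal stdlib-only sketch, not the operator's code: `retire_workers` and `call_with_timeout` are hypothetical stand-ins, and `asyncio.wait_for` plays the role of the HTTP client's timeout. It only shows that a call which responds after the work is finished will time out whenever the client deadline is shorter than the work itself.

```python
import asyncio


async def retire_workers(duration: float) -> str:
    # Hypothetical stand-in for the scheduler call: it responds only once
    # retirement has finished, which takes `duration` seconds.
    await asyncio.sleep(duration)
    return "retired"


async def call_with_timeout(duration: float, timeout: float) -> str:
    # The client gives up if no response arrives within `timeout` seconds.
    try:
        return await asyncio.wait_for(retire_workers(duration), timeout=timeout)
    except asyncio.TimeoutError:
        return "timeout"


# Slow retirement with a short client deadline, then the same work with a longer one.
print(asyncio.run(call_with_timeout(0.2, timeout=0.05)))
print(asyncio.run(call_with_timeout(0.2, timeout=1.0)))
```

Raising the client deadline is only one possible mitigation; whether the operator should do that or make the request asynchronous is left open by the issue.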

Currently, the operator has a hardcoded TCP protocol: https://github.com/dask/dask-kubernetes/blob/92714da5785709726f85c4c6ec92451f5c23ad04/dask_kubernetes/operator/controller/controller.py#L155-L159

enhancement
help wanted
operator
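One way the protocol could be made configurable is sketched below. This is an assumption-laden illustration rather than the code at the linked lines: the function name `scheduler_address`, its parameters, and the `service.namespace` host form are all hypothetical. The default port 8786 is Dask's standard scheduler port.

```python
def scheduler_address(
    service: str,
    namespace: str,
    port: int = 8786,
    protocol: str = "tcp",
) -> str:
    # Build a Dask-style address; `protocol` is a parameter instead of a
    # hardcoded "tcp" so that e.g. "tls" or "ws" can be chosen by the caller.
    return f"{protocol}://{service}.{namespace}:{port}"


print(scheduler_address("dask-scheduler", "default"))
print(scheduler_address("dask-scheduler", "default", protocol="tls"))
```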

Currently, the operator retires workers using the HTTP or RPC APIs; however, those only control the connected Dask workers. The operator should take into account Dask's Kubernetes worker pods that...

enhancement
help wanted
operator

Implement a way to set a cool-down period for adaptive scaling instead of the hardcoded value at https://github.com/dask/dask-kubernetes/blob/92714da5785709726f85c4c6ec92451f5c23ad04/dask_kubernetes/operator/controller/controller.py#L816, e.g.

```yaml
apiVersion: kubernetes.dask.org/v1
kind: DaskCluster
metadata:
  annotations:
    kubernetes.dask.org/cooldown-until-interval: "30s"
  name: dask-f3a0c12f
  namespace: default
...
```

enhancement
help wanted
operator
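A rough sketch of how an operator might read such an annotation follows. It is not the project's implementation: the helper names `parse_cooldown` and `cooldown_from_annotations` are hypothetical, the annotation key is the one proposed in the issue, the accepted units (`s`, `m`, `h`) are an assumption, and the caller must supply the fallback because the current hardcoded value is not stated here.

```python
import re

# Assumed unit suffixes; the issue only shows "30s".
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}

# Annotation key as proposed in the issue.
COOLDOWN_ANNOTATION = "kubernetes.dask.org/cooldown-until-interval"


def parse_cooldown(value: str, fallback: float) -> float:
    # Accept strings such as "30s" or "2m"; anything else yields the fallback.
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([smh])", value.strip())
    if match is None:
        return fallback
    number, unit = match.groups()
    return float(number) * _UNIT_SECONDS[unit]


def cooldown_from_annotations(annotations: dict, fallback: float) -> float:
    # Use the annotation when present, otherwise keep the existing behaviour.
    raw = annotations.get(COOLDOWN_ANNOTATION)
    return fallback if raw is None else parse_cooldown(raw, fallback)


print(cooldown_from_annotations({COOLDOWN_ANNOTATION: "30s"}, fallback=10.0))
```

Falling back on malformed input rather than raising is a design choice made for the sketch; an operator might prefer to reject the resource and surface the error instead.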

I'm trying to set up a simple DaskAutoscaler on Kubernetes using YAML files, but the autoscaler fails to be created with the following error:

```bash
Error Logging 45s kopf...
```

bug
operator

**Describe the issue**: Although the cluster specification suggests `int_or_type`, using integer probes raises an error. Here's an example based on the documentation where the port `http_dashboard` is...

Right now our CRDs are at `v1` and haven't changed in a breaking way since they were introduced. However, we are now at the point where we might want to...

operator

**Describe the issue**: As far as I know, this happened without so much as updating a dependency. When creating a KubeCluster, I get a stack trace saying my service account...

needs info

Not all cluster auth providers support refresh tokens. When running out of cluster, `KubeCluster` fails to instantiate without one while parsing the Kubernetes configuration file. It would be helpful if there were...

bug
operator

When you create a DaskCluster (or other CR), the labels get propagated to the child resources. However, if you update a DaskCluster with a label after it was created the...

operator
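Propagating labels on update amounts to computing which parent labels a child resource is missing or holds with a stale value. The sketch below shows that comparison on plain dictionaries; `label_patch` is a hypothetical helper, not part of dask-kubernetes, and it deliberately leaves labels that exist only on the child untouched.

```python
def label_patch(parent_labels: dict, child_labels: dict) -> dict:
    # Labels present on the parent that the child lacks or has with a
    # different value; this is what would need to be patched onto the child.
    return {
        key: value
        for key, value in parent_labels.items()
        if child_labels.get(key) != value
    }


parent = {"team": "ml", "env": "prod"}
child = {"team": "ml", "pod-template-hash": "abc123"}
print(label_patch(parent, child))
```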