dask-kubernetes

Native Kubernetes integration for Dask

Results 114 dask-kubernetes issues

We are currently running Dask on Kubernetes. At the moment we scale the number of workers using metrics from the scheduler's Prometheus endpoint, which return information about the...

Previously, if the service type was a NodePort, KubeCluster would attempt to discover the IP address by listing nodes. This is a set of permissions that not all users will...

This library manages pods directly via the Kubernetes API. This is a design decision we made for [many reasons](https://github.com/dask/dask-kubernetes/issues/168#issuecomment-517210364). People often ask why we didn't use a different resource type,...

enhancement
help wanted

**Describe the issue**: I'm consistently getting "kubectl port forward failed" from `port_forward_service` using the standard `KubeCluster` class to create an ad hoc cluster. When this happens the port forward is...

bug
kubecluster (classic)
needs info

Would it be possible to track the status of a job in the toplevel `DaskJob` CR? This would have the advantage of hiding the "implementation details" of the job from...

Installing the operator means applying 4 manifests (5 with #451). _Source: https://kubernetes.dask.org/en/latest/operator_installation.html_ Given that we are [automatically building the CRDs from templates](https://github.com/dask/dask-kubernetes/blob/main/ci/pre-commit-crd.py) we could extend this to finally concatenate all...

enhancement
help wanted
good first issue
operator
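The snippet above is cut off, so the exact plan is not visible here. Assuming the idea is to ship one combined manifest instead of several, a minimal sketch of the concatenation step could look like the following. The function name `concatenate_manifests` and the file names in the usage note are hypothetical and are not taken from the dask-kubernetes repository.

```python
from pathlib import Path
from typing import Iterable


def concatenate_manifests(paths: Iterable[Path]) -> str:
    """Join several YAML manifest files into one multi-document YAML string.

    Hypothetical helper for illustration only; not part of dask-kubernetes.
    """
    documents = []
    for path in paths:
        text = Path(path).read_text().strip()
        # Drop a leading document separator so separators are not doubled.
        if text.startswith("---"):
            text = text[3:].lstrip()
        # Skip files that are empty after stripping.
        if text:
            documents.append(text)
    # YAML separates documents in a single stream with a line containing '---'.
    return "\n---\n".join(documents) + "\n"
```

The result could then be written to a single file (for example a hypothetical `operator.yaml`) and applied with one `kubectl apply -f` call rather than one call per manifest.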

Is it possible to change the service name? I'm having issues trying to connect my workers and other containers to the scheduler using dask-scheduler:8786. The dask-scheduler:8786 value is hardcoded in...

question
operator
needs info

The CI is taking nearly 30 minutes at the moment, so this experiments with running tests in parallel to bring that time down. Initial timings: | Suite | Time (approx) | |...

**What happened**: KubeCluster times out when creating a cluster with a NodePort service because it looks for the scheduler on port 8786 when the service is actually exposed on a randomized port (e.g. 32367). Note, the...

bug
kubecluster (classic)
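The port mismatch described in the NodePort report above can be illustrated with a small sketch. It works on a plain dictionary shaped like a Kubernetes Service manifest (`spec.type`, `spec.ports[].port`, `spec.ports[].nodePort`) and does not call the Kubernetes API. The function name `scheduler_port_for_service` and the port name `tcp-comm` are assumptions made for this example; this is not how `KubeCluster` itself resolves the address.

```python
from typing import Any, Dict

DEFAULT_SCHEDULER_PORT = 8786  # the port the issue says is being dialled


def scheduler_port_for_service(
    service: Dict[str, Any], port_name: str = "tcp-comm"
) -> int:
    """Pick the port an external client should use for a scheduler Service.

    Hypothetical helper: `service` is a dict shaped like a Service manifest,
    and `port_name` is an assumed name for the scheduler comm port.
    """
    spec = service.get("spec", {})
    for port in spec.get("ports", []):
        if port.get("name") != port_name:
            continue
        # For NodePort services, the externally reachable port is `nodePort`,
        # which Kubernetes assigns from its node port range unless one is set.
        if spec.get("type") == "NodePort" and "nodePort" in port:
            return port["nodePort"]
        # Otherwise fall back to the in-cluster service port.
        return port.get("port", DEFAULT_SCHEDULER_PORT)
    return DEFAULT_SCHEDULER_PORT


service = {
    "spec": {
        "type": "NodePort",
        "ports": [{"name": "tcp-comm", "port": 8786, "nodePort": 32367}],
    }
}
print(scheduler_port_for_service(service))
```

With the example values from the report, the helper returns the node port rather than 8786, which is the distinction the issue describes.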

Many tests start the kopf controller via the `kopf_runner` fixture and perform work within the context manager that it provides.

```python
def test_foo(kopf_runner):
    # Start the controller
    with kopf_runner:
        #...
```

bug
operator