Listing DaskGateway clusters created via Python code alongside those created via the dask labextension UI

Open consideRatio opened this issue 4 years ago • 9 comments

What happened:

I can create a dask-gateway cluster via the dask-labextension view, and it then shows up in the list of clusters there.

[Screenshot: starting-new-dask-cluster]

But if I create a dask-gateway cluster from a notebook using code like the following, it does not show up in the list of clusters.

from dask_gateway import Gateway
gateway = Gateway()
cluster = gateway.new_cluster()

My wish

My wish is that every dask cluster I've created is listed in the dask-labextension view, regardless of how it was created. I'm not sure whether this is possible, but I'd like to describe the wish here so we can explore whether it can be made to happen one way or another.

Environment:

JupyterHub (1.1.1 Helm chart) + Dask-Gateway (0.9.0 Helm chart).

$ conda list | grep dask
dask                      2021.6.0           pyhd8ed1ab_0    conda-forge
dask-core                 2021.6.0           pyhd8ed1ab_0    conda-forge
dask-gateway              0.9.0            py38h578d9bd_0    conda-forge
dask-glm                  0.2.0                      py_1    conda-forge
dask-kubernetes           2021.3.1           pyhd8ed1ab_0    conda-forge
dask-labextension         5.0.2              pyhd8ed1ab_0    conda-forge
dask-ml                   1.9.0              pyhd8ed1ab_0    conda-forge
pangeo-dask               2021.06.05           hd8ed1ab_0    conda-forge
$ python --version
Python 3.8.10

Operating System: Ubuntu 20.04
Install method: conda-forge

# The current environment and dask configuration via environment
DASK_DISTRIBUTED__DASHBOARD_LINK=/user/{JUPYTERHUB_USER}/proxy/{port}/status
DASK_GATEWAY__ADDRESS=http://10.100.116.39:8000/services/dask-gateway/
DASK_GATEWAY__AUTH__TYPE=jupyterhub
DASK_GATEWAY__CLUSTER__OPTIONS__IMAGE={JUPYTER_IMAGE_SPEC}
DASK_GATEWAY__PROXY_ADDRESS=gateway://traefik-prod-dask-gateway.prod:80
DASK_GATEWAY__PUBLIC_ADDRESS=/services/dask-gateway/
DASK_LABEXTENSION__FACTORY__CLASS=GatewayCluster
DASK_LABEXTENSION__FACTORY__MODULE=dask_gateway
DASK_ROOT_CONFIG=/srv/conda/etc

consideRatio avatar Jul 28 '21 01:07 consideRatio

Thanks for raising this, @consideRatio. In general, this is a hard problem, as dask doesn't really have a built-in cluster discovery method. Short of port sniffing, I don't know of a good way to auto-detect any cluster in a given notebook (or set of notebooks). Indeed, part of the reason for creating the cluster manager sidebar in the first place was to be able to build some user interfaces around starting, stopping, and scaling clusters that the extension can actually keep track of and reason about.

That being said, my goal for this extension is to get out of the game of managing clusters directly, and instead investigate a solution like dask-ctl. This could allow different cluster providers to set up their own discovery and control services, which the labextension could then consume. There is some detailed discussion of this in #189, and I encourage you to weigh in!
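
For Dask-Gateway specifically, the gateway server itself tracks the clusters it has started, so they can at least be enumerated from Python even though the sidebar does not show them. The following is a minimal sketch of that workaround, not something the labextension does. It assumes the dask-gateway client's `Gateway.list_clusters()` method and the `name` and `dashboard_link` attributes on the reports it returns; `summarize_clusters` is a hypothetical helper name.

```python
def summarize_clusters(gateway):
    """Return (name, dashboard_link) pairs for the clusters a gateway reports.

    `gateway` is assumed to be a dask_gateway.Gateway instance (or anything
    exposing a compatible list_clusters() method).
    """
    return [
        (report.name, report.dashboard_link)
        for report in gateway.list_clusters()
    ]


# Usage against a live deployment (needs dask-gateway installed and a
# reachable gateway, so it is left commented out here):
#
#   from dask_gateway import Gateway
#   gateway = Gateway()
#   for name, link in summarize_clusters(gateway):
#       print(name, link)
#   cluster = gateway.connect(name)  # reattach to an existing cluster
```

This only covers clusters known to one gateway for the current user. It does not address discovery of arbitrary clusters, which is the harder problem described above.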

ian-r-rose avatar Jul 28 '21 09:07 ian-r-rose

I would really like dask-ctl to be the solution for this.

jacobtomlinson avatar Jul 28 '21 11:07 jacobtomlinson

Related question: how do you configure the lab extension to use dask-gateway for creating new clusters? I can't find that anywhere in the docs, but clearly from the screenshot above it is possible.

dharhas avatar Apr 13 '23 18:04 dharhas

@dharhas I haven't tried it recently myself, but the configuration that @consideRatio posted above looks like the correct approach to me (though it could also be configured using a yml file or what have you):

# The current environment and dask configuration via environment
DASK_DISTRIBUTED__DASHBOARD_LINK=/user/{JUPYTERHUB_USER}/proxy/{port}/status
DASK_GATEWAY__ADDRESS=http://10.100.116.39:8000/services/dask-gateway/
DASK_GATEWAY__AUTH__TYPE=jupyterhub
DASK_GATEWAY__CLUSTER__OPTIONS__IMAGE={JUPYTER_IMAGE_SPEC}
DASK_GATEWAY__PROXY_ADDRESS=gateway://traefik-prod-dask-gateway.prod:80
DASK_GATEWAY__PUBLIC_ADDRESS=/services/dask-gateway/
DASK_LABEXTENSION__FACTORY__CLASS=GatewayCluster
DASK_LABEXTENSION__FACTORY__MODULE=dask_gateway
DASK_ROOT_CONFIG=/srv/conda/etc

In particular, the factory class and factory module options tell the labextension what to use when starting a new cluster.
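
To make the role of those two variables concrete, here is a rough sketch of how `DASK_`-prefixed environment variables turn into nested configuration keys, and how a factory module/class pair can then be resolved to a Python class. It is an illustration under stated assumptions, not dask's or the labextension's actual code: `env_to_config` and `resolve_factory` are hypothetical helpers, and real dask additionally parses values as Python literals and treats hyphens and underscores in key names as equivalent.

```python
import importlib


def env_to_config(environ):
    """Mimic dask's naming convention: DASK_A__B=value -> {"a": {"b": value}}."""
    config = {}
    for name, value in environ.items():
        if not name.startswith("DASK_"):
            continue  # only DASK_-prefixed variables are configuration
        # Drop the prefix, lowercase, and split on double underscores.
        keys = name[len("DASK_"):].lower().split("__")
        node = config
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = value
    return config


def resolve_factory(config):
    """Import the configured module and return the configured class from it."""
    factory = config["labextension"]["factory"]
    module = importlib.import_module(factory["module"])
    return getattr(module, factory["class"])


example_env = {
    "DASK_LABEXTENSION__FACTORY__MODULE": "dask_gateway",
    "DASK_LABEXTENSION__FACTORY__CLASS": "GatewayCluster",
}
config = env_to_config(example_env)
print(config["labextension"]["factory"])
# prints {'module': 'dask_gateway', 'class': 'GatewayCluster'}
```

With dask-gateway installed, `resolve_factory(config)` would return the `GatewayCluster` class, which is what gets instantiated when a new cluster is requested from the sidebar.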

ian-r-rose avatar Apr 14 '23 19:04 ian-r-rose