dask-gateway
question: granular cluster resource limits based on user / Jupyter RBAC
Hi,
dask-gateway allows setting resource limits on the clusters that users can create (https://gateway.dask.org/resource-limits.html), but those limits are global for all users. In a multi-tenant deployment it is often the case that there are various user groups to which different limits should apply.
JupyterLab 2.0 introduces some notion of RBAC-based authentication, and I was wondering whether that could be used to set more granular limits in dask-gateway?
perhaps @consideRatio has thoughts here as well
Thanks for opening this issue.
The resource-limit settings in dask-gateway are actually per-cluster (e.g. max cores per cluster). Setting them as described in that doc sets global defaults, but those can be overridden per user via an options handler. The handler takes any options specified by the user when creating the cluster, along with the user object itself (see here), and returns a new set of options to apply to that cluster. This flexibility allows for defining whatever per-user/group rules you want, without having to bake those into dask-gateway itself. For example:
```python
from dask_gateway_server.options import Options

def options_handler(options, user):
    # Users in the `power-users` group get bigger clusters with higher limits
    if "power-users" in user.groups:
        return {
            "worker_cores": 8,
            "worker_memory": "16 G",
            "cluster_max_workers": 100,
        }
    else:
        return {
            "worker_cores": 4,
            "worker_memory": "8 G",
            "cluster_max_workers": 10,
        }

c.Backend.cluster_options = Options(handler=options_handler)
```
Note that when authenticating with JupyterHub, the user's `.groups` field mirrors that of JupyterHub.
Excellent, this is exactly what we need! Is there an option to set a max cluster lifetime after which the cluster will be culled?
There's idle_timeout (https://gateway.dask.org/api-server.html#c.ClusterConfig.idle_timeout), a maximum time a cluster can sit idle (unused) before it's culled, but there isn't a total maximum runtime for the cluster itself. That wouldn't be too tricky to add, though, if it'd be useful for you. File an issue if so.
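For reference, the idle timeout goes in the gateway config file alongside the options handler above; the one-hour value here is just an example:

```python
# Cull clusters after one hour of inactivity (value is in seconds)
c.ClusterConfig.idle_timeout = 3600
```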