dask-gateway

Document generic install, configuration points

Open mrocklin opened this issue 4 years ago • 12 comments

There is mention of the dask_gateway_config.py file in various parts of the documentation, but I don't think there is any actual mention of how to create this file, or where it should live. Grepping in the codebase I found that this file could be generated from the CLI and was able to make progress, but it might be useful to others to add a note in the Configuration doc page.

mrocklin avatar Mar 16 '20 21:03 mrocklin

Because the admin practices for each backend are different (e.g. on Kubernetes, admins would never write that file; they'd use Helm), there hasn't been a need to document the dask_gateway_config.py file directly. We've been relying on the admin walkthrough guides. I could see the use in documenting more internal specifics for a general install, though. Is something like https://gateway.dask.org/install-hadoop.html#configure-dask-gateway-server (and the following steps) sufficient?

jcrist avatar Mar 16 '20 21:03 jcrist

Also note - most of the admin-side documentation is out of date since the rewrite. The process for everything non-kubernetes should remain the same, but the actual parameter field names to configure have changed.

jcrist avatar Mar 16 '20 21:03 jcrist

We also document every configurable field here (although these are also out of date since the rewrite): https://gateway.dask.org/api-server.html

jcrist avatar Mar 16 '20 21:03 jcrist

Is something like https://gateway.dask.org/install-hadoop.html#configure-dask-gateway-server (and the following steps) sufficient?

Yes, I think so. In the first few lines there it says where that file should live (/etc/...) so it's clear to me as a novice user that I should make an empty file there and copy-paste things in if I want to change them.

I ran into this when I was looking at https://gateway.dask.org/authentication.html#simple-authentication-for-testing and didn't know where the file was supposed to be that it was referring to. I then searched for and found the Configuration page.

(Also, unrelatedly, it's not clear how to use dummy auth from the user's side, although this may also be low priority given how it's unlikely to be useful in production. Happy to raise a separate issue if you think this is worth reporting.).

mrocklin avatar Mar 16 '20 21:03 mrocklin

Your case is a bit odd because you're looking at extending dask-gateway, and we have no docs about that. Our docs are mostly focused on users and admins, with walkthroughs for common user profiles. We should add a new page meeting your needs.


Also, unrelatedly, it's not clear how to use dummy auth from the user's side

In master it's now called SimpleAuthenticator (dummy had bad connotations). From the user's side, they need to pass a BasicAuth object to dask_gateway.Gateway or dask_gateway.GatewayCluster. The auth method can be set once in gateway.yaml (https://gateway.dask.org/configuration-user.html#default-configuration) or programmatically by passing a BasicAuth object through the auth kwarg:

import dask_gateway
gateway = dask_gateway.Gateway(auth=dask_gateway.BasicAuth())

jcrist avatar Mar 16 '20 22:03 jcrist
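
For readers who want to pass explicit credentials rather than rely on the defaults, the programmatic route described above might look like the sketch below. The helper name connect_with_basic_auth, the address, and the credentials are placeholders chosen for illustration, not part of dask-gateway. With SimpleAuthenticator the password would need to match whatever the server was configured with.

```python
def connect_with_basic_auth(address, username, password):
    """Return a Gateway client that authenticates with basic auth."""
    # Imported lazily so this module loads even where dask-gateway
    # is not installed; the import happens on first call.
    import dask_gateway

    auth = dask_gateway.BasicAuth(username=username, password=password)
    return dask_gateway.Gateway(address, auth=auth)


# Usage (assumed values: a gateway server listening locally on port 8000):
#   gateway = connect_with_basic_auth("http://127.0.0.1:8000", "alice", "mypassword")
#   print(gateway.list_clusters())
```

Keeping the credentials as function arguments avoids hard-coding them; in practice they could come from environment variables or a prompt.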

Your case is a bit odd because you're looking at extending dask-gateway

I haven't gotten there yet. I'm still just poking around.

gateway = dask_gateway.Gateway(auth=dask_gateway.BasicAuth())

Yeah, I got there eventually. Just reporting up doc issues. Feel free to disregard (or perhaps now that we've recorded the solution that will suffice for others).

mrocklin avatar Mar 16 '20 22:03 mrocklin

I'm going to keep this open to remind me to better document non-standard installs and configuration points.

jcrist avatar Mar 16 '20 22:03 jcrist

I also have some questions related to the initial comment on this thread, around dask_gateway_config.py. I'm a novice user, so my questions are probably pretty naive.

I'm coming from having used the dask helm templates, but then having many different types of workers exploded my configs. In addition, I had a lot of trouble auto-scaling workers up/down using Kubernetes. Trying out dask-gateway was super simple -- simplified my configs, and my initial local tests with the auto-scaling were great.

My first question is around images. There is the gateway.backend.image in the helm repos, which seems to indicate the same image is used for worker and scheduler. Why does the scheduler use the same image as the worker? Is this the same image overridden by c.KubeClusterConfig.image?

What is the relationship between dask-worker and dask-gateway-worker? I've rolled my own images with dask-worker, but haven't come across dask-gateway-worker.

For thread-hostile work, I often have more cores than threads I expose to dask. How can I pass in '--nthreads' separately from worker_cores? Similarly, what if I'd like to pass --no-nanny and --resources? Should this be done by updating worker_cmd in cluster options?

Is it possible to create a heterogeneous cluster (workers of different types: bigmem, gpu, singlethreaded)?

chrisroat avatar Apr 29 '20 04:04 chrisroat

I'd also like to ask about this. I'm trying to expose cluster options for user configuration, and to my understanding, that should be done in the dask_gateway_config.py file. We are prototyping locally to offload our Kubernetes devops until we have a working solution to hand over to them. Based on what I could find in the documentation, I've tried putting dask_gateway_config.py both in ~/.config/dask and in /etc/dask-gateway, but I have not succeeded in getting the local gateway server to pick it up. A little documentation here would be helpful.

jontis avatar May 08 '20 08:05 jontis
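
For context on what "exposing cluster options" in dask_gateway_config.py can look like, here is a minimal config fragment. It is a sketch that assumes the post-rewrite field name c.Backend.cluster_options (older releases may use a different name) and uses illustrative limits and labels. The c object is supplied by dask-gateway-server when it loads the file, so the fragment is not runnable on its own.

```python
# dask_gateway_config.py (fragment). `c` is the config object that
# dask-gateway-server provides when it loads this file.
from dask_gateway_server.options import Options, Integer, Float


def options_handler(options):
    # Map the user-facing option values onto cluster config fields.
    return {
        "worker_cores": options.worker_cores,
        "worker_memory": "%fG" % options.worker_memory,
    }


# Assumed field name (post-rewrite); the bounds below are examples only.
c.Backend.cluster_options = Options(
    Integer("worker_cores", default=1, min=1, max=4, label="Worker Cores"),
    Float("worker_memory", default=1, min=1, max=8, label="Worker Memory (GiB)"),
    handler=options_handler,
)
```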

@jontis -- by default, dask-gateway-server looks for a dask_gateway_config.py file in the current working directory (.). You can also pass it a config file explicitly using -f, e.g.:

dask-gateway-server -f my_config.py

gforsyth avatar May 08 '20 13:05 gforsyth
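
To make the -f workflow above concrete for local prototyping, the sketch below writes a small config file and builds the matching command line. The helper names (write_config, server_command) are hypothetical, and the two config lines are just an example that enables the testing-only SimpleAuthenticator with a placeholder password; substitute whatever settings you need.

```python
from pathlib import Path

# Example settings only: enable the testing-only SimpleAuthenticator
# with a shared password (placeholder value).
EXAMPLE_CONFIG = '''\
c.DaskGateway.authenticator_class = "dask_gateway_server.auth.SimpleAuthenticator"
c.SimpleAuthenticator.password = "mypassword"
'''


def write_config(directory, body=EXAMPLE_CONFIG, name="my_config.py"):
    """Write a gateway config file into `directory` and return its path."""
    path = Path(directory) / name
    path.write_text(body)
    return path


def server_command(config_path):
    """Build the argv for starting the server with an explicit config file."""
    return ["dask-gateway-server", "-f", str(config_path)]


# Usage (not run here, since it starts a long-lived server process):
#   import subprocess, tempfile
#   cfg = write_config(tempfile.mkdtemp())
#   subprocess.run(server_command(cfg), check=True)
```

Passing the path explicitly sidesteps any guessing about which directory the server searches.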

@gforsyth How does one decide between -f my_config.py and using extraConfig in the yaml file?

chrisroat avatar May 27 '20 16:05 chrisroat

@chrisroat -- I would only use -f my_config.py directly if I'm making changes to dask-gateway-server and need to test and run locally. For deployments, extraConfig is much easier to edit (and in the end, the contents of the extraConfig block are appended to the dask_gateway_config.py file).

gforsyth avatar May 27 '20 17:05 gforsyth