dask-tutorial
Auto-setup cluster for users
We'd like to use the distributed dashboard without necessarily exposing the user to dask.distributed, Client, or schedulers. How can we do that?
Presumably, we can execute a bit of code in the background for the user via JupyterLab. We would need to:
1. Start a Dask cluster, either once when JupyterLab is launched, or each time a notebook is opened and a kernel is started.
2. Create a client in each kernel session.
Both are solved by dask-labextension, which anticipated this need.
For 1, you can create an initial cluster in the configuration.
For 2, you need to use the option that auto-starts a Dask client.
https://github.com/dask/dask-labextension#configuration-of-dask-cluster-management
Thanks. Looking into that now.
There are some active issues though: https://github.com/dask/dask-labextension/issues/84
See also https://github.com/pangeo-data/pangeo/issues/743
Thanks. I think I'm struggling a bit with how to "distribute" this config setting. I think this will be fine for running the tutorial on binder. We'll just dump
{
  "dask-labextension:plugin": {
    "autoStartClient": true
  }
}
in the right location.
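As a rough sketch of what "dumping" that setting could look like, the helper below writes an overrides.json into the settings folder of a JupyterLab application directory. This assumes the "right location" is `<app-dir>/settings/overrides.json`, which is where JupyterLab looks for overrides; the function name `write_overrides` and the `app_dir` argument are made up for illustration, and the application directory itself can be found with `jupyter lab path`.

```python
import json
from pathlib import Path

# Setting taken from the comment above, keyed by the plugin id.
OVERRIDES = {"dask-labextension:plugin": {"autoStartClient": True}}


def write_overrides(app_dir):
    """Write overrides.json into <app_dir>/settings and return its path.

    Hypothetical helper: app_dir is assumed to be a JupyterLab
    application directory (see `jupyter lab path`).
    """
    settings_dir = Path(app_dir) / "settings"
    settings_dir.mkdir(parents=True, exist_ok=True)
    target = settings_dir / "overrides.json"

    # Merge into any overrides that already exist instead of replacing them.
    existing = json.loads(target.read_text()) if target.exists() else {}
    for plugin, values in OVERRIDES.items():
        existing.setdefault(plugin, {}).update(values)

    target.write_text(json.dumps(existing, indent=2) + "\n")
    return target
```

On binder this could be called from a post-build step; locally it would need write access to the application directory, which is part of why a per-repository option is attractive.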
I'm struggling a bit more with the case where users are running this locally. @ian-r-rose, do you know if it's possible to override jupyterlab settings from the command line? Ideally, we just have overrides.json in this repository, and instruct people running it locally to start their server with jupyter lab --settings=overrides.json (--settings doesn't exist though). There is a --config though.
Hmm, we could make the tutorial a jupyterlab "app" and have them run jupyter lab --app-dir=. or something like that. Will see if it's possible to do that.
@TomAugspurger I don't think there is a good way to override specific settings from the CLI using a JSON file. You can, however, point to a custom settings directory, either by setting a new "app" directory as you suggest, or by setting the JUPYTERLAB_SETTINGS_DIR environment variable: https://github.com/jupyterlab/jupyterlab/blob/a1fc14a5971c780a385ed994c1fa99e9d3ed6730/jupyterlab/commands.py#L122-L129
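For the environment-variable route, a minimal sketch of a launcher might look like the following. It only builds the command and environment for starting JupyterLab with JUPYTERLAB_SETTINGS_DIR pointing at a directory kept in the repository; the function name `lab_launch_spec` and the directory name used in the usage note are hypothetical, and the layout of files expected inside that directory is not shown here.

```python
import os
from pathlib import Path


def lab_launch_spec(settings_dir, base_env=None):
    """Return (command, environment) for starting JupyterLab with a
    custom user-settings directory.

    Hypothetical helper: it copies the given (or current) environment
    and sets JUPYTERLAB_SETTINGS_DIR to the absolute settings path.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["JUPYTERLAB_SETTINGS_DIR"] = str(Path(settings_dir).resolve())
    return ["jupyter", "lab"], env
```

A local user could then start the server with something like `cmd, env = lab_launch_spec("tutorial-settings")` followed by `subprocess.run(cmd, env=env, check=True)`, where `tutorial-settings` stands in for whatever directory the repository ships.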
And the restart issue has now been bumped in my personal queue.