dask
                                
                                 dask copied to clipboard
                                
                                    dask copied to clipboard
                            
                            
                            
                        Add new CLI that is extensible
New CLI discussed at SciPy sprint
This creates a new dask command line tool that can be augmented by external packages (e.g. distributed)  with a [dask_cli] entry point. The new tool can be used with either
$ dask
or
$ python -m dask
Adding an entry point is pretty straightforward; we'll use distributed as an example:
In an external-to-dask module at the location: distributed.cli.dask_scheduler with a function called main we can add to pyproject.toml (PEP-621):
[project.entry-points."dask_cli"]
scheduler = "distributed.cli.dask_scheduler.main"
or an "old school" setup.py or setup.cfg
[options.entry_points]
dask_cli =
    scheduler = distributed.cli.dask_scheduler:main
The function needs to be a click command:
# in a distributed/distributed/cli/dask_scheduler.py file
@click.command(name="scheduler")
@click.option(...)
@click.option(...)
def main():
    ...
Since the scheduler = ... entry point under the dask_cli section is defined in the distributed package; upon import of dask.cli the run_cli function will discover the external scheduler command. This allows use of
$ dask scheduler
or
$ python -m dask scheduler
Related PRs in distributed:
- https://github.com/dask/distributed/pull/6735
- https://github.com/dask/distributed/pull/6738
pkg_resources is part of setuptools  (news to me!) and its use is "discouraged" in favor of importlib from the stdlib. (so developers can use whatever packaging library they prefer, not locked into setuptools).
Hey @jrbourbeau! The motivation is a bit multidimensional. In general a group of Dask contributors that were at the SciPy sprint last week discussed improving CLI based on some feedback and a desire to bring CLI tooling under one extensible umbrella. @jacobtomlinson adds some detail on a PR in distributed here: https://github.com/dask/distributed/pull/6738#issuecomment-1188767573
From my perspective this needs docs, and then we should drop anything that isn't implemented (info). I'm good to merge after that.
Matthew Rocklin @.***> writes:
From my perspective this needs docs, and then we should drop anything that isn't implemented (info). I'm good to merge after that.
OK that also sounds good to me
we should drop anything that isn't implemented (info).
Just for reference, info is actually being used as a group that can have command children. So far we are using it for
$ dask info versions
(which is calling dask.utils.show_versions). We can add more commands under the info umbrella, and so can other libraries/apps with
from dask.cli import info
import click
@click.command(name="fun")
def fun():
    print(42)
    
info.add_command(fun)
to get
$ dask info fun
42
Do we want to make click a hard dependency?
I don't think we should. I'd be interested to explore how much work it would be to also support existing tools that just use argparse, is there some kind of type check we can do and somehow wrap things if they aren't click? It'd be nice for subprojects to choose whatever CLI tool they like, they might want to try typer for example.
pkg_resources is part of setuptools (news to me!) and its use is "discouraged" in favor of importlib from the stdlib. (so developers can use whatever packaging library they prefer, not locked into setuptools).
Ooh this is news to me too! We should definitely switch out pkg_resources in dask-ctl too.
I'd be interested to explore how much work it would be to also support existing tools that just use
argparse, is there some kind of type check we can do and somehow wrap things if they aren't click? It'd be nice for subprojects to choose whatever CLI tool they like, they might want to try typer for example.
Typer apps can be used in Click apps, but I haven't been able to find anything that connects argparse and Click. Since the existing tools already use Click, I'm inclined to keep that as the foundation. If we don't make Click a hard dependency, perhaps we can make $ dask usage without click just print a you need to pip or conda install click message.
Awesome!
Since the existing tools already use Click
Not all of them do. Dask Yarn and Dask Gateway both have argparse CLI tools.
We could also use Click and not make it a required dependency. If it wasn't installed we would raise an informative error message.
On Tue, Jul 26, 2022 at 4:25 AM Jacob Tomlinson @.***> wrote:
Typer apps can be used in Click apps https://typer.tiangolo.com/tutorial/using-click/#including-a-typer-app-in-a-click-app
Awesome!
Since the existing tools already use Click
Not all of them do. Dask Yarn and Dask Gateway both have argparse CLI tools.
— Reply to this email directly, view it on GitHub https://github.com/dask/dask/pull/9283#issuecomment-1195238733, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACKZTACNIZTS5V6PXHQIZTVV6VHVANCNFSM53YJXTGA . You are receiving this because you commented.Message ID: @.***>
I think the challenge here is dispatching to non-Click modules.
OK. I'll pull back then. Mostly what I wanted to communicate is "I think that we somewhat comfortable with common dependencies". If that's not the issue then please ignore.
On Tue, Jul 26, 2022 at 8:11 AM Jacob Tomlinson @.***> wrote:
I think the challenge here is dispatching to non-Click modules.
— Reply to this email directly, view it on GitHub https://github.com/dask/dask/pull/9283#issuecomment-1195465604, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACKZTE4CE5P6G4KD3K2UVLVV7PZJANCNFSM53YJXTGA . You are receiving this because you commented.Message ID: @.***>
Yeah totally, thanks for the clarification. I think @jcrist was concerned that we would need to update existing CLI tools to work with this. I agree that however this is implemented should integrate with existing tools just by adding the entrypoint and not require converting everything to Click or something.
Not all of them do. Dask Yarn and Dask Gateway both have argparse CLI tools.
Ah I did not realize more than what's in distributed exists, thanks for pointing that out! It looks like it would not be too difficult to convert the CLI in dask-yarn to Click. I wasn't able to find argparse use in dask-gateway. After more digging I'm still striking out at finding anything on click consuming existing argparse objects. I haven't tested this, but perhaps dask-yarn can create a dask yarn click command that just calls dask-yarn.
At this point the PR has the following behavior for two scenarios where click isn't installed. I'm also happy to remove these in favor of just making Click a dependency.
- Try to run dask:
(scratch) ddavis@charm ~/software/repos/dask (new-cli) $ dask
The dask command requires click to be installed.
Install with conda or pip:
 conda install click
 pip install click
- Try to import the dask.climodule (a situation where someone wants to expand the CLI):
>>> import dask.cli
Traceback (most recent call last):
  File "/Users/ddavis/software/repos/dask/dask/cli.py", line 7, in <module>
    import click
ModuleNotFoundError: No module named 'click'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/ddavis/software/repos/dask/dask/cli.py", line 15, in <module>
    raise ImportError(msg) from e
ImportError: The dask.cli module requires click to be installed.
Install with conda or pip:
 conda install click
 pip install click
Of course if someone has used pip install dask[complete] or pip install dask[distributed] they won't see these errors.
Yeah @jcrist voiced concerns about having to do that, I'm just echoing them here as he might not have time to engage here.
Gateway starts a Traitlets application from its CLI command, which uses argparse.
It would be really nice if we could support this without enforcing Click on those projects.
I'm happy with the dask command requiring Click and the import errors seem totally fine, but maybe we can find a way to do the wrapping here instead of asking other projects to do it.
Alright I think this is ready for review. The partner PR (dask/distributed#6735) should go in with this one, but it relies on installing my dask branch associated with this PR (I've changed all of the CI to use dask at version douglasdavis/dask@new-cli). I'm guessing this situation requires:
- This PR is approved and ready to go (don't merge but also don't make any more changes)
- Then review dask/distributed#6735
- Remove dask/distributed#6735 dependence on this PR's branch
- Merge both PRs together
thanks @jacobtomlinson! Added a few commits to address those comments