sisl WIP: Converting `sisl_toolbox` to typer.

Related to #606.

This PR tries to convert the sisl_toolbox CLI to typer. More importantly, it tries to create a framework for sisl CLIs to be created from type hints. Following Nick's suggestion, the work is split in two steps:

[x] Convert the atom CLI.
[x] Convert the poisson CLI and merge it with the atom one.

How it looks

This is one of the strongest points of typer (thanks to rich). I think it's indisputable that it makes for a clearer and nicer looking CLI:

Before

Screenshot from 2023-08-09 16-22-10

Screenshot from 2023-08-09 16-29-24

After

Screenshot from 2023-08-09 16-22-36

Screenshot from 2023-08-09 16-29-35

How it works

While working out the details, I found some advantages and disadvantages:

Advantages

I think the code looks cleaner by declaring everything as a normal function with type hints (apart from the wrapper).
The function can be used from inside python very easily, not just from a CLI. This means that, for example, probably the same function will also be easy to use from an API.
With the wrapper, parsing of arguments will be centralized so the CLIs in sisl will all behave predictably and we can introduce complex parsing for custom types (e.g. Geometry, AtomsArgument) that all CLIs would use.

Disadvantages

I didn't particularly like the inner workings of click as they are a bit restrictive in some cases. For example, it doesn't allow an undefined number of values for options (it does for arguments) out of the box. E.g. stoolbox --plot wavefunction log is not allowed. They recommend instead stoolbox --plot wavefunction --plot log, which maybe is not that bad if you provide a shorthand for --plot, like -P, but I don't know.

Aug 09 '23 14:08 pfebrer

I have had a quick glance here.

An error is very hard to backtrack as the call stack traces to some obscure, try and do stoolbox log --l g --show.
We cannot use Union in argument types, this is a significant limitation IMHO. For instance it would be ideal if the --input could accept both str and Path, but this will fail. One should then change str to Path, which works nice.
I really like the clear option output, but I also dislike the shell completion options, it may be superfluous information, and I don't know about grouping this in another sub-category.
I agree that the other options are really nicely formatted etc. But perhaps this would be easier to accommodate through click, instead of going all the way to typer? Another point on this is that typer got its last commit in start of august, yet is used/starred by A LOT of projects. I am afraid the maintainer is drowning in requests from users who are not stepping up... :(

It just somehow feels a bit clunky to me, and I am afraid typer needs more maintainers to be sustainable... :(

I am thus a bit on the fence here... :(

Sep 20 '23 13:09 zerothi

Do you mean because it shows the values of the local variables? This can be deactivated by passing pretty_exceptions_show_locals=False to the typer app.
How would you support Union types? You would try to parse the types sequentially until you succeed? Or would you just go with the first one? Do you do this in argparse or is it something extra that you are requesting typer to do? Of course you can always implement some extra behaviour to handle Union types.
This can also be disabled by passing add_completion=False to the typer app.
If we went with click, that would mean we would need to implement a new typer 😅 You might as well do it with argparse then. It's true that the maintainer doesn't seem to be adding new features, but you could say the same about argparse, the last commit is from 3 months ago. I don't know if a CLI package needs that much updating/bug fixing...

Sep 21 '23 07:09 pfebrer

Ok, great
The main problem is that the tooltip helpers can be misleading. For instance I wanted the app to clarify that the input is a PATH like object. But perhaps this is a minor detail, it can be wrapped in a CLI method with the wanted tooltip helpers. I just don't know how it would look for something that accepts both strings and integers, say for direction we generally allow 'a' or 0 for the directions.
Ok.
True. :) But the point was that I can't foresee whether the typing hints will be preventive going further? Take for instance sgeom which has a large amount of options. As for argparse, it is a stdlib library, and they are restricted to a much larger extent.

My 2nd worry is that the sgeom and friends commands are order dependent. This is necessary for obvious reasons:

sgeom test.fdf --tile 2 2 --remove 2
sgeom test.fdf --remove 2 --tile 2 2

Can typer/click do this? If not, I am even more hesitant.. :(

And the current PR test, does not use the subcommand arguments, could you make this work? E.g. stoolbox atom Perhaps the above needs clarification before we endeavour into click...

I had a look here: https://click.palletsprojects.com/en/8.1.x/advanced/#callback-evaluation-order and there it seems that multiple options are collected into one argument and executed at the first invocation, hence:

sgeom test.fdf --remove 2 --tile 2 2 --remove 10

would be equivalent to

sgeom test.fdf --remove 2 --remove 10 --tile 2 2

:(

Sep 21 '23 07:09 zerothi

I think you can parse the options by order, but I don't know how to do it yet (I don't have time to look at it now).

How would you do it with raw argparse? With typer you would just probably need to define a specific parser for Union[str, int] or even the custom type Direction I don't think there's a generic solution that would fit all possible cases.
Hmm I don't think so, whatever typehint that we don't support we could automatically remove its argument (basically as you manually do now).

Sep 21 '23 17:09 pfebrer

And the current PR test, does not use the subcommand arguments, could you make this work? E.g. stoolbox atom

This is how it is implement it now with the atom-plot subcommand, the thing is that if there's only one subcommand by default typer uses it as a default. (but this can also be changed)

Sep 21 '23 17:09 pfebrer

I think you can parse the options by order, but I don't know how to do it yet (I don't have time to look at it now).
2. How would you do it with raw `argparse`? With typer you would just probably need to define a specific parser for `Union[str, int]` or even the custom type `Direction` I don't think there's a generic solution that would fit all possible cases.

In argparse you don't define the type (necessarily), you just parse it in the action. So you never define all the possibilities.

3. Hmm I don't think so, whatever typehint that we don't support we could automatically remove its argument (basically as you manually do now).

yeah, ok.

And the current PR test, does not use the subcommand arguments, could you make this work? E.g. stoolbox atom

This is how it is implement it now with the atom-plot subcommand, the thing is that if there's only one subcommand by default typer uses it as a default. (but this can also be changed)

Ok, that was not clear to me ;)

So for now the current blocker is the order of execution. Once that is cleared, we may consider this. :) Lets keep this open for now.

Sep 21 '23 18:09 zerothi

In argparse you don't define the type (necessarily), you just parse it in the action. So you never define all the possibilities.

You basically define them, but just in code that the user has no way to know how it works :) And you can have a different parser for very similar situations. So if you want to make it clear you need to document each argument.

The point of using typehints for that is to make it more clear by pairing one typehint to one parser. So for example Union[str, int] should always be parsed the same, and if you want "1" to be parsed as 1 you would need to type hint it as Direction. I think it is nice because it forces us to be more precise/correct with the typehints, which for sure will be benefitial (e.g. for debugging).

So for now the current blocker is the order of execution. Once that is cleared, we may consider this. :)

Ok, so you don't want to have a typer CLI for the toolbox while the main one is argparse, no?

By the way, what might be a blocker is that the way you have sgeom now arguments are actually functions that contain arguments themselves (only positional). I don't think that will be straightforward to automatically document. I have seen that typer can automatically chain subcommands. So if instead of actions being options (sgeom --remove) they were subcommands (sgeom remove), that would be more straightforward to implement and document and would allow keyword options for the actions. Haven't you missed at some point the possibility to have keyword arguments for the actions?

Sep 21 '23 19:09 pfebrer

I can't really figure this out. The idea of the toolboxes is that they are allowing external users to contribute toolboxes. And easily hook into the stoolbox CLI. If this gets too complicated, then that will never be adopted (it hasn't yet, so I don't have a good faith in this ever attracting attention, and using typer is a bit heavy for new users, documentation is limited, not a lot of help to get around).

I don't think using typer in toolboxes, but argparse in sisl is a good idea. Everything should be one CLI.

I am not set in stone on sgeom and changing their flags to subcommands. Say

sgeom read|geometry|G RUN.fdf tile -n 2 --dir x append block read RUN.fdf tile --dir x -n 2 endblock --dir x

or similar. But it would require substantial changes, and this is something I am a bit worried about...

I am not saying the current way is good, or better than your proposal. I am merely stating the effort to change is significant, with (IMHO) little gain.

I have just tried to play a bit with typer.

I can't find any API of the typer.Option? So I have no idea what to use, what is possible... I even searched on the doc site... Nothing... Looking around it for 2 hours is annoying... And the flags you mention to add, I can't find them? I searched for add_completion, no hits. The other one I get a hit, but I would have never found it if I didn't know how to search for it.
Is there a particular reason for wrapping typer.Option with CLIOption? What is the benefit here? Seems to me that you are just passing everything anyways. Also, your annotate_typer seems to not work when using typer.Option explicitly. I can see you are updating the annotation in all cases, why is this required? :) Wouldn't it be fine if there are options not annotated? typer could handle this.

I agree that the way things look in typer is better, but I am truly afraid of the transitioning state, and the limited gains. We could equally annotate the current https://github.com/zerothi/sisl/blob/f845e957f1540c56d3a46caac874568c99603d9e/src/sisl_toolbox/siesta/atom/_atom.py#L776 and pass arguments as keyword args to prepare for a smoother transition.

I am also worried about typer it self. The maintenance, see e.g. https://github.com/tiangolo/typer/issues/441 which would be applicable to us as well. tiangolo is very hard working, but he only has so much time, and FastAPI is very popular, his focus is likely there.

As it is, it is too much work... :(

Sep 22 '23 11:09 zerothi

For the record I tried something like this to get it to some state where I liked the help function:

diff --git a/src/sisl_toolbox/siesta/atom/_atom.py b/src/sisl_toolbox/siesta/atom/_atom.py
index d7b24ec2c..f77932f37 100644
--- a/src/sisl_toolbox/siesta/atom/_atom.py
+++ b/src/sisl_toolbox/siesta/atom/_atom.py
@@ -750,9 +750,25 @@ class AtomPlotOption(Enum):
     log = 'log'
     potential = 'potential'
 
+from typing import Union
+import click, typer
+#InputFile = TypeVar("File", str, Path)
+
+class InputFile:
+    def __init__(self, value: Union[str, Path]):
+        self.value = value
+    def __str__(self):
+        return f"<{self.__class__.__name__}: value={self.value}>"
+
+    @classmethod
+    def parse(cls, value):
+        """Parse argument to an input file"""
+        return cls(value)
+
 def atom_plot(
     plot: Annotated[Optional[List[AtomPlotOption]], CLIArgument()] = None, 
-    input: str = "INP", 
+    #input: str = "INP",
+    input: Annotated[InputFile, CLIOption(parser=InputFile.parse)] = "INP", 
     #plot: Optional[List[str]] = None, 
     l: str = 'spdf', 
     save: Annotated[Optional[str], CLIOption("-S", "--save")] = None, 
@@ -775,7 +791,7 @@ def atom_plot(
     """
     import matplotlib.pyplot as plt
 
-    input_path = Path(input)
+    input_path = Path(input.value)
     atom = AtomInput.from_input(input_path)
 
     # If the specified input is a file, use the parent

You'll note that the parse name gets duplicated in the help menu for the type. So the parser function name is directly used in the tooltip, a bit weird, but ok, not a problem per see.

Sep 22 '23 11:09 zerothi

I can't really figure this out. The idea of the toolboxes is that they are allowing external users to contribute toolboxes. And easily hook into the stoolbox CLI. If this gets too complicated, then that will never be adopted (it hasn't yet, so I don't have a good faith in this ever attracting attention, and using typer is a bit heavy for new users, documentation is limited, not a lot of help to get around).

I remember using argparse for the first time and honestly it was very confusing why I couldn't simply define a function and I had to instead do a lot of argparse calls in the global namespace. Therefore I don't think typer will be harder for new users (in fact I would say it would be easier).

I don't think using typer in toolboxes, but argparse in sisl is a good idea. Everything should be one CLI.

I don't understand why, since the code for them is clearly separated. sgeom and sdata are fairly straightforward and easy to understand, but the functionality in toolboxes is much more diverse and complex, so I think they would benefit much more from a more attractive CLI.

I have my basis optimizer CLI ready, but I'm hesitant to add it to the toolboxes with argparse, because I think the CLI is much more likely to be used if it is visually attractive.

I can't find any API of the typer.Option?

Yeah, I just look at the source code. That is the problem if the whole API isn't in the documentation ;)

But you only need these advanced arguments once, the users are most likely not going to need them.

Is there a particular reason for wrapping typer.Option with CLIOption?

The point was to move to another framework if it's better. Then the code is not full of typer imports. Indeed it currently just passes things directly to typer.Option, but the plan was to accept some arguments that are portable to other frameworks (e.g. argparse).

Also, your annotate_typer seems to not work when using typer.Option explicitly.

This could easily be fixed.

I can see you are updating the annotation in all cases, why is this required? :) Wouldn't it be fine if there are options not annotated? typer could handle this.

Because the annotate_typer function injects the help message for the parameter (coming from the docstring).

Oct 19 '23 10:10 pfebrer

tiangolo is very hard working, but he only has so much time, and FastAPI is very popular, his focus is likely there.

Yes, it also seems to me that he should delegate some mainteinance work to other people and accept more contributions from the community, because there are a bunch of useful PR that are just lying there without any comment on whether they will be accepted or not. It is a bit frustrating. But I think as it is now it is useful enough for the toolbox CLI, which needs something to make it more attractive to users.

Oct 19 '23 10:10 pfebrer

Ok, lets try things out. Would you mind having a go at translating them to typer?

Oct 31 '23 12:10 zerothi

The toolbox you mean?

Oct 31 '23 12:10 pfebrer

yes. :) Only sisl_toolbox stuff. :)

Oct 31 '23 12:10 zerothi

Ok :+1:

Oct 31 '23 12:10 pfebrer

I included the ts poisson CLI. I have nothing to test it with though :sweat_smile:

Here are the screenshots with the current state:

stoolbox

Screenshot from 2023-11-02 10-44-22

atom-plot

Screenshot from 2023-11-02 10-44-44

ts-fft

Screenshot from 2023-11-02 10-45-37

Nov 02 '23 09:11 pfebrer

I tried to be as faithful to the previous CLI as possible, but I had to change elec_V to a dict because List[Tuple[str, float]] is not allowed. So the way it would work now is:

stoolbox ts-fft -V "elec1: 4"

or

stoolbox ts-fft -V "{elec1: 4, elec2: 3}"

if there is more than one electrode.

Nov 02 '23 09:11 pfebrer

sisl sisl copied to clipboard

WIP: Converting `sisl_toolbox` to typer.

How it looks

How it works

sisl
sisl copied to clipboard