jsonargparse icon indicating copy to clipboard operation
jsonargparse copied to clipboard

Path_fr does not use the default value during validation

Open indigoviolet opened this issue 2 years ago • 6 comments

🐛 Bug report

If I specify a default value for a Path_fr parameter, the validation ignores it. This is pretty weird because clearly the value is present and used if I don't turn on this path_type validation.

To reproduce

from jsonargparse import CLI
from pathlib import Path

from jsonargparse.typing import Path_fr


def main(path: Path_fr = Path("test.py")):
    print(path)


if __name__ == "__main__":
    CLI(main, as_positional=False)
❯ python test.py --path test.py                               
test.py

❯ python test.py --help        
usage: test.py [-h] [--config CONFIG] [--print_config[=flags]] [--path PATH]

<function main at 0x7fc4001080d0>

optional arguments:
  -h, --help            Show this help message and exit.
  --config CONFIG       Path to a configuration file.
  --print_config[=flags]
                        Print the configuration after applying all other arguments and exit. The
                        optional flags customizes the output and are one or more keywords separated by
                        comma. The supported flags are: comments, skip_default, skip_null.
  --path PATH           (type: Path_fr, default: test.py)

❯ python test.py               
usage: test.py [-h] [--config CONFIG] [--print_config[=flags]] [--path PATH]
test.py: error: Configuration check failed :: Parser key "path":
  Not of type Path_fr: Expected path to be a string or a Path object.

Expected behavior

The validation should happen after applying the defaults.

Environment

  • jsonargparse version: 4.19.0
  • Python version (e.g., 3.9): 3.9.15
  • How jsonargparse was installed (e.g. pip install jsonargparse[all]): pip install jsonargparse
  • OS (e.g., Linux): Linux

indigoviolet avatar Feb 08 '23 07:02 indigoviolet

Also, bonus report: pyright reports that Path_fr etc are unknown import symbols in jsonargparse.typing

indigoviolet avatar Feb 08 '23 07:02 indigoviolet

Thank you for reporting!

This behavior is as expected and it is not a bug. Path_fr is not a subclass of pathlib.Path, thus path: Path_fr = Path("test.py") is wrong, since the type hint does not match the type of the default. Maybe the error message can be improved so that it is clear that Path refers to a class from jsonargparse and not from pathlib. Would you like to create a pull request to improve the error message?

It is unfortunate that pyright gets confused with Path_fr. It is defined (typing.py#L355) and it is even listed in __all__ (typing.py#L27).

path: Path_fr = Path_fr("test.py") would be correct, however, this forces the file test.py to exist in the current directory when that module is loaded. Path_fr as type by itself is intended more for required parameters (without a default), so that the parser makes sure that a file exists and fails early when the user specifies a non-existent file. If you truly want a default, then it could be path: Union[Path_fr, pathlib.Path] = pathlib.Path("test.py").

mauvilsa avatar Feb 09 '23 09:02 mauvilsa

Thanks for the explanation.

My counterargument would be that Path_fr and other path_type should consider Paths as reasonable values (I'm agnostic to whether that is achieved by being a subclass of pathlib.Path or not); otherwise it is a source of confusion and sharp edges.

Compare for example, the os.PathLike protocol that is used by os and other stdlib functions to accept str, bytes as well as Path.

As to the other point, I feel that whatever applies to user-supplied values should also work for defaults; including checking that the default file exists and is readable.

The other confusing thing about the above messages is that jsonargparse claims that "test.py" is the default value for --path, but fails upon running it.

Of course, I can continue to ignore this feature if you feel strongly otherwise, but I do think it will integrate better and be more intuitive if it is compatible with Path/default values.

indigoviolet avatar Feb 09 '23 23:02 indigoviolet

There are several reasons why it shouldn't be like this, maybe just not explained well enough:

  • In python when a class is used as type hint, the default value must be an instance of that class or a subclass. jsonargparse tries to guide people to write good code, so it will not implement behaviors that contradict correct typing.

  • It is impossible to make pathlib.Path a subclass of the jsonargparse path types. Thus, path: Path_fr = Path("test.py") will never be accepted by mypy/pyright, even if the unknown import symbol is fixed.

  • The purpose of Path_fr is to guarantee that it points to a file that exists and is readable. If the default is pathlib.Path then there is no such guarantee. It is misleading to do this.

Note that the class/subclass requirement applies to os.PathLike and str. It is okay to have path: os.PathLike = pathlib.Path(".") and not okay path: os.PathLike = ".". Furthermore, in typeshed you can see that for stdlib functions, the type hints are like str | PathLike. Search for StrPath in the typeshed github. Which is similar to why I suggested to use as type Union[Path_fr, pathlib.Path] to make the typing correct.

Also note that the jsonargparse types implement the os.PathLike protocol. Thus, path: os.PathLike = Path_drw(".") would be correct analogous to path: os.PathLike = pathlib.Path(".").

As to the other point, I feel that whatever applies to user-supplied values should also work for defaults; including checking that the default file exists and is readable.

The other confusing thing about the above messages is that jsonargparse claims that "test.py" is the default value for --path, but fails upon running it.

What could be done is to also validate the defaults. If this is done, then the --help would fail because of the invalid default.

mauvilsa avatar Feb 10 '23 18:02 mauvilsa

I agree with your point about not breaking typing, and it makes sense from that perspective that Path_fr cannot receive Path values.

I would like the defaults to be validated, that would both point me towards setting default values of type Path_fr and be consistent with the command line arguments.

What had confused me about the Path_fr parameter accepting Path arguments was that Path_fr accepts str from the command line, so I was expecting a similar casting behavior from the defaults -- but I agree that the typing principle supercedes that.

Thank you for the detailed explanation.

indigoviolet avatar Feb 10 '23 19:02 indigoviolet

I would like the defaults to be validated, that would both point me towards setting default values of type Path_fr and be consistent with the command line arguments.

One issue with validating defaults is that parsers could be created from signatures of third party code. If the third party code has an invalid default, then there shouldn't be a blocking failure since it might not be possible to change that code. Maybe when a default does not agree with the type, there would be a warning. Also this warning would be intended for CLI developers, not for the CLI end users. So not yet sure how to do this.

mauvilsa avatar Feb 21 '23 22:02 mauvilsa