sudachi.rs

SudachiPy: No default "system dictionary path" for the Build User Dictionary subcommand

Open sorami opened this issue 3 years ago • 1 comment

The help text for the ubuild subcommand says that -s takes the system dictionary path and defaults to the system core dictionary path:

$ sudachipy ubuild --help
usage: sudachipy ubuild [-h] [-d string] [-o file] [-s file] file [file ...]

Build User Dictionary

positional arguments:
  file        source files with CSV format (one or more)

optional arguments:
  -h, --help  show this help message and exit
  -d string   description comment to be embedded on dictionary
  -o file     output file (default: user.dic)
  -s file     system dictionary path (default: system core dictionary path)

However, omitting the system dictionary path raises an error:

$ sudachipy ubuild user.txt
Traceback (most recent call last):
  File "/Users/shisamoto/.pyenv/versions/3.9.7/bin/sudachipy", line 8, in <module>
    sys.exit(main())
  File "/Users/shisamoto/.pyenv/versions/3.9.7/lib/python3.9/site-packages/sudachipy/command_line.py", line 270, in main
    args.handler(args, args.print_usage)
  File "/Users/shisamoto/.pyenv/versions/3.9.7/lib/python3.9/site-packages/sudachipy/command_line.py", line 169, in _command_user_build
    system = Path(args.system_dic)
  File "/Users/shisamoto/.pyenv/versions/3.9.7/lib/python3.9/pathlib.py", line 1082, in __new__
    self = cls._from_parts(args, init=False)
  File "/Users/shisamoto/.pyenv/versions/3.9.7/lib/python3.9/pathlib.py", line 707, in _from_parts
    drv, root, parts = self._parse_args(args)
  File "/Users/shisamoto/.pyenv/versions/3.9.7/lib/python3.9/pathlib.py", line 691, in _parse_args
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not NoneType

It works fine if you specify the path, e.g., sudachipy ubuild user.txt -s system_core.dic.
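
The failure can be reproduced without SudachiPy. Below is a minimal sketch, assuming only that the -s option is declared without a default value (as the traceback suggests). The parser and the system_path helper are stand-ins for illustration, not the actual SudachiPy code:

```python
import argparse
from pathlib import Path

# Stand-in for the ubuild subparser: -s has no default, so it parses to None
# when the option is omitted.
parser = argparse.ArgumentParser(prog="ubuild-sketch")
parser.add_argument("-s", dest="system_dic", metavar="file")
parser.add_argument("in_files", metavar="file", nargs="+")


def system_path(argv):
    """Parse argv and convert the -s value to a Path (hypothetical helper)."""
    args = parser.parse_args(argv)
    # Path(None) raises TypeError, which is the error seen in the traceback.
    return Path(args.system_dic)


try:
    system_path(["user.txt"])
except TypeError as err:
    print("TypeError:", err)

# With -s given explicitly, the conversion succeeds.
print(system_path(["user.txt", "-s", "system_core.dic"]))
```

The exact wording of the TypeError message differs between Python versions, but the exception type is the same.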


The pre-Rust version worked because it had a function that fell back to the system dictionary path from the default config file:

https://github.com/WorksApplications/SudachiPy/blob/b244110e1f497763879fcfe9de48f5af3544897d/sudachipy/command_line.py#L80

def _system_dic_checker(args, print_usage):
    if not args.system_dic:
        settings.set_up()
        args.system_dic = settings.system_dict_path()
    if not os.path.exists(args.system_dic):
        print_usage()
        print('{}: error: {} doesn\'t exist'.format(__name__, args.system_dic), file=sys.stderr)
        exit(1)

...

def _command_user_build(args, print_usage):
    _system_dic_checker(args, print_usage)
...

I am not sure whether a similar procedure is possible in the Rust version without too much modification.


Maybe the easiest way to handle this issue is simply to make the -s option required:

https://github.com/WorksApplications/sudachi.rs/blob/2e6157455210271bb6eb30f1566db4e60bfcf964/python/py_src/sudachipy/command_line.py#L258

parser_ubd.add_argument('-s', dest='system_dic', metavar='file', required=True,
                        help='system dictionary path')

(Change to required=True, remove (default: system core dictionary path) from the help text)
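
To illustrate the effect of this change, here is a self-contained sketch using only the standard argparse module. The parser layout imitates the ubuild subcommand but is an assumption for illustration, not the actual sudachipy command_line.py:

```python
import argparse

# Hypothetical stand-in for the sudachipy CLI with a single ubuild subcommand.
parser = argparse.ArgumentParser(prog="sudachipy-sketch")
subparsers = parser.add_subparsers(dest="command")

parser_ubd = subparsers.add_parser("ubuild", help="Build User Dictionary")
parser_ubd.add_argument("-s", dest="system_dic", metavar="file", required=True,
                        help="system dictionary path")
parser_ubd.add_argument("in_files", metavar="file", nargs="+",
                        help="source files with CSV format (one or more)")

# Omitting -s now fails at argument parsing: argparse prints a usage message
# to stderr and exits with status 2, instead of failing later with a TypeError.
try:
    parser.parse_args(["ubuild", "user.txt"])
except SystemExit as exc:
    print("exit status:", exc.code)

# Supplying -s parses normally.
args = parser.parse_args(["ubuild", "user.txt", "-s", "system_core.dic"])
print(args.system_dic, args.in_files)
```

With required=True, the user gets a usage error naming the missing option rather than a traceback from pathlib.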

sorami, Feb 18 '22 16:02

Making the system dictionary required is probably the best way to solve this problem. In my opinion, using some default dictionary as an anchor for building a user dictionary can result in surprising behavior.

eiennohito, Feb 19 '22 01:02