Implement python interface to hfst-tokenize instead of subprocess

Open · reynoldsnlp opened this issue 6 years ago · 1 comment

http://giellatekno.uit.no/doc/ling/preprocessor.html

reynoldsnlp, Mar 06 '19 18:03

Currently, tokenization is implemented with a subprocess call to hfst-tokenize, because this functionality does not appear to be exposed in HFST's Python API. I've opened an issue on HFST's GitHub page to find out what it would take to make it available from Python.

Until this functionality is added to the Python API, the command-line version of HFST must be installed for tokenization to work.
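
For reference, the subprocess approach looks roughly like the sketch below. This is not udar's actual code: the function name `hfst_tokenize`, its parameters, and the error message are made up for illustration, and it assumes only that hfst-tokenize takes the path to a tokenizer file as its argument and reads the text to tokenize from stdin. No output-format flags are passed, so the result is whatever the tool prints by default.

```python
import shutil
import subprocess


def hfst_tokenize(text, tokenizer_path, executable="hfst-tokenize"):
    """Pipe `text` through the tokenizer CLI and return its raw stdout.

    Hypothetical helper: `tokenizer_path` is the path to a compiled
    tokenizer file, and `executable` is the command to run.
    """
    # Fail early with a readable message if the CLI is not installed.
    if shutil.which(executable) is None:
        raise RuntimeError(
            f"{executable} not found on PATH; "
            "the HFST command-line tools must be installed."
        )
    completed = subprocess.run(
        [executable, tokenizer_path],
        input=text,            # text to tokenize goes to stdin
        capture_output=True,   # collect stdout and stderr
        encoding="utf-8",      # decode and encode as UTF-8, not the locale default
        check=True,            # raise CalledProcessError on a non-zero exit
    )
    return completed.stdout
```

Every call spawns a new process and reloads the tokenizer file, which is part of why an in-process Python interface would be preferable.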

reynoldsnlp, May 29 '19 03:05