udar
udar copied to clipboard
Implement python interface to hfst-tokenize instead of subprocess
http://giellatekno.uit.no/doc/ling/preprocessor.html
Currently, this is implemented using a subprocess because it does not appear that this has been implemented in the python API for hfst. I've added an issue on hfst's github page to see what it would take to get it available inside python.
Until this functionality is added, the command-line version of hfst must be installed for tokenization to work.