HFSTTokenizer chokes on input longer than 550(?) characters
The interactive shell (accessed using pexpect) appears to reject input lines longer than about 550 characters (I'm not really sure about this number). If a longer line is given, bell characters (ASCII code point 7, displayed as ^G in less) are printed to the logfile and pexpect hangs because it gets no output.
Submitted issue to HFST about this: https://github.com/hfst/hfst/issues/483.
The maximum buffer size appears to be 1024 bytes, so a workaround could check len(bytes(input_str, encoding='utf8')) < 1000 before sending a string to the interactive shell. Strings that fail the check would be processed with a regular subprocess instead. This check shouldn't be too expensive.
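A minimal sketch of that check, to make the idea concrete. The names `fits_interactive_buffer`, `tokenize`, `via_shell` and `via_subprocess` are hypothetical and are not taken from the actual code. The two callables stand in for the pexpect session and the one-off subprocess call, and the 1000-byte threshold is just the safety margin suggested above.

```python
# Hypothetical sketch: route long inputs away from the interactive shell.
SAFE_BYTE_LIMIT = 1000  # assumed margin below the apparent 1024-byte buffer


def fits_interactive_buffer(input_str, limit=SAFE_BYTE_LIMIT):
    """Return True if the UTF-8 encoding of input_str is shorter than limit."""
    # Measure bytes, not characters: non-ASCII characters take more than
    # one byte each in UTF-8, so len(input_str) would undercount.
    return len(input_str.encode('utf8')) < limit


def tokenize(input_str, via_shell, via_subprocess):
    """Send input_str to the interactive shell only if it is short enough.

    via_shell and via_subprocess are caller-supplied callables (placeholders
    for the pexpect session and a regular subprocess call, respectively).
    """
    if fits_interactive_buffer(input_str):
        return via_shell(input_str)
    return via_subprocess(input_str)
```

Because the limit is in bytes, text in a script that needs two bytes per character in UTF-8 (Cyrillic, for example) reaches it after about 500 characters.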
Workaround implemented in 765a2afb7d95d83b8bb179efe678fbd68e0d90fa.