europarl-extract
Hungarian nonbreaker file for ixa-pipe
When I use the ixa-pipe-tokeniser, it throws an error about a missing Hungarian nonbreaker file. It appears that all the other nonbreaker files inside ixa-pipe-tok-1.8.4.jar are named with a hyphen, in the format en-nonbreaker.txt, while the Hungarian one is the only one named with an underscore: hu_nonbreaker.txt.
See the ixa-pipe-tok-1.8.4.jar in the attached .zip for the version that worked for me: ixa-pipe-tok-1.8.4.zip
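
For anyone who would rather patch their own copy of the jar, here is a rough sketch of one way to do it. It assumes the only problem is the underscore in the file name described above, which I have not confirmed against the tokeniser source. The function name `rename_jar_entry` and the output jar name in the usage note are made up for illustration. Since a jar is a zip archive, the script copies every entry into a new archive and renames any entry whose file name is hu_nonbreaker.txt to hu-nonbreaker.txt, wherever it sits inside the jar.

```python
import zipfile


def rename_jar_entry(src_jar, dst_jar,
                     old_name="hu_nonbreaker.txt",
                     new_name="hu-nonbreaker.txt"):
    """Copy src_jar to dst_jar, renaming entries whose file name is old_name.

    Returns a list of (old_path, new_path) pairs for the renamed entries.
    Hypothetical helper: the directory layout inside the jar is not assumed,
    only the file name itself is matched.
    """
    renamed = []
    with zipfile.ZipFile(src_jar) as src, \
            zipfile.ZipFile(dst_jar, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            # Split "some/dir/file.txt" into "some/dir" and "file.txt".
            directory, _, base = info.filename.rpartition("/")
            if base == old_name:
                target = f"{directory}/{new_name}" if directory else new_name
                renamed.append((info.filename, target))
            else:
                target = info.filename
            dst.writestr(target, data)
    return renamed
```

Usage would be something like `rename_jar_entry("ixa-pipe-tok-1.8.4.jar", "ixa-pipe-tok-1.8.4-fixed.jar")`. An empty return value means no entry with that name was found, in which case the output jar is just a re-compressed copy of the input.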