
Wikipedia tool issue with UTF-8

Open marcoagpinto opened this issue 9 months ago • 0 comments

Heya,

I know it is 5am and everyone is sleeping, but I have been working on LanguageTool.

I ran into an issue when running: java -Dfile.encoding=UTF-8 -Xmx4500M -jar languagetool-wikipedia.jar check-data -l pt-PT -r PÔR_FIM_À_VIDA -f pt-BR.txt --max-sentences 900000 --context-size 100 >0.txt

It seems the accents in the rule ID get all messed up:

WARNING: Could not find rule 'PÔR_FIM_À_VIDA'
Only these rules are enabled: [PÔR_FIM_À_VIDA]
Working on: pt-BR.txt
Sentence limit: 900000
Context size: 100
Error limit: no limit
Skip: 0

I have been using: LanguageTool-wikipedia-20240426-snapshot

Is it a known issue? Maybe just a matter of updating to a more recent version?

Thanks!

EDIT: Ahhhh… I have Windows 11.
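
In case it helps narrow this down, here is a minimal diagnostic sketch, not part of LanguageTool. The class name ArgEncodingCheck and its helper toCodePoints are hypothetical. It prints the encodings the JVM reports and the code points of each command-line argument, so it becomes visible whether Ô and À arrive intact. One possible explanation (an assumption, not confirmed here) is that on Windows the JVM decodes command-line arguments with the system code page rather than with file.encoding, in which case -Dfile.encoding=UTF-8 alone would not help. The sun.jnu.encoding property is an internal one reported by OpenJDK/HotSpot builds and may be null on other JVMs.

```java
import java.nio.charset.Charset;

public class ArgEncodingCheck {
    // Render each code point of s as U+XXXX so mangled accents become visible.
    static String toCodePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Encodings as the running JVM reports them.
        System.out.println("file.encoding    = " + System.getProperty("file.encoding"));
        // Internal property (assumption: present on OpenJDK/HotSpot only).
        System.out.println("sun.jnu.encoding = " + System.getProperty("sun.jnu.encoding"));
        System.out.println("default charset  = " + Charset.defaultCharset());
        // Code points of every argument exactly as the JVM received it.
        for (String arg : args) {
            System.out.println(toCodePoints(arg));
        }
    }
}
```

Running it as java -Dfile.encoding=UTF-8 ArgEncodingCheck PÔR_FIM_À_VIDA from the same console should show U+00D4 for Ô and U+00C0 for À if the argument survives. Anything else would suggest the rule ID is already mangled before LanguageTool sees it.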

marcoagpinto, May 04 '24 05:05