German umlaut characters break spell checking

Open jnweiger opened this issue 9 years ago • 0 comments

Words with German umlaut characters, e.g. 'Nürnberg' (internally seen as "N\xfcrnberg"), do not make it into the word_set for h.check_word().

The regexp '([a-z_-]{3,})' currently filters these words out; without that filter they would cause encode/decode errors. We should try to guess the encoding and, if successful, convert to UTF-8 for hunspell.
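A rough sketch of what that could look like in Python 3. It is not the project's code. The names `CANDIDATE_ENCODINGS`, `guess_decode` and `extract_words` are made up for illustration, and the list of encodings to try is an assumption. The widened regexp also accepts uppercase and non-ASCII letters, which the current one does not.

```python
import re

# Encodings to try, in order. This list is an assumption; UTF-8 goes first
# because it rejects most byte strings that are really Latin-1/cp1252.
CANDIDATE_ENCODINGS = ("utf-8", "cp1252")

# Unicode-aware variant of '([a-z_-]{3,})': any letter (umlauts included),
# underscore or hyphen, at least three characters long.
WORD_RE = re.compile(r"(?:[^\W\d]|-){3,}")


def guess_decode(raw):
    """Return raw bytes decoded to str, or None if no candidate encoding fits."""
    for encoding in CANDIDATE_ENCODINGS:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    return None


def extract_words(raw):
    """Collect candidate words from raw bytes; empty set if decoding fails."""
    text = guess_decode(raw)
    if text is None:
        return set()
    return set(WORD_RE.findall(text))


if __name__ == "__main__":
    words = extract_words(b"N\xfcrnberg ist sch\xf6n")
    print(sorted(words))
    # Each word can be re-encoded as UTF-8 bytes if the spell checker needs bytes.
    print([w.encode("utf-8") for w in sorted(words)])
```

Trying UTF-8 first matters here: a lone `\xfc` byte is not valid UTF-8, so the decode fails and the fallback picks it up as 'ü'. If neither encoding fits, the input is skipped rather than raising.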

Apr 18 '16 19:04 jnweiger