scspell: Fix tokenising when using more than just a-zA-Z

Open robotdana opened this issue 6 years ago • 2 comments

Previously: Händler would be tokenized as ndler or ändler, depending on the Python version, rather than the expected händler.

Solution: use regex rather than re. This gives us the ability to use Unicode character classes such as [[:upper:]] and [[:lower:]].
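
To illustrate the problem, here is a minimal sketch using only the standard library. It is not the code from this PR: the names ASCII_WORD, UNICODE_WORD and tokenize are made up for the example, and it swaps in the stdlib pattern [^\W\d_] (which on Python 3 roughly means "any Unicode letter") in place of the regex module's [[:upper:]] and [[:lower:]] classes, so that it runs without extra dependencies.

```python
import re

# ASCII-only word pattern, similar in spirit to a-zA-Z tokenising.
ASCII_WORD = re.compile(r"[a-zA-Z]+")

# Stdlib stand-in for a Unicode-aware letter class (an assumption for this
# sketch, not the PR's pattern): word characters minus digits and underscore.
UNICODE_WORD = re.compile(r"[^\W\d_]+")


def tokenize(text, pattern):
    """Return the lowercased tokens that `pattern` finds in `text`."""
    return [token.lower() for token in pattern.findall(text)]


word = "H\u00e4ndler"  # "Händler" with a precomposed ä

ascii_tokens = tokenize(word, ASCII_WORD)      # the word is split at the ä
unicode_tokens = tokenize(word, UNICODE_WORD)  # the word stays in one piece
```

With the ASCII-only pattern the ä acts as a separator, so the word falls apart; with a Unicode-aware letter class it stays a single token.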

Fixes #35

I'm usually a Ruby developer, not a Python developer. I don't know how to get the regex library working on 2.7, or how to compare the test strings in a Unicode-aware way (they're different on my Mac vs. on Travis: if one passes, the other fails).
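
The thread doesn't say why the strings differ between machines. If the cause were differing Unicode normalization forms (purely an assumption here: a precomposed ä on one system, a + combining diaeresis on the other), one way to compare them would be to normalize both sides first. A rough sketch, with same_text as a hypothetical helper name:

```python
import unicodedata


def same_text(a, b):
    """Compare two strings after normalizing both to NFC (composed form)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


composed = "H\u00e4ndler"     # ä as a single code point
decomposed = "Ha\u0308ndler"  # a followed by a combining diaeresis

naive_equal = composed == decomposed                # plain comparison
normalized_equal = same_text(composed, decomposed)  # comparison after NFC
```

If the mismatch comes from somewhere else (for example byte strings vs. unicode strings on 2.7), this won't help.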

But it mostly works

Nov 30 '18 02:11 robotdana

Thanks! I haven't tried the regex module before. I'll take a look when I have more time.

Dec 23 '18 18:12 myint

If you're interested, I took the really long way round to fixing this by creating my own spell checker: https://github.com/robotdana/spellr

Sep 22 '19 09:09 robotdana
