ekphrasis
Spelling correction mostly does not work
I came to this project for spelling correction in Twitter text, but it doesn't work most of the time.
- Spell correction seems to work only when `annotate` is set as in the example. Take the same example, set `annotate={}`, and spell correction is gone:
i saw the new john doe movie and it suuuuucks ! ! ! waisted <money> . . . bad movies <annoyed>
If I restore `annotate={"hashtag", "...}`, it corrects suuuuucks to sucks.
I'm not sure what the connection between annotations and spell correction is.
- Spelling correction doesn't work in general. Going back to your pipeline example, change the first input sentence to inject some spelling errors:
CANT WAIT for the neww seaason of #TwinPeaks
Run it, and you get:
cant wait for the neww seaason of twin peaks
i.e. no spell correction. The `spell_correct_elong` option doesn't seem to make a difference.
Yet, if I run:
from ekphrasis.classes.spellcorrect import SpellCorrector
sp = SpellCorrector(corpus="english")
print([sp.correct(x) for x in "neww seaason".split()])
It corrects: ['new', 'season']
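Since calling the corrector directly works, a possible stopgap is to apply it to the pipeline's output tokens yourself. The following is only a minimal sketch, not part of ekphrasis: `correct_tokens` is a hypothetical helper name, and it assumes you already have a list of tokens and a per-word callable such as `sp.correct` from the snippet above. It skips anything that is not purely alphabetic, so annotation tags like `<money>` and punctuation pass through untouched.

```python
from typing import Callable, Iterable, List


def correct_tokens(tokens: Iterable[str],
                   correct: Callable[[str], str]) -> List[str]:
    """Apply a per-word corrector to alphabetic tokens only.

    `correct` is any callable mapping one word to its corrected form
    (assumed here to be something like SpellCorrector.correct).
    Tags such as <money>, hashtags, and punctuation are left as-is.
    """
    corrected = []
    for token in tokens:
        if token.isalpha():
            corrected.append(correct(token))
        else:
            corrected.append(token)
    return corrected
```

With the `sp` object created above, this would be called as `correct_tokens(tokens, sp.correct)`, where `tokens` is the token list produced by the pipeline.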