
Norvig spell corrector

Open dselivanov opened this issue 8 years ago • 1 comments

Taken from #73:

dselivanov avatar Sep 26 '16 11:09 dselivanov

As a side note / hint on spell checking: I just stumbled over the ropensci/hunspell package. I have not dug into the details of the implementation, but the basic idea is that it checks which affixes and word stems are allowed in a given language and checks a text against the entries in a dictionary (which can be taken, e.g., from LibreOffice) - more details in the package docs, e.g., in hunspell.R. Hence, if my understanding is correct, the hunspell approach is less probabilistic than Norvig's (which makes it easy to use your own training data), but it might still be useful depending on the task, since existing dictionaries can be used directly. It might be worth comparing the quality of results between the two approaches (if anyone finds the time...).
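To make the contrast concrete, here is a rough Python sketch of the two ideas, not the text2vec or hunspell implementation. The Norvig-style part ranks candidates within one edit of the input by their frequency in a training text; the dictionary part is only a plain set lookup standing in for hunspell, which additionally applies affix rules. `CORPUS`, `DICTIONARY`, `norvig_correct`, and `dictionary_check` are all made-up names and toy data for illustration.

```python
import re
from collections import Counter

# Toy training text (assumption); in practice this would be a large corpus
# from the target domain, which is what makes the approach probabilistic.
CORPUS = "the spelling of the word is checked and the spelling is corrected"
WORD_COUNTS = Counter(re.findall(r"[a-z]+", CORPUS.lower()))

# Stand-in for a fixed dictionary (assumption). A real hunspell dictionary
# also encodes affix rules, which this plain set does not model.
DICTIONARY = {"the", "spelling", "of", "word", "is", "checked", "and", "corrected"}

LETTERS = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """Return all strings that are one edit away from `word`."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    transposes = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    replaces = [l + c + r[1:] for l, r in splits if r for c in LETTERS]
    inserts = [l + c + r for l, r in splits for c in LETTERS]
    return set(deletes + transposes + replaces + inserts)


def known(words):
    """Keep only the words that were seen in the training text."""
    return {w for w in words if w in WORD_COUNTS}


def norvig_correct(word):
    """Pick the most frequent known candidate, else return the word unchanged."""
    candidates = known([word]) or known(edits1(word)) or {word}
    return max(candidates, key=lambda w: WORD_COUNTS[w])


def dictionary_check(word):
    """Dictionary-style check: a word is either listed or it is not."""
    return word in DICTIONARY
```

With this toy data, `norvig_correct("speling")` proposes a correction learned from the corpus, while `dictionary_check("speling")` only reports that the word is not listed. A fair comparison of the two approaches would need a real corpus and a real dictionary.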

manuelbickel avatar Nov 12 '18 09:11 manuelbickel