pyspellchecker
load_words is not prioritized
It looks like words added with load_words are not prioritized when choosing a correction.
from spellchecker import SpellChecker
known_words = ['covid', 'Covid19']
spell = SpellChecker(language='en')
spell.word_frequency.load_words(known_words)
word = 'coved'
# unknown() expects an iterable of words, so wrap the single word in a list
misspelled = spell.unknown([word])
print(spell.correction(word))
The output of this is "loved".
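You are correct. Candidates are "prioritized" by how many times each word appears in the word frequency list, since more common words are more likely to be the intended word (hence the name "frequency"). load_words counts each word once per occurrence in the list you pass, so a word loaded a single time ends up with a very low count compared to common English words. You can boost the newer words by doing something like this: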
from spellchecker import SpellChecker
known_words = ['covid', 'Covid19'] * 1000
spell = SpellChecker(language='en')
spell.word_frequency.load_words(known_words)
Or you could set the counts directly. Note that load_dictionary expects the path to a JSON file; for an in-memory dict, use load_json:
from spellchecker import SpellChecker
known_words = {'covid': 1000, 'Covid19': 10000}
spell = SpellChecker(language='en')
spell.word_frequency.load_json(known_words)
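To see why a word loaded once loses to "loved", here is a toy model of frequency-based ranking. It is only a sketch and not pyspellchecker's actual code: pick_correction, the candidate list, and all counts are made up for illustration, with a plain collections.Counter standing in for the word frequency list.

```python
from collections import Counter


def pick_correction(candidates, frequency):
    """Return the candidate with the highest count (hypothetical helper)."""
    return max(candidates, key=lambda w: frequency[w])


# Made-up counts, not taken from the real English dictionary.
frequency = Counter({"loved": 5000, "moved": 4000, "cover": 3000})

# Words one edit away from 'coved', including the newly added 'covid'.
candidates = ["loved", "moved", "cover", "covid"]

# Adding a word once, as a single-item word list would: count becomes 1.
frequency.update(["covid"])
once = pick_correction(candidates, frequency)

# Adding an explicit large count, as a word-to-count mapping would.
frequency.update({"covid": 10000})
boosted = pick_correction(candidates, frequency)

print(once, boosted)
```

With a count of 1 the new word cannot outrank common words at the same edit distance; once its count exceeds theirs, it wins.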