customized-symspell
lookupCompound() doesn't allow looking up 2 correctly spelled terms separated only by a missing space
Precondition: SpellCheckSettings is initialized with maxEditDistance > 0.
I want to handle the missing-space corner case separately, but only between correctly spelled words (maxEditDistance = 0 for each word on its own). This is impossible with a SymSpell instance that was created from SpellCheckSettings with maxEditDistance > 0.
To cover the missing-space case, lookupCompound() calls lookupSplitWords(), which splits a word into part1 and part2 and calls lookup() for each part. lookup() contains the following code:
if (maxEditDistance <= 0) {
maxEditDistance = spellCheckSettings.getMaxEditDistance();
}
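To make the effect of this fallback concrete, here is a minimal standalone sketch. The class FallbackDemo and the method effectiveMaxEditDistance are hypothetical names used only for illustration; they are not part of the library. Assuming the check works as quoted above, a caller that passes 0 cannot get exact-match behavior, because 0 is replaced by the configured value.

```java
// Hypothetical stand-in that mirrors the fallback quoted above.
// It is NOT the library's class; it only reproduces the if-statement.
class FallbackDemo {

    // Returns the edit distance that would effectively be used,
    // given the value passed by the caller and the configured default.
    static int effectiveMaxEditDistance(int requested, int configured) {
        int maxEditDistance = requested;
        if (maxEditDistance <= 0) {
            // A request for 0 (exact match) is overwritten here.
            maxEditDistance = configured;
        }
        return maxEditDistance;
    }

    public static void main(String[] args) {
        int configured = 1; // assumed value of spellCheckSettings.getMaxEditDistance()

        // Asking for exact matches only (0) silently becomes 1.
        System.out.println(effectiveMaxEditDistance(0, configured)); // prints 1

        // A positive request is kept as-is.
        System.out.println(effectiveMaxEditDistance(2, configured)); // prints 2
    }
}
```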
Scenario: the query is {applewatch}. I want to look up a missing space between correctly spelled words only, which means a total maxEditDistance of 1 (the missing space itself).
With the current implementation, SymSpell looks for the missing space between 2 words while also allowing an additional edit distance of 1 in each word, and there is no way to prevent this. Total maxEditDistance:
- lookup(part1, maxEditDistance) = 1
- lookup(part2, maxEditDistance) = 1
- lookupCompound(part1 + part2) = 1
- Total = 3
Depending on the dictionary, the following results are possible:
- {apple watch}, editDistance = 1
- {apple patch}, editDistance = 2 (if watch is not present in the dictionary)
- {apply patch}, editDistance = 3 (if neither apple nor watch is present in the dictionary)
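For comparison, the behavior I am asking for can be sketched without SymSpell at all. The class ExactSplitDemo and the method splitExact are hypothetical and only illustrate the expected result; they say nothing about how the library should implement it. The sketch tries every split position and accepts a split only when both parts are exact dictionary entries, so the only edit ever introduced is the inserted space.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of the desired behavior, independent of SymSpell.
class ExactSplitDemo {

    // Returns every "part1 part2" split where both parts are exact dictionary words.
    static List<String> splitExact(String input, Set<String> dictionary) {
        List<String> results = new ArrayList<>();
        for (int i = 1; i < input.length(); i++) {
            String part1 = input.substring(0, i);
            String part2 = input.substring(i);
            // Edit distance 0 for each part: plain membership test, no fuzzy lookup.
            if (dictionary.contains(part1) && dictionary.contains(part2)) {
                results.add(part1 + " " + part2);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        Set<String> full = Set.of("apple", "watch", "apply", "patch");
        System.out.println(splitExact("applewatch", full)); // prints [apple watch]

        // Without "watch" there is no exact split, so nothing is suggested
        // instead of falling back to {apple patch}.
        Set<String> noWatch = Set.of("apple", "apply", "patch");
        System.out.println(splitExact("applewatch", noWatch)); // prints []
    }
}
```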