Near matches get lost with increasing values of max_l_dist
To reproduce, I am using fuzzysearch==0.7.3 and running:
text = "foo bar spam eggs "
query = "four"
With max_l_dist=2 I get one match:
fuzzysearch.find_near_matches(query, text, max_l_dist=2)
[Match(start=0, end=4, dist=2, matched='foo ')]
With max_l_dist=3 I get the previous match plus an additional one:
fuzzysearch.find_near_matches(query, text, max_l_dist=3)
[Match(start=0, end=4, dist=2, matched='foo '),
Match(start=6, end=7, dist=3, matched='r')]
But with max_l_dist=4 the previous matches are no longer returned:
fuzzysearch.find_near_matches(query, text, max_l_dist=4)
[Match(start=0, end=0, dist=4, matched=''),
Match(start=1, end=1, dist=4, matched=''),
Match(start=2, end=2, dist=4, matched=''),
Match(start=3, end=3, dist=4, matched=''),
Match(start=4, end=4, dist=4, matched=''),
Match(start=5, end=5, dist=4, matched=''),
Match(start=6, end=6, dist=4, matched=''),
Match(start=7, end=7, dist=4, matched=''),
Match(start=8, end=8, dist=4, matched=''),
Match(start=9, end=9, dist=4, matched=''),
Match(start=10, end=10, dist=4, matched=''),
Match(start=11, end=11, dist=4, matched=''),
Match(start=12, end=12, dist=4, matched=''),
Match(start=13, end=13, dist=4, matched=''),
Match(start=14, end=14, dist=4, matched=''),
Match(start=15, end=15, dist=4, matched=''),
Match(start=16, end=16, dist=4, matched=''),
Match(start=17, end=17, dist=4, matched=''),
Match(start=18, end=18, dist=4, matched='')]
Is this intended behaviour?
Hi @davidefiocco, apologies for the late response.
Yes, this is currently the intended behavior.
The reason is that once the maximum distance is equal to (or greater than) the length of what you're searching for (query in your example), even an empty string is a valid match.
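To make this concrete, here is a minimal sketch of a textbook Levenshtein distance. It is not fuzzysearch's internal implementation, and the function name `levenshtein` is only illustrative. It shows why the empty string counts as a match for "four" once max_l_dist reaches 4: deleting all four characters costs exactly 4 edits.

```python
def levenshtein(a: str, b: str) -> int:
    """Textbook dynamic-programming edit distance (insert, delete, substitute)."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or free if equal
            ))
        previous = current
    return previous[-1]


print(levenshtein("four", "foo "))  # distance of the first reported match
print(levenshtein("four", "r"))     # distance of the second reported match
print(levenshtein("four", ""))      # distance to the empty string
```

These distances are 2, 3 and 4 respectively, matching the `dist` values in the outputs above.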
However, looking at your example, I can see that this behavior isn't great: there are matches with a lower distance in the text, but they are no longer returned once the maximum distance is this large.
I'll think about how this can be improved without complicating things.
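In the meantime, one possible workaround is to cap the maximum distance at one less than the query length before searching, so that an empty string can never qualify as a match. This is only a sketch: the helper `cap_max_l_dist` is hypothetical and not part of fuzzysearch, and whether the cap suits a given use case is an assumption.

```python
def cap_max_l_dist(query: str, max_l_dist: int) -> int:
    """Return max_l_dist limited to len(query) - 1, and never below 0.

    Hypothetical helper, not part of fuzzysearch: it keeps the allowed
    distance strictly below the query length so empty matches are ruled out.
    """
    return max(0, min(max_l_dist, len(query) - 1))


# Intended usage with the example from this issue (requires fuzzysearch):
#   fuzzysearch.find_near_matches(query, text,
#                                 max_l_dist=cap_max_l_dist(query, 4))
print(cap_max_l_dist("four", 4))
```

For the query "four" this turns a requested max_l_dist of 4 into 3, which corresponds to the max_l_dist=3 call shown above.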