fuzzymatcher
Readme should explain meaning of scores
On what scale are the matches scored?
I noticed with fuzzymatcher.fuzzy_left_join that my best_match_score ranges from -0.7 to +1.15.
What is the highest possible score in this case? Can it go higher than 1.15?
Usually for fuzzy matching I would use a cutoff of around 0.8 or 0.9, on a scale of 0 to 1.
I ran the package again today and scores range from -1.4 to +2.5. I can't figure it out!
I agree, not sure how to read the scores.
Just to reiterate: it would be great to get an idea of what the scores mean, so that a comparison could be made between various matching algorithms/libraries. I find this library vastly quicker for large data sets, so it's a shame that this is one of its main drawbacks.
How can we change the scorer to return the true probability of a match?
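Until the scale is documented, one possible stopgap for applying a 0-to-1 style cutoff is to min-max rescale the scores within a single run. This is only a sketch: `rescale_scores` is a hypothetical helper, not part of fuzzymatcher, and the `raw` values are made up to span the range reported above. The result is a relative ranking, not a probability of a match, and because the raw range changes between runs, rescaled values from different runs are not comparable.

```python
def rescale_scores(scores):
    """Min-max rescale raw scores to [0, 1] within a single run.

    Hypothetical helper for illustration; not a fuzzymatcher function.
    """
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores identical: there is no spread to rescale, so return 1.0 for each
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


# Made-up best_match_score values spanning the range reported in this thread
raw = [-0.7, 0.0, 0.4, 1.15]
scaled = rescale_scores(raw)

# Keep only the rows whose rescaled score clears a 0.8 cutoff
kept = [r for r, s in zip(raw, scaled) if s >= 0.8]
```

With a DataFrame, the same arithmetic could be applied to the best_match_score column directly. Note that the top match in a run always maps to 1.0 under this scheme, even if it is a poor match in absolute terms, so the cutoff should still be checked against a sample of the results by eye.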