fuzzymatcher

Readme should explain meaning of scores

Open soliverc opened this issue 5 years ago • 4 comments

On what scale are the matches scored?

I noticed that with fuzzymatcher.fuzzy_left_join my best_match_score ranges from -0.7 to +1.15.

What is the highest possible score in this case? Can it go higher than 1.15?

Usually for fuzzy matching I would use a cutoff of around 0.8 or 0.9, on a scale of 0 to 1.

soliverc avatar Aug 28 '19 15:08 soliverc

I ran the package again today and scores range from -1.4 to +2.5. I can't figure it out!

soliverc avatar Aug 30 '19 15:08 soliverc

I agree, not sure how to read the scores.

ghost avatar Jan 15 '20 21:01 ghost

Just to reiterate: it would be great to get an idea of what the scores mean, so that results can be compared across matching algorithms and libraries. I find this library vastly quicker for large data sets, so it's a shame that this is one of its main drawbacks.

Kreisash avatar Aug 06 '20 11:08 Kreisash

How can we change the scorer to return the true probability of a match?

bobcolner avatar Jan 12 '23 01:01 bobcolner
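
Until the scale is documented, one possible workaround for the cutoff question above is to rescale the scores from a single run onto 0 to 1 before applying a threshold. The sketch below is only an illustration: rescale_scores is a hypothetical helper, not part of fuzzymatcher; it assumes that a higher best_match_score means a better match, and it treats unmatched rows as None. The result is a relative ranking within one run, not a match probability, and rescaled values from different runs are still not comparable with each other.

```python
def rescale_scores(scores):
    """Min-max rescale raw scores from one run onto [0, 1].

    Hypothetical helper, not part of fuzzymatcher.
    Assumes higher raw score = better match. None (no match) stays None.
    """
    present = [s for s in scores if s is not None]
    if not present:
        return list(scores)
    lo, hi = min(present), max(present)
    if hi == lo:
        # All scores identical: there is no spread to rescale.
        return [None if s is None else 1.0 for s in scores]
    span = hi - lo
    return [None if s is None else (s - lo) / span for s in scores]


# Example values taken from the range reported in this issue.
raw = [-0.7, 0.2, 1.15, None]
scaled = rescale_scores(raw)

# Apply a familiar 0-to-1 style cutoff to the rescaled values.
kept = [s for s in scaled if s is not None and s >= 0.8]
```

With a pandas DataFrame, the same idea could be applied to the best_match_score column of the joined result. Because the minimum and maximum change from run to run, a cutoff chosen this way should be checked against a sample of the matches each time.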