PolyFuzz
I don't understand what is returned
The docs provide the following example:
from polyfuzz import PolyFuzz
from_list = ["apple", "apples", "appl", "recal", "house", "similarity"]
to_list = ["apple", "apples", "mouse"]
model = PolyFuzz("TF-IDF").match(from_list, to_list)
which returns
>>> model.get_matches()
         From      To  Similarity
0       apple   apple    1.000000
1      apples  apples    1.000000
2        appl   apple    0.783751
3       recal    None    0.000000
4       house   mouse    0.587927
5  similarity    None    0.000000
I don't understand what is returned. If PolyFuzz does a pairwise comparison from from_list to to_list, it should return len(from_list) * len(to_list) rows, so 18 results in this case.
In the example, apple->apples, apple->mouse, apples->apple, apples->mouse and so on are missing.
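To make the two row counts concrete, here is a minimal sketch of what I expected versus what the output looks like. It does not use PolyFuzz at all: `ratio` is a hypothetical helper built on the standard library's `difflib.SequenceMatcher`, used only as a stand-in scorer, so its numbers will not match the TF-IDF similarities above. The "best match only" reading is my assumption, based on the six rows shown.

```python
from difflib import SequenceMatcher
from itertools import product

from_list = ["apple", "apples", "appl", "recal", "house", "similarity"]
to_list = ["apple", "apples", "mouse"]


def ratio(a: str, b: str) -> float:
    # Hypothetical stand-in scorer; NOT the TF-IDF similarity PolyFuzz computes.
    return SequenceMatcher(None, a, b).ratio()


# What I expected: one row per (from, to) pair.
all_pairs = [(f, t, ratio(f, t)) for f, t in product(from_list, to_list)]

# What the output resembles: one row per from_list entry,
# keeping only the highest-scoring to_list entry (assumption).
best_only = [
    max(((f, t, ratio(f, t)) for t in to_list), key=lambda row: row[2])
    for f in from_list
]

print(len(all_pairs))  # 18
print(len(best_only))  # 6
```

If the second reading is right, that would explain why pairs such as apple->apples and apple->mouse never show up: they were scored but lost to a better candidate.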