company2vec

How does vector embedding perform compared to fuzzy string matching?

Open lejarx opened this issue 4 years ago • 1 comments


I’m exploring string matching techniques such as Levenshtein distance and Jaro-Winkler, among others.

Vector embedding seems similar in that it also addresses the problem of string similarity.

But is it the case that I would need access to good pre-trained models?

For example, my use case is identifying similar pharmacy names and their addresses.

I may have two different files that contain duplicate pharmacy + address combinations, but the pharmacy names and their addresses may have been entered incorrectly.
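To make the fuzzy-matching side of the question concrete, here is a minimal sketch of that baseline using a plain Levenshtein distance and only the Python standard library. It is not how company2vec works. The function names, the 0.85 threshold, and the idea of comparing each record as one "name, address" string are all assumptions for illustration, and the nested loop is only practical for small files.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def normalise(text: str) -> str:
    """Lower-case and collapse whitespace before comparing."""
    return " ".join(text.lower().split())


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; 1.0 means identical after normalisation."""
    a, b = normalise(a), normalise(b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def find_duplicates(records_a, records_b, threshold=0.85):
    """Compare every pair across the two files (hypothetical helper).

    The threshold is an arbitrary starting point, not a recommended value.
    """
    matches = []
    for rec_a in records_a:
        for rec_b in records_b:
            score = similarity(rec_a, rec_b)
            if score >= threshold:
                matches.append((rec_a, rec_b, score))
    return matches


# Made-up sample records, one "name, address" string per pharmacy.
file_a = ["Green Cross Pharmacy, 12 Main Street"]
file_b = ["Green Cros Pharmacy, 12 Main Stret", "Riverside Chemist, 4 Mill Lane"]
duplicates = find_duplicates(file_a, file_b)
```

A character-level score like this only catches typos and small edits. It would not know that "Chemist" and "Pharmacy" mean the same thing, which is the kind of gap an embedding-based approach is meant to address.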

lejarx, Apr 24 '20 06:04

Hey @lejarx, have you tried it? I have a similar use case :)

griseau, Jul 13 '21 10:07