company2vec
How does vector embedding perform compared to fuzzy string matching?
I’m exploring string-matching measures such as Levenshtein distance and Jaro-Winkler, among others.
Vector embeddings seem to address the same problem of string similarity.
But would I need access to good pre-trained models to use them?
For example, my use case is identifying similar pharmacy names and their addresses.
I have two different files that may contain duplicate pharmacy+address combinations, but the names and addresses may have been entered incorrectly.
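
To make the comparison concrete, here is a minimal sketch that puts a fuzzy string score next to a simple vector score on the same pair of records. It is only an illustration under assumptions: the "embedding" is a plain character-trigram count vector (so no pre-trained model is involved, and it is not what company2vec itself does), the function names `levenshtein`, `edit_similarity`, `trigram_vector` and `cosine` are made up for this example, and the two pharmacy records are invented.

```python
from collections import Counter
from math import sqrt


def levenshtein(a: str, b: str) -> int:
    """Edit distance via dynamic programming (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def edit_similarity(a: str, b: str) -> float:
    """Levenshtein distance rescaled to [0, 1], where 1 means identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def trigram_vector(text: str) -> Counter:
    """Sparse count vector of character trigrams (a model-free stand-in
    for an embedding; padding lets word starts and ends form trigrams)."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


# Invented records: the same pharmacy with typos in one file.
record_a = "Main Street Pharmacy, 12 Main St"
record_b = "Main Stret Pharmcy, 12 Main St."

print("edit similarity:", edit_similarity(record_a.lower(), record_b.lower()))
print("trigram cosine: ", cosine(trigram_vector(record_a), trigram_vector(record_b)))
```

Both scores can be thresholded to flag likely duplicates. Edit-based scores tend to drop sharply when words are reordered, while n-gram vectors are less sensitive to order; neither captures meaning (for example "St" versus "Street"), which is where a trained embedding model might help.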
Hey @lejarx, have you tried it? I have a similar use case :)