glove-python
glove-python copied to clipboard
Added a distance method to the Glove class
Added a distance method to the Glove class to measure the distance between two arbitrary words. I needed the functionality for my own project and thought it would be useful for this project as a whole. Let me know if there are any other changes that may need to be done.
Looks good at first glance. I'll merge when I have access to a computer. On 10 Dec 2014 00:02, "Ossama W. Obeid" [email protected] wrote:
Added a distance method to the Glove class to measure the distance between two arbitrary words. I needed the functionality for my own project and thought it would be useful for this project as a whole.
Let me know if there are any other changes that may need to be done.
You can merge this Pull Request by running
git pull https://github.com/owo/glove-python master
Or view, comment on, or merge it at:
https://github.com/maciejkula/glove-python/pull/19 Commit Summary
- Added a distance method to the Glove class to measure the distance between two arbitrary words.
File Changes
- M glove/glove.py https://github.com/maciejkula/glove-python/pull/19/files#diff-0 (31)
Patch Links:
- https://github.com/maciejkula/glove-python/pull/19.patch
- https://github.com/maciejkula/glove-python/pull/19.diff
— Reply to this email directly or view it on GitHub https://github.com/maciejkula/glove-python/pull/19.
A couple of comments:
- Could you encapsulate checking whether the model has been fitted into its own method so that we could reuse it in both your new method and
most_similar
? You could call this_check_model_fitted
or something along those lines. - If I understand correctly, what you are actually returning is the similarity of two words, not their distance. The distance would be 1 - that? You could rename your method to
similarity
?
Sure thing. I'll commit the changes as soon as I can.
@maciejkula one question. Would you like to have _check_model_fitted to check if a word dictionary has been provided as well?
Good question. Maybe just leave the checks as you wrote them for the moment and I will have a look later and see if there's a non-awful solution.
Okay, so I decided to retain the distance function for now, by subtracting from 1 the computed similarity. I don't know, distance seems more useful (or intuitive, rather) than similarity when comparing two particular words (at least for my use cases).
I must say I prefer to have the similarity function, it seems to me to be more consistent with the most_similar
method.
Done :)
We can address the KeyError
thing in a separate PR (but I agree that a KeyError
is probably better).
Could you just check my comment above? And then maybe I'll finally merge!
:+1: