glove-python icon indicating copy to clipboard operation
glove-python copied to clipboard

Added a distance method to the Glove class

Open owo opened this issue 9 years ago • 10 comments

Added a distance method to the Glove class to measure the distance between two arbitrary words. I needed the functionality for my own project and thought it would be useful for this project as a whole. Let me know if there are any other changes that may need to be done.

owo avatar Dec 09 '14 17:12 owo

Looks good at first glance. I'll merge when I have access to a computer. On 10 Dec 2014 00:02, "Ossama W. Obeid" [email protected] wrote:

Added a distance method to the Glove class to measure the distance between two arbitrary words. I needed the functionality for my own project and thought it would be useful for this project as a whole.

Let me know if there are any other changes that may need to be done.

You can merge this Pull Request by running

git pull https://github.com/owo/glove-python master

Or view, comment on, or merge it at:

https://github.com/maciejkula/glove-python/pull/19 Commit Summary

  • Added a distance method to the Glove class to measure the distance between two arbitrary words.

File Changes

  • M glove/glove.py https://github.com/maciejkula/glove-python/pull/19/files#diff-0 (31)

Patch Links:

  • https://github.com/maciejkula/glove-python/pull/19.patch
  • https://github.com/maciejkula/glove-python/pull/19.diff

— Reply to this email directly or view it on GitHub https://github.com/maciejkula/glove-python/pull/19.

maciejkula avatar Dec 10 '14 03:12 maciejkula

A couple of comments:

  1. Could you encapsulate checking whether the model has been fitted into its own method so that we could reuse it in both your new method and most_similar? You could call this _check_model_fitted or something along those lines.
  2. If I understand correctly, what you are actually returning is the similarity of two words, not their distance. The distance would be 1 - that? You could rename your method to similarity?

maciejkula avatar Dec 19 '14 11:12 maciejkula

Sure thing. I'll commit the changes as soon as I can.

owo avatar Dec 20 '14 12:12 owo

@maciejkula one question. Would you like to have _check_model_fitted to check if a word dictionary has been provided as well?

owo avatar Dec 20 '14 14:12 owo

Good question. Maybe just leave the checks as you wrote them for the moment and I will have a look later and see if there's a non-awful solution.

maciejkula avatar Dec 23 '14 11:12 maciejkula

Okay, so I decided to retain the distance function for now, by subtracting from 1 the computed similarity. I don't know, distance seems more useful (or intuitive, rather) than similarity when comparing two particular words (at least for my use cases).

owo avatar Dec 26 '14 13:12 owo

I must say I prefer to have the similarity function, it seems to me to be more consistent with the most_similar method.

maciejkula avatar Dec 29 '14 11:12 maciejkula

Done :)

owo avatar Jan 01 '15 14:01 owo

We can address the KeyError thing in a separate PR (but I agree that a KeyError is probably better).

Could you just check my comment above? And then maybe I'll finally merge!

maciejkula avatar Jan 02 '15 21:01 maciejkula

:+1:

theeluwin avatar Sep 09 '16 07:09 theeluwin