GloVe icon indicating copy to clipboard operation
GloVe copied to clipboard

Asymetric context and model = 1, leads to saving only context instead of word vec

Open urialon opened this issue 7 years ago • 9 comments

Hi, I'm a CS M.Sc. student from the Technion, Israel, researching applications of word vectors to programming language. Thanks for your code, it's great.

I think there is a bug, where GloVe saves only context word vectors without the word vectors. When training with symmetric=0, and model=1, the user expects (and the documentation says) that only left context is used, and finally only the word vectors will be saved (without context word vectors).

However, in cooccur.c, CREC.word1 gets the left word (the context, line 363) and CREC.word2 gets the right word (the target word, line 364). So indeed only the left context is saved in the CREC struct.

However, in glove.c, word1 and word2 fields are interpreted in the opposite way - CREC.word1 is treated as the target word (line 113), and CREC.word2 is treated as the context word (line 114). Finally, since model=1, the code attempts to save only the word vectors without the context (line 233), but the vectors that are actually saved are the original contexts, not target words.

What do you think? Thanks!

urialon avatar Dec 22 '16 06:12 urialon

@Russell91 - can you please take a look?

urialon avatar Dec 22 '16 19:12 urialon

I see the point that you're making. Do you have qualitative evidence when you use model=0, symmetric=0, that the semantics of the saved vectors are swapped, and that they should be switched?

ghost avatar Jan 14 '17 23:01 ghost

No, when you use model=0, it doesn't matter because the "word" vectors and the "context" vectors are concatenated, and therefore they lose their "role" (it doesn't matter which of them is appended first). It also doesn't matter for model=2.

The only problem is in model=1, when only the "word" vector is supposed to be saved, but instead only the "context" vector is saved.

I am aware that using (symmetric=0, model=1) is not a common usage, but I it's still a bug. Would you want me to send a pull request with a fix?

urialon avatar Jan 15 '17 07:01 urialon

and if so - would you rather me fixing it in the saving part of glove.c, or in the creation of the CREC struct in cooccur.c?

urialon avatar Jan 15 '17 20:01 urialon

No, when you use model=0, it doesn't matter because the "word" vectors and the "context" vectors are concatenated

Well, yes, but users in theory could be relying on the semantics of the, say, first 150 indices of that vector as different than the second 150, for example. It would be nice for the same reasons that the ordering of the vectors was correct. But I'm not entirely convinced that they are flipped. What evidence can you assemble (in terms of functionality) that they are coming in the wrong order aside from the unintuitive code snipped?

ghost avatar Jan 20 '17 00:01 ghost

Well, if the users rely on that, they are relying on something that contradicts the documentation! Anyway - we can fix it and at the same time keep the old ordering for model=0. Why is that related? We can keep the behavior of model=0 untouched.

I have no way to convince you that they are flipped other than just looking at the code thoroughly. This way or another - vectors will be produced. Probably the incorrect code also produces "good" vectors. So other than looking at the code and see what it does, or debugging step-by-step, how can I show you that it gives the wrong vectors?

Of course, I am aware that using (symmetric=0, model=1) is not a common usage in NLP. However, these days word vectors are sometimes used by models from different domains (for example - Node2vec by Grover and Leskovec from Standford take random walks in a graph and run Word2vec on the nodes). Therefore, we should expect different users using any unusual combination of parameters to GloVe. I also use unusual combinations of parameters myself in GloVe to research its usage with programming languages.

urialon avatar Jan 20 '17 07:01 urialon

Do you have something in mind about "evidence that I can assemble in terms of functionality"?

I mean - both the current version and my proposed fix will work, the current code just contradicts the original intent and documentation.

urialon avatar Jan 20 '17 07:01 urialon

@Russell91 what do you think? can I send a fix?

urialon avatar Jan 29 '17 04:01 urialon

@urialon I've got it on my list to spend an hour or two looking through the vectors and trying to confirm this myself. The doc change itself should be simple, but I'll make a pull request after and reference it here so you can take a look.

ghost avatar Jan 29 '17 07:01 ghost