Cannot reproduce numbers in paper for word similarity

Open joeybose opened this issue 7 years ago • 0 comments

I've been trying for a long time to reproduce the word similarity numbers reported in the GloVe paper using the released 6B-token 300-dimensional word vectors, but I cannot match the reported results on the RW and WS353 datasets.

First, a question: how do you handle a pair in which one of the words is OOV? Do you score the pair with a cosine similarity of 0, or do you remove the pair completely?

In either case, my numbers differ. On the RW dataset my score is 34.21 without removing OOV pairs and 41.09 when excluding them, whereas the paper reports 38.1. On WS353 my score is 60.85. I normalized the embeddings as your evaluation code for word analogies does, and I used scipy.stats to compute the Spearman correlation. It would be great if I could reproduce the results reported in the paper.
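To make the two OOV policies concrete, here is a minimal sketch of the evaluation I am describing. It is not the GloVe authors' script. The function names (`cosine`, `average_ranks`, `spearman`, `evaluate`), the `oov` parameter, and the toy vectors and gold scores are all hypothetical and exist only for illustration. Spearman is computed here in pure Python as the Pearson correlation of tie-averaged ranks, which I would expect to agree with scipy.stats.spearmanr.

```python
import math

def cosine(u, v):
    """Cosine similarity; equivalent to a dot product of L2-normalized vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman correlation as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

def evaluate(pairs, vectors, oov="skip"):
    """Return (spearman, number of skipped pairs).

    oov="skip" drops any pair containing an out-of-vocabulary word;
    oov="zero" keeps the pair and assigns it a similarity of 0.
    """
    gold, pred, skipped = [], [], 0
    for w1, w2, score in pairs:
        if w1 in vectors and w2 in vectors:
            pred.append(cosine(vectors[w1], vectors[w2]))
        elif oov == "zero":
            pred.append(0.0)
        else:
            skipped += 1
            continue
        gold.append(score)
    return spearman(gold, pred), skipped

# Toy data (assumption): NOT real GloVe vectors or real RW/WS353 pairs.
vectors = {"cat": [1.0, 0.0], "dog": [0.9, 0.1],
           "car": [0.0, 1.0], "truck": [0.2, 0.8]}
pairs = [("cat", "dog", 9.0), ("car", "truck", 8.5),
         ("cat", "car", 1.0), ("dog", "zyzzyva", 7.0)]  # "zyzzyva" is OOV

rho_skip, n_skipped = evaluate(pairs, vectors, oov="skip")
rho_zero, _ = evaluate(pairs, vectors, oov="zero")
print("skip OOV pairs:", 100 * rho_skip, "skipped:", n_skipped)
print("OOV pairs as 0:", 100 * rho_zero)
```

In this toy setup, scoring OOV pairs as 0 gives a lower correlation than dropping them, which is the same direction as the gap I see on RW. Knowing which policy the paper used would tell me which of my two RW numbers should be compared against the reported 38.1.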

joeybose · Feb 18 '18 19:02