
Is line 147 of word2vec.py in assignment 2 wrong?

Open rockqgf opened this issue 6 years ago • 3 comments

I think the code "gradOutsideVecs[negSampleWordIndices] += np.outer((z-1),centerWordVec)*(-1)" at line 147 of word2vec.py in assignment 2 is not right. The comment in the code says: "Note: The same word may be negatively sampled multiple times. For example if an outside word is sampled twice, you shall have to double count the gradient with respect to this word." I wrote some test code, and "gradOutsideVecs[negSampleWordIndices] += ..." does not double-count the gradient: when an index is repeated, the update is applied only once (the last one wins).
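A minimal sketch of the behavior described above. The array shapes and values are hypothetical toy data, not taken from the assignment; the point is only that "+=" with a fancy index does not accumulate over repeated indices.

```python
import numpy as np

# Hypothetical toy data (not from the assignment): 3 rows of gradients, dim 2.
grad = np.zeros((3, 2))
indices = np.array([1, 1, 2])   # row 1 is "sampled" twice
updates = np.ones((3, 2))       # one update row per sampled index

# Equivalent to grad[indices] = grad[indices] + updates, so the duplicate
# index 1 is written once instead of being accumulated twice.
grad[indices] += updates

print(grad[1])  # [1. 1.], not [2. 2.]
```

If the gradient were double counted as the comment requires, row 1 would hold 2.0 in each position.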

rockqgf, Oct 09 '19 14:10

It should be:

for i, negSampleWordIdx in enumerate(negSampleWordIndices):
    gradOutsideVecs[negSampleWordIdx] += (1.0 - z[i]) * centerWordVec

Lalalaashen, Nov 26 '19 07:11

Really helpful! @Lalalaashen @rockqgf

Zhuifeng414, May 11 '20 01:05

Use np.add.at() to avoid the for loop and keep the code vectorized ^^. Unlike "+=" with a fancy index, np.add.at() accumulates over repeated indices. Kudos.

Replace line 147 with: np.add.at(gradOutsideVecs, negSampleWordIndices, np.outer((z-1),centerWordVec)*(-1))
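A rough check that the loop and np.add.at() versions agree, and that the original "+=" line differs when an index repeats. This is a sketch under assumptions: the snake_case names, the random toy data, and the definition z = sigmoid(-u_k · v_c) for each negative sample are mine and are not confirmed by the thread.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical toy data; none of these values come from the assignment.
rng = np.random.default_rng(0)
center_word_vec = rng.normal(size=3)
outside_vectors = rng.normal(size=(5, 3))
neg_sample_word_indices = [4, 1, 4, 2]   # index 4 is sampled twice

# Assumption: z holds sigmoid(-u_k . v_c) for each negative sample k.
z = sigmoid(-outside_vectors[neg_sample_word_indices] @ center_word_vec)
updates = np.outer(1.0 - z, center_word_vec)

# Loop version: accumulates once per occurrence of each index.
grad_loop = np.zeros_like(outside_vectors)
for i, idx in enumerate(neg_sample_word_indices):
    grad_loop[idx] += (1.0 - z[i]) * center_word_vec

# np.add.at version: unbuffered, so repeated indices are accumulated too.
grad_add_at = np.zeros_like(outside_vectors)
np.add.at(grad_add_at, neg_sample_word_indices, updates)

# Original "+=" with a fancy index: the repeated index is updated only once.
grad_buffered = np.zeros_like(outside_vectors)
grad_buffered[neg_sample_word_indices] += updates

print(np.allclose(grad_loop, grad_add_at))          # True
print(np.allclose(grad_loop[4], grad_buffered[4]))  # False
```

The loop is easier to read, while np.add.at() keeps the update as a single vectorized call; both count a word sampled twice two times.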

naduhrin78, Jun 30 '20 11:06