CS224n-2019-solutions
Is line 147 of word2vec.py in assignment 2 wrong?
I think the code "gradOutsideVecs[negSampleWordIndices] += np.outer((z-1),centerWordVec)*(-1)" at line 147 of word2vec.py in assignment 2 is not right. As the comment in the code says: "Note: The same word may be negatively sampled multiple times. For example if an outside word is sampled twice, you shall have to double count the gradient with respect to this word." I wrote some test code, and "gradOutsideVecs[negSampleWordIndices] += ..." does not double count the gradient when an index is repeated. It adds the gradient only once (for the last occurrence).
It should be:
for i, negSampleWordIdx in enumerate(negSampleWordIndices):
    gradOutsideVecs[negSampleWordIdx] += (1.0 - z[i]) * centerWordVec
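For anyone who wants to reproduce this, here is a minimal sketch of the behavior on a toy 1-D array. The names grad, indices and updates are made up for illustration and are not from the assignment code. As far as I understand, NumPy evaluates a fancy-indexed += as a read, an add, and then a single write-back, so a repeated index keeps only one of its updates.

```python
import numpy as np

# Toy data (hypothetical, not from word2vec.py): index 1 appears twice.
grad = np.zeros(4)
indices = np.array([1, 3, 1])
updates = np.array([10.0, 20.0, 30.0])

# Fancy-indexed +=: the repeated index 1 keeps only one of its two updates.
buffered = grad.copy()
buffered[indices] += updates

# Explicit loop: every occurrence is accumulated, so index 1 gets 10.0 + 30.0.
looped = grad.copy()
for i, idx in enumerate(indices):
    looped[idx] += updates[i]

print("fancy-indexed +=:", buffered)
print("explicit loop:   ", looped)
```

The two results differ only at the repeated index, which is exactly the case the note in the assignment warns about.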
This really helps! @Lalalaashen @rockqgf
Use np.add.at() to avoid the for loop and keep the code vectorized ^^. Kudos.
Replace line 147 with:
np.add.at(gradOutsideVecs, negSampleWordIndices, np.outer((z-1),centerWordVec)*(-1))
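A quick sanity check, as a sketch only: the shapes, the random values, and the names gradLoop and gradAddAt below are assumptions for illustration, and z is just a stand-in for the sigmoid values computed in word2vec.py. It compares the explicit loop from earlier in the thread with np.add.at when one negative sample is drawn twice. Note that np.outer((z-1), centerWordVec)*(-1) is the same as np.outer(1.0 - z, centerWordVec).

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim = 5, 3                      # assumed toy sizes
centerWordVec = rng.normal(size=dim)
negSampleWordIndices = [2, 4, 2]            # word 2 is sampled twice
z = rng.uniform(size=len(negSampleWordIndices))  # stand-in for sigmoid outputs

# Reference: explicit loop, accumulates once per occurrence.
gradLoop = np.zeros((vocab_size, dim))
for i, negSampleWordIdx in enumerate(negSampleWordIndices):
    gradLoop[negSampleWordIdx] += (1.0 - z[i]) * centerWordVec

# Vectorized: np.add.at performs unbuffered in-place addition,
# so repeated indices are accumulated as well.
gradAddAt = np.zeros((vocab_size, dim))
np.add.at(gradAddAt, negSampleWordIndices, np.outer(1.0 - z, centerWordVec))

print(np.allclose(gradLoop, gradAddAt))
```

If the two arrays match, the vectorized version double counts the repeated word the same way the loop does.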