word2vec-pytorch
negative_sampling
Hi! The code has pow_frequency = np.array(list(self.word_frequency.values())) ** 0.5
Shouldn't that be pow_frequency = np.array(list(self.word_frequency.values())) ** 0.75?
Hi!
The exact exponent does not matter too much.
My implementation follows fastText, which uses 0.5:
void Model::initTableNegatives(const std::vector<int64_t>& counts) {
  real z = 0.0;
  for (size_t i = 0; i < counts.size(); i++) {
    z += pow(counts[i], 0.5);
  }
  for (size_t i = 0; i < counts.size(); i++) {
    real c = pow(counts[i], 0.5);
    for (size_t j = 0; j < c * NEGATIVE_TABLE_SIZE / z; j++) {
      negatives.push_back(i);
    }
  }
  std::shuffle(negatives.begin(), negatives.end(), rng);
}
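To see how much the exponent changes the sampling distribution, here is a minimal Python sketch of the same table construction with the power exposed as a parameter. The function name build_negative_table, the toy counts, and the table size are made up for illustration, and it uses plain Python instead of NumPy. It is not the repository's code, only a rough mirror of the fastText loop above.

```python
import math
import random


def build_negative_table(word_frequency, power=0.5, table_size=100_000, seed=0):
    """Build a shuffled table of word ids for negative sampling.

    Each id appears roughly in proportion to count ** power, mirroring
    the fastText loop (hypothetical helper, for illustration only).
    """
    weights = {w: c ** power for w, c in word_frequency.items()}
    z = sum(weights.values())
    table = []
    for w, weight in weights.items():
        # The C++ loop `j < c * SIZE / z` runs ceil(c * SIZE / z) times.
        table.extend([w] * math.ceil(weight * table_size / z))
    random.Random(seed).shuffle(table)
    return table


# Toy vocabulary (assumed): word id -> raw count.
word_frequency = {0: 10000, 1: 100, 2: 1}

# Fraction of the table taken by each word id, per exponent.
shares = {}
for power in (0.5, 0.75):
    table = build_negative_table(word_frequency, power=power)
    shares[power] = {w: table.count(w) / len(table) for w in word_frequency}
    print(power, shares[power])
```

A smaller exponent flattens the distribution more, so rare words are drawn as negatives more often with 0.5 than with 0.75. Both values sit well between raw frequency (power 1) and uniform sampling (power 0).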