word2vec-pytorch
SubSampling formula
Why is the extra (t / f) term added in this formula for discards?
t = 0.0001
f = np.array(list(self.word_frequency.values())) / self.token_count
self.discards = np.sqrt(t / f) + (t / f)
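For comparison, here is a small sketch that evaluates the expression above next to the plain sqrt(t / f) term. The word counts and token_count are made up for illustration, and the names keep_sqrt_only and factored are mine, not from the repo. As far as I can tell, the expression factors as (sqrt(f / t) + 1) * (t / f), which looks like the form used in the original word2vec C code, whereas the paper states the discard probability as 1 - sqrt(t / f).

```python
import numpy as np

t = 0.0001

# Hypothetical counts for three words in a corpus of one million tokens.
word_frequency = {"the": 50_000, "model": 500, "zygote": 5}
token_count = 1_000_000

# Relative frequency of each word.
f = np.array(list(word_frequency.values())) / token_count

# Expression from the snippet in question.
discards = np.sqrt(t / f) + (t / f)

# The sqrt term on its own, i.e. 1 minus the paper's discard probability.
keep_sqrt_only = np.sqrt(t / f)

# Factored form of the same expression: (sqrt(f / t) + 1) * (t / f).
factored = (np.sqrt(f / t) + 1) * (t / f)

for word, d, k in zip(word_frequency, discards, keep_sqrt_only):
    print(f"{word:8s} sqrt(t/f) + t/f = {d:.4f}   sqrt(t/f) = {k:.4f}")
```

If discards is used as a keep threshold against a uniform random draw in [0, 1) (an assumption, since that part of the code is not shown here), any value above 1 means the word is always kept, and the extra t / f term makes every word slightly more likely to be kept than with sqrt(t / f) alone.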