word2vec-pytorch

Extremely simple and fast word2vec implementation with Negative Sampling + Sub-sampling

8 word2vec-pytorch issues (sorted by recently updated)

Why add `(t / f)` in this formula for discards?

```python
t = 0.0001
f = np.array(list(self.word_frequency.values())) / self.token_count
self.discards = np.sqrt(t / f) + (t / f)
```
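For context on this question: the extra `t / f` term matches the formula in the original word2vec C implementation, where the keep probability is `(sqrt(f/t) + 1) * (t/f) = sqrt(t/f) + t/f`, rather than the simplified `sqrt(t/f)` given in the Mikolov et al. paper. A minimal sketch of the difference, using made-up word frequencies (the array `f` below is an assumption, not data from the repo):

```python
import numpy as np

t = 1e-4  # sub-sampling threshold, as in the snippet above
f = np.array([0.1, 0.01, 1e-4, 1e-6])  # hypothetical relative word frequencies

# Simplified keep probability from the paper: sqrt(t / f)
keep_paper = np.sqrt(t / f)

# Form used in word2vec.c (and in this repo): sqrt(t / f) + t / f
keep_c_code = np.sqrt(t / f) + t / f

# The extra t/f term raises the keep probability slightly for every word;
# values above 1 just mean "always keep" (words at or below frequency t).
print(keep_paper)
print(keep_c_code)
```

In both variants, only words much more frequent than `t` are discarded with noticeable probability; the `+ t/f` term only changes the exact curve.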

https://github.com/Andras7/word2vec-pytorch/blob/36b93a503e8b3b5448abbc0e18f2a6bd3e017fc9/word2vec/data_reader.py#L102 I think `i + boundary` should include a `+ 1` to make the slice inclusive; otherwise the right context contains one token fewer in the resulting skipgrams.
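The off-by-one comes from Python slices excluding their stop index. A small sketch (toy data, not the repo's exact code) showing how `words[i - boundary : i + boundary]` yields an asymmetric window:

```python
words = ["a", "b", "c", "d", "e"]
i, boundary = 2, 2  # center word "c", intended window of 2 on each side

# Slice as reported in data_reader.py: stop index is exclusive,
# so the rightmost context token is dropped.
asymmetric = words[max(i - boundary, 0): i + boundary]

# With the suggested + 1, the window is symmetric around the center word.
symmetric = words[max(i - boundary, 0): i + boundary + 1]

print(asymmetric)  # ['a', 'b', 'c', 'd'] -- "e" is missing
print(symmetric)   # ['a', 'b', 'c', 'd', 'e']
```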

Hi @Andras7, first I want to thank you for providing this code; it really is a big help. I ran into this error when trying to train the model: Traceback...

Hi! Shouldn't `pow_frequency = np.array(list(self.word_frequency.values())) ** 0.5` be `pow_frequency = np.array(list(self.word_frequency.values())) ** 0.75`?
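For background on why the exponent matters: Mikolov et al. raise unigram counts to the 3/4 power when building the negative-sampling noise distribution, which flattens it so rare words are sampled more often than their raw frequency would suggest. A sketch with hypothetical counts (the numbers below are assumptions for illustration):

```python
import numpy as np

counts = np.array([900.0, 90.0, 10.0])  # toy word counts

def noise_dist(counts, power):
    """Normalize counts raised to `power` into a sampling distribution."""
    p = counts ** power
    return p / p.sum()

print(noise_dist(counts, 1.0))   # raw unigram: dominated by frequent words
print(noise_dist(counts, 0.75))  # the paper's 3/4 power: rare words boosted
print(noise_dist(counts, 0.5))   # 0.5 flattens even more aggressively
```

So 0.5 and 0.75 both dampen frequent words; the issue is simply that 0.75 is the value the paper recommends.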

This is not an issue. I just want to ask why you use `running_loss = running_loss * 0.9 + loss.item() * 0.1` for monitoring the loss during training. Do you have any special...
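For readers wondering about that line: it is an exponential moving average, a common way to smooth noisy per-batch losses for display. A minimal sketch with a made-up loss sequence (not taken from the repo):

```python
# Hypothetical per-batch losses that bounce between 5 and 1.
noisy_losses = [5.0, 1.0, 5.0, 1.0, 5.0, 1.0]

running = noisy_losses[0]  # initialize with the first observed loss
smoothed = []
for loss in noisy_losses:
    # Exponential moving average: 90% old value, 10% new observation.
    running = running * 0.9 + loss * 0.1
    smoothed.append(running)

# The smoothed curve varies far less than the raw 1 <-> 5 swings,
# making the overall trend easier to read in training logs.
print(smoothed)
```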

Hello Andras7, Thank you for this fast and effective implementation of word2vec. We have forked your repository and added some augmentations for a research project, and would like to properly...

Hi @Andras7, I reorganized your code a little bit to make it easily installable with pip. You can install it with: `pip install git+https://github.com/marta-sd/word2vec-pytorch.git` You can take a look...

Hello! Thanks for your code! Have you observed the loss? I downloaded the code and ran it, but the loss doesn't seem to converge. It descends rapidly at first,...