word2vec
Which parameter should I change to run the demo for n epochs?
Hello,
I have gone through all of the code, but I could not figure out how to change the #epochs / #iter parameter, or the #threads parameter, as in the Google C code.
Can you give me a little insight into this?
I found the n_workers parameter, but changing it did not make much difference in timing: increasing n_workers from 4 to 16 only improved training time by about 5 seconds.
Please let me know.
Thanks.
IIRC the code relies on OpenMP for parallel training, so the n_workers param is probably not useful. If you compile with OpenMP support, you should be able to use all the CPUs. You can take a look at the train() function and add an extra loop to train for more iterations.
Hello @jdeng, thanks. I changed the code to run more iterations by adding a for loop, so there are now 3 nested for loops in the train function. I have a few questions:
- Does one iteration mean training over the whole 17 million words in the text8 corpus?
- Also, what batch size is being used in the code?
- Was your original GitHub code training for just 1 iteration?
- Is the concept of #iterations in the original word2vec code the same as in this one?
https://github.com/svn2github/word2vec
Please let me know.
Thanks.