Issues on documentation of word2vec training
Hi, I have some questions about https://github.com/PaddlePaddle/Perf/blob/master/Word2Vec/readme.md
- The command to run on a single machine should be
python -u ../../../../tools/static_ps_trainer.py -m benchmark.yaml
right? The readme currently uses ../../../
- After I run the command above, training starts, but on a single machine it seems to take a long time to finish. Where can I set the number of iterations? Or must I wait for at least one epoch to finish? Thank you very much! @MrChengmo @luotao1
Hi, the static model training command given at https://github.com/PaddlePaddle/PaddleRec/tree/master/models/recall/word2vec
python -u ../../../tools/static_trainer.py -m config.yaml
is also out of date because static_trainer.py has been deleted.
Should we use PaddleRec branch 2.0.0? The last commit I see on 2.0.0 is from January. Should we still use the develop version? Thank you very much!
- Please refer to this link: https://github.com/PaddlePaddle/Perf/tree/master/Word2Vec
- The code for reproducing the results is located in: https://github.com/PaddlePaddle/PaddleRec/tree/master/models/recall/word2vec/benchmark
- A round of training on the full data takes more than 40 hours. If you only want to test performance, you can test quickly on a small sample.
We recommend using the code on the master branch for training.
where could I set iterations
https://github.com/PaddlePaddle/PaddleRec/blob/master/models/recall/word2vec/benchmark/benchmark.yaml#L26
runner:
  epochs: 15
  print_interval: 100
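For a shorter run, the epochs value under runner can simply be edited by hand. If you would rather script it, here is a minimal sketch that rewrites the first epochs: entry in a config's text. The helper name set_epochs is hypothetical and not part of PaddleRec; it assumes the key appears as a plain `epochs: <integer>` line, as in the snippet above.

```python
import re


def set_epochs(yaml_text: str, epochs: int) -> str:
    """Return yaml_text with the first `epochs: <int>` value replaced.

    Hypothetical helper: a plain-text substitution, not a YAML parser.
    """
    pattern = re.compile(r"^([ \t]*epochs:[ \t]*)\d+", flags=re.MULTILINE)
    # A function replacement avoids ambiguity between the group
    # reference and the digits of the new value.
    new_text, count = pattern.subn(
        lambda m: f"{m.group(1)}{epochs}", yaml_text, count=1
    )
    if count == 0:
        raise ValueError("no 'epochs:' key found")
    return new_text


# The runner section quoted in this thread, reduced to a single epoch.
sample = "runner:\n  epochs: 15\n  print_interval: 100\n"
quick = set_epochs(sample, 1)
print(quick)
```

The result could be written to a copy of benchmark.yaml and passed to the trainer with -m, leaving the original benchmark config untouched.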
A round of full data training takes more than 40 hours
It means that each epoch takes more than 40 hours.
it can be quickly tested on small samples
You can select a small subset of the full data to test accuracy or performance.
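One way to build such a subset is to keep a random fraction of the lines of a training file. The sketch below is only an illustration: it assumes the training data is plain text with one sample per line, and the function name and file paths are hypothetical, not part of PaddleRec.

```python
import random


def sample_lines(src_path, dst_path, fraction, seed=0):
    """Copy roughly `fraction` of the lines in src_path to dst_path.

    Each line is kept independently with probability `fraction`, so the
    file is streamed once and never loaded fully into memory.
    Returns the number of lines written.
    """
    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    kept = 0
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if rng.random() < fraction:
                dst.write(line)
                kept += 1
    return kept
```

For example, sample_lines("train_full.txt", "train_small.txt", 0.01) would keep about 1% of the lines; both file names are placeholders for wherever your data lives.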