QANet-pytorch-
This repo cannot reproduce the results of the original paper.
Thank you for your implementation; it has been very helpful for me. When I run this code with the number of heads set to 1, I get a similar result. But I cannot reproduce the original paper's result (73.6/82.7) when I use 8 heads, batch size 32, 150k training steps, and a char dimension of 200 (the same settings as the original paper); I only get around 71.27/80.58. The same thing happened when I ran the TensorFlow repo "NLPLearn/QANet" (https://github.com/NLPLearn/QANet).
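For concreteness, here is a minimal sketch of the settings I am describing (the attribute names are hypothetical and may not match this repo's actual config module):

```python
# Hypothetical config sketch; names are illustrative, not the repo's actual config.
class Config:
    num_heads = 8        # attention heads, as in the original paper
    batch_size = 32
    num_steps = 150_000  # total training steps
    char_dim = 200       # character embedding dimension
    hidden_size = 128    # model dimension (see discussion below)
```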
Any suggestions?
What hidden size did you use? I have tried 96 and 128. 128 performs better. You can try tuning the hidden size.
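One thing worth noting (a general constraint of standard multi-head attention, not specific to this repo): the hidden size must be divisible by the number of heads, which both 96 and 128 satisfy for 8 heads. A minimal check using PyTorch's stock module, which may differ from the attention implementation used here:

```python
import torch
import torch.nn as nn

# With d_model=128 and 8 heads, each head operates on 128 // 8 = 16 dims.
d_model, num_heads = 128, 8
attn = nn.MultiheadAttention(embed_dim=d_model, num_heads=num_heads)

x = torch.randn(50, 32, d_model)  # (seq_len, batch, d_model)
out, _ = attn(x, x, x)
print(out.shape)  # torch.Size([50, 32, 128])
```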
Thank you for your reply. I used 128; all my configs are consistent with the original paper. As you list in the repo, 128 performs better when the batch size is 12. With batch size 32 and hidden size 128, I can only get around 71.27/80.58. That is similar to your result with batch size 12 (70.7/80.0), but there is still a big gap to the original paper (73.6/82.7). Did you try to reproduce the result of the original paper?
The hyperparameters of this repository are mostly based on "NLPLearn/QANet", so the results are similar. I have tried to reproduce the result of the paper, but with limited resources this is the best performance I can get. However, there is a repository published by the QANet authors (https://github.com/tensorflow/tpu/tree/master/models/experimental/qanet). You can use that repository or try to tune the parameters based on it.
I will try it. Thanks!