Swind D.C. Xu
Results: 3 comments
Thank you for your reply. I used 128. All the configs are consistent with the original paper. As you list in the repo, 128 performs better when the batch size equals...
I will try it. Thanks!
I tried my best, but the result was still two points below.