
Can't reproduce the results on WMT En-De

Open Paulmzr opened this issue 3 years ago • 2 comments

Hi all. I am trying to reproduce the paper's word-level oracle results on the WMT EN-DE dataset. I trained the Transformer model for 150k steps with #Gpus = 4, #Freq = 2, #Toks = 4096, saving a checkpoint every 5k steps. The last 5 checkpoints were averaged to obtain the final model.
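For anyone comparing setups, here is a minimal sketch of what these numbers imply, assuming #Toks is the per-GPU token cap per batch and #Freq is the gradient-accumulation factor (the usual reading, but an assumption here). The function names are hypothetical and are not part of the repo.

```python
def tokens_per_update(num_gpus, update_freq, max_tokens):
    """Upper bound on tokens contributing to one parameter update.

    Assumes max_tokens is a per-GPU cap, so the real count per update
    is at most this value.
    """
    return num_gpus * update_freq * max_tokens


def last_checkpoints(total_steps, save_interval, n):
    """Step numbers of the last n checkpoints saved at a fixed interval."""
    steps = list(range(save_interval, total_steps + 1, save_interval))
    return steps[-n:]


print(tokens_per_update(4, 2, 4096))        # 32768
print(last_checkpoints(150_000, 5_000, 5))  # [130000, 135000, 140000, 145000, 150000]
```

So the averaged model covers steps 130k to 150k, with at most 32768 tokens per update.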

However, I find it difficult to reproduce the reported results with any value of decay_k. (The Gumbel noise parameter is fixed to 0.8, as recommended in this repo.)

Any suggestions on how to tune decay_k to reproduce them? Thanks a lot!
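To make the role of decay_k concrete, here is a minimal sketch of an inverse-sigmoid decay of the form p = k / (k + exp(t / k)), where p is the probability of feeding the ground-truth word rather than the oracle word. It is an assumption that the repo computes exactly this, and whether t counts epochs or updates is also an assumption; the function name is hypothetical.

```python
import math


def ground_truth_prob(t, k):
    """Inverse-sigmoid decay p = k / (k + exp(t / k)).

    Rewritten as 1 / (1 + exp(t / k - ln k)) so that large t / k
    does not overflow math.exp.
    """
    x = t / k - math.log(k)
    if x > 700:  # exp(x) would overflow a float; p is effectively 0
        return 0.0
    return 1.0 / (1.0 + math.exp(x))


# Compare how quickly the ground-truth probability falls for a few k values.
for k in (1000, 3000, 10000):
    probs = [ground_truth_prob(t, k) for t in (0, 50_000, 100_000, 150_000)]
    print(k, [round(p, 4) for p in probs])
```

Under this form, a small k drives p toward 0 early in training, while a very large k keeps p close to 1 for the whole run, so the oracle words are almost never sampled. Printing the schedule for a 150k-step run is a quick way to check whether a given decay_k actually exposes the model to oracle words.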

Paulmzr avatar Mar 30 '21 03:03 Paulmzr

Hi, can you provide your reproduced results on WMT14 EN-DE? The same thing happened to me.

Answer3664 avatar Oct 24 '21 09:10 Answer3664

Hi, I used settings similar to yours, with the decay schedule from the author's example, and also failed to reproduce the improvement from the word-level oracle. Do you have any idea what causes this problem?

songmzhang avatar Sep 25 '22 05:09 songmzhang