OR-NMT

About Oracle Word Selection

Open goodluck110706112 opened this issue 4 years ago • 1 comment

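Following the approach in the paper, I used Oracle Word Selection together with scheduled sampling on an LSTM, but the improvement was very small. I suspect I have configured something incorrectly. I also have a question about Oracle Word Selection.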

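For Equation 11 in the paper, I directly take argmax(o_{j-1}) as the final oracle word, without applying softmax. My reasoning is that argmax(o_{j-1}) is the same as argmax(P_{j-1}), so the oracle word can be obtained without softmax. Why does the paper add Equation 12, the softmax, at this point?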

goodluck110706112, Aug 31 '20 02:08

Reply to: Following the approach in the paper, I used Oracle Word Selection together with scheduled sampling on an LSTM, but the improvement was very small. I suspect I have configured something incorrectly.

Yes, you need to select the hyperparameter k carefully for each model architecture and each dataset. I can no longer find the parameters for the RNN-based model, sorry about that. Please refer to the hyperparameter selection for the attention-based model in the README.

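Reply to: I also have a question about Oracle Word Selection. For Equation 11 in the paper, I directly take argmax(o_{j-1}) as the final oracle word, without applying softmax. My reasoning is that argmax(o_{j-1}) is the same as argmax(P_{j-1}), so the oracle word can be obtained without softmax. Why does the paper add Equation 12, the softmax, at this point?

Yes, you are right: argmax(o_{j-1}) is the same as argmax(P_{j-1}). In fact, the softmax operation is not needed in the code implementation.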

zhang-wen, Sep 08 '20 17:09
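
The equivalence discussed in this thread can be checked numerically. Softmax is strictly increasing in each score, so it preserves the ordering of the scores and therefore the argmax. The snippet below is only a minimal sketch using the standard library: the names `o_prev` and `p_prev` and the score values are made up for illustration and are not taken from the OR-NMT code.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)  # subtract the max to avoid overflow in exp
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest element (first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


# Hypothetical pre-softmax scores o_{j-1} over a 5-word vocabulary.
o_prev = [1.5, -0.3, 4.2, 0.0, 2.7]

# Corresponding distribution P_{j-1} = softmax(o_{j-1}).
p_prev = softmax(o_prev)

# The oracle word index is the same either way.
oracle_from_scores = argmax(o_prev)
oracle_from_probs = argmax(p_prev)
print(oracle_from_scores, oracle_from_probs)
```

The same holds if the scores are perturbed or divided by a positive temperature before the softmax, as long as the argmax is taken over those same modified scores. Skipping the softmax only changes the result when the probabilities themselves are needed, not just the index of the best word.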