OR-NMT: About Oracle Word Selection

Open goodluck110706112 opened this issue 4 years ago • 1 comment

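Following the approach in the paper, I applied Oracle Word Selection plus scheduled sampling to an LSTM, but the improvement was very slight. I suspect I have configured something incorrectly. I have a question about Oracle Word Selection.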

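For Equation 11 in the paper, I directly take argmax(o_{j-1}) as the final oracle word, without applying softmax. My reasoning is that argmax(o_{j-1}) is the same as argmax(P_{j-1}), so the oracle word can be obtained without softmax. Why does the paper then add Equation 12, that is, the softmax?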

Aug 31 '20 02:08 goodluck110706112

Reply to: "Following the approach in the paper, I applied Oracle Word Selection plus scheduled sampling to an LSTM, but the improvement was very slight. I suspect I have configured something incorrectly."

Yes, you need to select the hyperparameter k carefully for each model architecture and dataset. I can no longer find the parameters for the RNN-based model, sorry about that. Please refer to the hyperparameter selection for the attention-based model in the Readme.

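Reply to: "I have a question about Oracle Word Selection. For Equation 11 in the paper, I directly take argmax(o_{j-1}) as the final oracle word, without applying softmax. My reasoning is that argmax(o_{j-1}) is the same as argmax(P_{j-1}), so the oracle word can be obtained without softmax. Why does the paper then add Equation 12, that is, the softmax?"

Yes, you are right: argmax(o_{j-1}) is the same as argmax(P_{j-1}). In fact, the softmax operation is not needed in the code implementation.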

Sep 08 '20 17:09 zhang-wen
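
The equivalence discussed in this thread holds because softmax only exponentiates each logit and divides by a shared positive constant, so it preserves the ordering of the scores and therefore the position of the maximum. Below is a minimal sketch in plain Python that checks this on random logits. The function names (`oracle_word_from_logits`, `oracle_word_from_probs`) and the vocabulary size are hypothetical and are not taken from the OR-NMT code; the lists simply stand in for o_{j-1} and P_{j-1}.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest value (the first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def oracle_word_from_logits(logits):
    """Pick the oracle word index directly from the raw scores o_{j-1}."""
    return argmax(logits)


def oracle_word_from_probs(logits):
    """Pick the oracle word index from P_{j-1} = softmax(o_{j-1})."""
    return argmax(softmax(logits))


# Compare both selection paths on random score vectors.
# The vocabulary size of 50 is an arbitrary illustrative choice.
rng = random.Random(0)
for _ in range(100):
    scores = [rng.uniform(-5.0, 5.0) for _ in range(50)]
    assert oracle_word_from_logits(scores) == oracle_word_from_probs(scores)
```

Skipping the softmax saves one normalization per decoding step when only the index of the best word is needed. The probabilities themselves are still required wherever the model has to compute a loss or sample from the distribution.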
