Coverage vector

Open • Waino opened this issue 9 years ago • 1 comment

Implement coverage in the attention mechanism, following [1].

[1] Tu, Zhaopeng, et al. "Coverage-based Neural Machine Translation." arXiv preprint arXiv:1601.04811 (2016). http://arxiv.org/pdf/1601.04811

Nov 25 '16 14:11 Waino

Perhaps a better (= quicker to implement) starting point would be the decoding-time coverage normalization method from section 7 of the Google NMT paper. This would only require changing the HNMT code so that it returns the attention predictions, and then modifying the beam search code in BNAS to use them when scoring hypotheses.

Dec 07 '16 10:12 robertostling
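
For reference, here is a rough sketch of what the decoding-time rescoring suggested above might look like once the attention weights are available. It is not taken from HNMT or BNAS: the function names (`length_penalty`, `coverage_penalty`, `rescore`), the nested-list layout of `attention`, and the `eps` clamp are all assumptions made for illustration. The formulas follow my reading of the length normalization and coverage penalty in the Google NMT paper, and the `alpha` and `beta` defaults are placeholders that would need tuning.

```python
import math


def length_penalty(length, alpha=0.6):
    # Length normalization term: ((5 + |Y|) / 6) ** alpha.
    # alpha is a hyperparameter; 0.6 is only a placeholder here.
    return ((5.0 + length) / 6.0) ** alpha


def coverage_penalty(attention, beta=0.2, eps=1e-9):
    # attention[j][i] is assumed to hold the attention weight that
    # target step j puts on source position i (hypothetical layout).
    if not attention:
        return 0.0
    n_src = len(attention[0])
    total = 0.0
    for i in range(n_src):
        # Total attention mass received by source position i.
        mass = sum(step[i] for step in attention)
        # Cap at 1.0 so over-attended positions are not rewarded;
        # the eps clamp (an addition of this sketch) avoids log(0).
        total += math.log(max(min(mass, 1.0), eps))
    return beta * total


def rescore(log_prob, attention, alpha=0.6, beta=0.2):
    # Score a finished hypothesis: normalized log-probability
    # plus the coverage penalty (which is always <= 0).
    length = len(attention)
    return log_prob / length_penalty(length, alpha) + coverage_penalty(attention, beta)
```

In a beam search this would be applied when ranking completed hypotheses, e.g. `rescore(hyp_log_prob, hyp_attention)`, so a hypothesis that leaves source positions unattended is ranked lower than one with the same log-probability and full coverage.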