Feature Request: Unknown Word Replacement
It would be nice to have options for unknown word replacement, either by copying the original source word or by looking it up in a lexicon.
I'm interested in helping with this. How would I go about doing it?
We probably need to:
- Save the original Input object somewhere, maybe as additional information in the ExpressionSequence, or maybe just separately.
- Allow the Inference object to take in this Input along with the attention weights. Then, if an output word is "unk", use the plain-text string from the Input instead, with the attention weights indicating which source word to take.
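The two steps above can be sketched roughly as follows. This is a standalone illustration, not xnmt code: the function name `replace_unknowns`, the `<unk>` token string, and the layout of `attention` (one row of source weights per output step) are all assumptions made for the example. It also covers the lexicon option from the original request.

```python
from typing import Dict, List, Optional, Sequence

UNK = "<unk>"  # assumed unknown-word symbol; the real one depends on the vocabulary


def replace_unknowns(
    output_words: Sequence[str],
    source_words: Sequence[str],
    attention: Sequence[Sequence[float]],
    lexicon: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Replace each UNK with the most-attended source word.

    attention[t][s] is assumed to be the weight placed on source
    position s when output word t was generated.
    """
    result = []
    for t, word in enumerate(output_words):
        if word != UNK or not source_words:
            # Known word, or nothing to copy from: keep as-is.
            result.append(word)
            continue
        weights = attention[t]
        # Source position with the highest attention weight at step t.
        best_src = max(range(len(source_words)), key=lambda s: weights[s])
        src_word = source_words[best_src]
        if lexicon is not None:
            # Use the lexicon translation if present, else copy the source word.
            result.append(lexicon.get(src_word, src_word))
        else:
            result.append(src_word)
    return result


source = ["ich", "wohne", "in", "Pittsburgh"]
output = ["i", "live", "in", "<unk>"]
attention = [
    [0.80, 0.10, 0.05, 0.05],
    [0.10, 0.80, 0.05, 0.05],
    [0.05, 0.05, 0.80, 0.10],
    [0.05, 0.05, 0.10, 0.80],
]
print(replace_unknowns(output, source, attention))
```

Taking the argmax of the attention row is only one possible heuristic; it is used here because it needs nothing beyond the Input and the attention weights mentioned above.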
There might be a more elegant way, though. Having this would also be very nice, since it would make it easier to implement the copy mechanism: https://github.com/neulab/xnmt/issues/221
I'm not sure which is the most elegant, but the on_start_sent handler also propagates the input sentence, so maybe the Inference object can just capture it this way.
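A rough sketch of that idea, with the caveat that the class below is a hypothetical stand-in and not xnmt's actual Inference class: the `on_start_sent` signature, the `postprocess` method, and the `attended_positions` argument (the index of the most-attended source word for each output word) are assumptions made for illustration.

```python
from typing import List, Optional, Sequence


class UnkReplacingInference:
    """Hypothetical stand-in for an inference object (not the xnmt API)."""

    def __init__(self, unk_token: str = "<unk>") -> None:
        self.unk_token = unk_token
        self.src_words: Optional[List[str]] = None

    def on_start_sent(self, src_words: Sequence[str]) -> None:
        # Assumed to be called once per sentence before decoding starts;
        # we simply remember the plain-text source words.
        self.src_words = list(src_words)

    def postprocess(
        self, output_words: Sequence[str], attended_positions: Sequence[int]
    ) -> List[str]:
        # Swap each unknown token for the captured source word it attended to.
        if self.src_words is None:
            raise RuntimeError("on_start_sent was not called for this sentence")
        return [
            self.src_words[pos] if word == self.unk_token else word
            for word, pos in zip(output_words, attended_positions)
        ]


inference = UnkReplacingInference()
inference.on_start_sent(["ich", "wohne", "in", "Pittsburgh"])
print(inference.postprocess(["i", "live", "in", "<unk>"], [0, 1, 2, 3]))
```

Capturing the sentence in the handler means the Input does not have to be threaded through the ExpressionSequence, at the cost of the inference object holding per-sentence state.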
Yeah, that sounds like a good idea.
(In reply to msperber's comment of Fri, Apr 20, 2018, 5:55 PM.)