Feature Request: Unknown Word Replacement

Open neubig opened this issue 8 years ago • 4 comments

It would be nice to have options for unknown word replacement, either with the original word or using a lexicon.

neubig, Jun 27 '17 01:06

I'm interested in helping with this. How would I go about doing it?

pmichel31415, Apr 20 '18 15:04

We probably need to:

  1. Save the original Input object somewhere, maybe as additional information in the ExpressionSequence, or maybe just separately?
  2. Allow the Inference object to take in this Input together with the attention weights. Then, whenever an output token is "unk", use the plain-text string from the Input instead.
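
The two steps above could look roughly like the sketch below. It is a standalone illustration, not xnmt code: the function name `replace_unknowns`, the `<unk>` symbol, the `attention[t][s]` layout (one row per output token, one weight per source token), and the optional `lexicon` dictionary are all assumptions made for the example. It picks the source word with the highest attention weight for each unknown output token, then either copies it or looks it up in the lexicon.

```python
from typing import Dict, List, Optional, Sequence

UNK = "<unk>"  # assumed unknown-word symbol; the real one may differ


def replace_unknowns(
    output_tokens: Sequence[str],
    source_tokens: Sequence[str],
    attention: Sequence[Sequence[float]],
    lexicon: Optional[Dict[str, str]] = None,
    unk: str = UNK,
) -> List[str]:
    """Replace each unknown output token using the most-attended source word.

    attention[t][s] is assumed to be the weight that output position t
    places on source position s.
    """
    result: List[str] = []
    for t, token in enumerate(output_tokens):
        if token != unk or not source_tokens:
            # Known token, or nothing to copy from: keep as-is.
            result.append(token)
            continue
        weights = attention[t]
        # Index of the source word with the highest attention weight.
        src_pos = max(range(len(source_tokens)), key=lambda s: weights[s])
        src_word = source_tokens[src_pos]
        if lexicon is not None and src_word in lexicon:
            # Lexicon-based replacement.
            result.append(lexicon[src_word])
        else:
            # Fall back to copying the original source word.
            result.append(src_word)
    return result


source = ["ich", "wohne", "in", "Pittsburgh"]
output = ["i", "live", "in", "<unk>"]
attention = [
    [0.90, 0.05, 0.03, 0.02],
    [0.10, 0.80, 0.05, 0.05],
    [0.05, 0.05, 0.85, 0.05],
    [0.02, 0.03, 0.05, 0.90],
]
print(replace_unknowns(output, source, attention))
```

Passing a `lexicon` covers the "using a lexicon" option from the original request; leaving it out gives plain copying of the source word.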

There might be a more elegant way, though? Having this would also be very nice, as it would make it easier to implement the copy mechanism: https://github.com/neulab/xnmt/issues/221

neubig, Apr 20 '18 16:04

I'm not sure which is the most elegant, but the on_start_sent handler also propagates the input sentence, so maybe the inference object can just capture it that way.
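
As a rough sketch of that idea: the class below is hypothetical and self-contained, not xnmt's actual Inference class or event system. It only assumes that something calls `on_start_sent` with the source sentence before decoding, and that the sentence is available as a list of plain-text tokens. The `postprocess` method and the `attention[t][s]` layout are likewise assumptions for illustration.

```python
from typing import List, Optional, Sequence


class UnkReplacingInference:
    """Hypothetical inference helper that remembers the current source sentence."""

    def __init__(self, unk: str = "<unk>") -> None:
        self.unk = unk  # assumed unknown-word symbol
        self.current_src: Optional[List[str]] = None

    def on_start_sent(self, src: Sequence[str]) -> None:
        # Assumed to be called once per sentence, before decoding starts.
        self.current_src = list(src)

    def postprocess(
        self,
        output_tokens: Sequence[str],
        attention: Sequence[Sequence[float]],
    ) -> List[str]:
        # attention[t][s]: weight of output position t on source position s.
        if self.current_src is None:
            raise RuntimeError("on_start_sent must be called before postprocess")
        src = self.current_src
        result: List[str] = []
        for t, token in enumerate(output_tokens):
            if token == self.unk and src:
                # Copy the source word with the highest attention weight.
                best = max(range(len(src)), key=lambda s: attention[t][s])
                result.append(src[best])
            else:
                result.append(token)
        return result


inference = UnkReplacingInference()
inference.on_start_sent(["hello", "Pittsburgh"])
print(inference.postprocess(["bonjour", "<unk>"], [[0.9, 0.1], [0.2, 0.8]]))
```

Capturing the sentence in the handler means the decoding call itself does not need an extra argument for the input, at the cost of the object holding per-sentence state.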

msperber, Apr 20 '18 21:04

Yeah, that sounds like a good idea.

neubig, Apr 20 '18 22:04