OpenNMT-py [WIP] The Missing Ingredient in Zero-Shot Neural Machine Translation

Open francoishernandez opened this issue 5 years ago • 2 comments

This PR intends to add an implementation of the cosine similarity alignment loss, introduced as a regularization term in "The Missing Ingredient in Zero-Shot Neural Machine Translation".
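For readers unfamiliar with the idea, here is a minimal, framework-free sketch of a cosine similarity alignment loss used as a regularization term. It is not the code from this PR: the function names, the max-pooling over time steps, and the `lambda_align` weight are assumptions made for illustration, and a real implementation would use batched tensor operations rather than Python lists.

```python
import math


def max_pool(states):
    """Element-wise max over time steps.

    `states` is a list of equal-length vectors, one per token.
    Pooling over time is an assumption of this sketch.
    """
    return [max(dim) for dim in zip(*states)]


def cosine_similarity(u, v, eps=1e-8):
    """Cosine similarity of two vectors, guarded against zero norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / max(norm_u * norm_v, eps)


def alignment_loss(src_states, tgt_states):
    """Cosine distance between pooled source and target encodings."""
    return 1.0 - cosine_similarity(max_pool(src_states), max_pool(tgt_states))


def total_loss(nll, src_states, tgt_states, lambda_align=1.0):
    """Translation loss plus the weighted alignment regularizer.

    `lambda_align` is a hypothetical name for the regularization weight.
    """
    return nll + lambda_align * alignment_loss(src_states, tgt_states)
```

The loss is close to zero when the pooled source and target encodings point in the same direction and grows as they diverge, which is what pushes the encoder toward language-independent representations.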

Jan 30 '20 17:01 francoishernandez

Note: the impact on speed is significant. Because we need to reduce batch sizes to make room in memory for the additional representations, we can lose up to 20-25% in training speed, in both FP32 and FP16 modes.

Feb 07 '20 17:02 francoishernandez

For the record, we discussed offline whether this should live in the code of NMTModel or Trainer. For performance reasons it needs to be in NMTModel (both src and tgt are encoded in its forward pass), but this makes the API a little less "clear". We opted for performance, and we kept the API intact when the new loss is not used.
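To illustrate the trade-off described here, the following toy sketch shows one way a model's forward pass could encode both src and tgt while leaving the return value unchanged when the extra loss is disabled. Everything in it is hypothetical: `ToyModel`, the `with_alignment_loss` flag, and the return shapes are illustrative stand-ins, not the actual NMTModel code or signature from this PR.

```python
class ToyModel:
    """Hypothetical encoder-decoder wrapper; not OpenNMT-py's NMTModel."""

    def __init__(self, encoder, decoder):
        self.encoder = encoder
        self.decoder = decoder

    def forward(self, src, tgt, with_alignment_loss=False):
        src_enc = self.encoder(src)
        dec_out = self.decoder(tgt, src_enc)
        if not with_alignment_loss:
            # Default path: callers get the same return value as before.
            return dec_out
        # Extra encoder pass over the target side. This is the additional
        # work and memory that the alignment loss requires.
        tgt_enc = self.encoder(tgt)
        return dec_out, src_enc, tgt_enc
```

Keeping the second encoder pass inside the model avoids re-running the encoder from the training loop, at the cost of a return value that depends on a flag.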

Feb 17 '20 17:02 vince62s
