When and Why are Pre-trained Word Embeddings Useful for Neural Machine Translation?
Metadata
- Authors: Ye Qi, Devendra Singh Sachan, Matthieu Felix, Sarguna Janani Padmanabhan and Graham Neubig
- Organization: Language Technologies Institute, Carnegie Mellon University
- Release Date: 2018, on arXiv
- Link: https://arxiv.org/pdf/1804.06323.pdf
Summary
- Pre-training the word embeddings in the source and/or target language increases BLEU scores (a minimal initialization sketch follows this list).
- Pre-training the source-language embeddings yields the larger improvement, indicating that better encoding of the source sentence is particularly important.
- Pre-trained embeddings are most effective where there is very little training data, but not so little that the system cannot be trained at all.
- The gain from pre-trained embeddings tends to be larger when the source and target languages are more similar.
- A priori alignment of embeddings may not be necessary in bilingual scenarios, but is helpful in multilingual training scenarios (see the alignment sketch below).
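As a concrete illustration of the first point, here is a minimal PyTorch sketch of initializing an NMT embedding layer from pre-trained word vectors (e.g., fastText vectors in word2vec text format). The file name, vocabulary, and dimensions are illustrative assumptions, not the authors' exact setup.

```python
# Minimal sketch: seed an NMT embedding layer with pre-trained vectors.
# "fasttext.vec", the toy vocab, and dim=300 are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn

def load_pretrained(path, vocab, dim):
    """Build an init matrix: pre-trained vector if available, else random."""
    emb = np.random.normal(scale=0.1, size=(len(vocab), dim)).astype("float32")
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            word, vec = parts[0], parts[1:]
            # Skips malformed lines and the word2vec header, which has the
            # wrong number of fields.
            if word in vocab and len(vec) == dim:
                emb[vocab[word]] = np.asarray(vec, dtype="float32")
    return torch.from_numpy(emb)

# Hypothetical vocabulary mapping (word -> index) and embedding file.
vocab = {"<unk>": 0, "<s>": 1, "</s>": 2, "house": 3, "dog": 4}
weights = load_pretrained("fasttext.vec", vocab, dim=300)

# freeze=False keeps the embeddings trainable, so the NMT objective can
# still fine-tune them on the parallel data.
embedding = nn.Embedding.from_pretrained(weights, freeze=False)
```

Leaving the embeddings trainable is the usual choice when the pre-training corpus differs in domain from the parallel data; freezing them preserves the pre-trained space but forgoes task-specific adaptation.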
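On the last point, "a priori alignment" means mapping monolingual embedding spaces into a shared space before NMT training, for instance with the orthogonal Procrustes solution over a seed dictionary of translation pairs. A hedged sketch, with random matrices standing in for real dictionary vectors:

```python
# Minimal sketch: align two monolingual embedding spaces with the
# orthogonal Procrustes solution. X and Y hold the vectors of seed
# translation pairs; the toy data below is an illustrative assumption.
import numpy as np

def procrustes_align(X, Y):
    """Return the orthogonal W minimizing ||X @ W - Y||_F."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))   # source-language vectors of 5 seed pairs
Y = rng.normal(size=(5, 4))   # target-language vectors of the same pairs
W = procrustes_align(X, Y)
aligned_X = X @ W             # source space mapped into the target space
```

In the bilingual case the NMT model can learn such a mapping implicitly from the parallel data, which is consistent with the finding that explicit alignment pays off mainly in multilingual setups.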