understanding-ai
Dual Learning for Machine Translation
https://arxiv.org/abs/1611.00179 paper from USTC, PKU, Microsoft Research (NIPS 2016)
Summary
Model
- Prepare two language models LM_{A,B}, each pre-trained on monolingual data for its language (En, Fr; WMT'14), that output the log probability of a sentence
- Two translation models P(·|s; Θ_AB) and P(·|s; Θ_BA) are needed, one per direction
- Feed each translation model's sampled output to the other language's LM, and use the resulting score as a reward to train with policy gradient
- Swap the two languages' roles and continue training until the models converge
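One round of the two-agent game above can be sketched as follows. This is a minimal sketch, not the paper's implementation: `sample_ab`, `score_ba`, and `lm_b` are hypothetical stand-ins for the translation models P(·|s; Θ_AB), P(·|s; Θ_BA) and the language model LM_B, and `alpha` is an assumed weight interpolating the LM (fluency) reward with the reconstruction reward before the policy-gradient update.

```python
ALPHA = 0.5  # assumed weight between LM reward and reconstruction reward


def dual_step(s_a, sample_ab, score_ba, lm_b, alpha=ALPHA):
    """One round of the dual-learning game, starting from a sentence s_a
    in language A.

    sample_ab(s_a)        -> (s_mid, log P(s_mid | s_a; Theta_AB))
    lm_b(s_mid)           -> log probability of s_mid under LM_B (reward r1)
    score_ba(s_a, s_mid)  -> log P(s_a | s_mid; Theta_BA)      (reward r2)
    """
    s_mid, logp_ab = sample_ab(s_a)      # agent A translates into language B
    r1 = lm_b(s_mid)                     # fluency reward from LM_B
    r2 = score_ba(s_a, s_mid)            # reconstruction reward from the reverse model
    reward = alpha * r1 + (1 - alpha) * r2
    # Policy-gradient signal (REINFORCE, updates omitted here):
    #   grad_AB ~ reward * grad(logp_ab),  grad_BA ~ (1 - alpha) * grad(r2)
    return reward, s_mid
```

The same function is then run with the languages swapped, so both translation models improve from monolingual data alone.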
Abstract
- dual-NMT learns from monolingual data through a two-agent reinforcement learning process
- it performs very well in the paper's experiments
1. Introduction
- Parallel data are costly to obtain in machine translation (MT)
- Two approaches using monolingual data had been proposed before this paper:
  - Train a language model on monolingual data, then integrate it with a translation model trained on parallel bilingual data
  - Generate pseudo sentence pairs (an unreliable method):
    - Train a translation model on aligned parallel corpora
    - Use it to generate pseudo bilingual sentence pairs from monolingual data
    - Feed the pseudo pairs into a subsequent learning process
- Dual learning mechanism
- two agent communication game
- feedback based
2. Background: Neural Machine Translation
- Reviews the standard attention-based encoder-decoder NMT model
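The attention mechanism in such models computes, at each decoding step, a softmax over alignment scores between the decoder state and every encoder state, then takes the weighted sum of encoder states as a context vector. A minimal numeric sketch (using a plain dot-product score for simplicity; the paper's background section describes the standard learned-alignment variant):

```python
import math


def attention_context(decoder_state, encoder_states):
    """Attention sketch: score each encoder hidden state against the
    current decoder state (dot product here), softmax the scores, and
    return the attention-weighted context vector plus the weights."""
    scores = [sum(d * h for d, h in zip(decoder_state, enc))
              for enc in encoder_states]
    m = max(scores)                          # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]          # attention distribution over source
    dim = len(encoder_states[0])
    context = [sum(w * enc[k] for w, enc in zip(weights, encoder_states))
               for k in range(dim)]
    return context, weights
```

The decoder consumes `context` together with its own state to predict the next target word.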
3. Dual Learning from Neural Machine Translation
5. Discussions
- Dual learning is generally applicable to many paired tasks (the dual structure already exists in):
- Speech recognition vs. text-to-speech
- Image captioning vs. image generation
- Question answering vs. question generation
- Search vs. keyword extraction
- etc.
- Not restricted to two tasks; can be generalized as closed-loop learning
- Not only a pair of languages: a tuple of three or more languages' monolingual data can also be used