Kyle Gao

Results 6 issues of Kyle Gao

[Pointer networks](https://arxiv.org/pdf/1506.03134.pdf) and the models presented in [this paper](https://arxiv.org/abs/1511.06391) are useful for combinatorial problems, e.g. reversing a sequence.
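To make the reversal example concrete: a pointer network emits indices into its input rather than tokens from a fixed vocabulary, so the training target for reversal is just the input positions in descending order. Below is a minimal sketch of a toy data generator for that task. The function name `make_reverse_example` and its parameters are hypothetical and not part of any existing library.

```python
import random


def make_reverse_example(length, vocab_size=10, rng=random):
    """Build one toy example for the sequence-reversal task.

    Hypothetical helper for illustration only.
    Returns (source, pointers, target), where `pointers` are the input
    positions a pointer-style decoder would be trained to emit.
    """
    source = [rng.randrange(vocab_size) for _ in range(length)]
    # For reversal, the decoder should point at the last position first.
    pointers = list(range(length - 1, -1, -1))
    # Following the pointers recovers the reversed sequence.
    target = [source[i] for i in pointers]
    return source, pointers, target


source, pointers, target = make_reverse_example(5)
```

Because the targets are positions, the same setup works for any input length without changing the output layer.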

I sometimes notice that training without any teacher forcing gives better results at inference time than using teacher forcing all the time. [This paper](https://papers.nips.cc/paper/5956-scheduled-sampling-for-sequence-prediction-with-recurrent-neural-networks.pdf) provides evidence for this behavior...

enhancement
medium priority
feature
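
One way to move between the two extremes in the teacher forcing issue above is scheduled sampling: start with mostly gold inputs and gradually feed the decoder its own predictions. A minimal sketch, assuming the inverse sigmoid decay schedule from the linked paper. The function names and the default `k` are illustrative assumptions, not an existing API.

```python
import math
import random


def teacher_forcing_ratio(step, k=1000.0):
    """Inverse sigmoid decay: k / (k + exp(step / k)).

    Close to 1.0 at the start of training and tending to 0.0 later.
    `k` (assumed default) controls how quickly the ratio decays.
    """
    # Clamp the exponent so very large step counts do not overflow.
    x = min(step / k, 700.0)
    return k / (k + math.exp(x))


def choose_decoder_input(gold_token, predicted_token, ratio, rng=random):
    """Feed the gold token with probability `ratio`, else the model's prediction."""
    return gold_token if rng.random() < ratio else predicted_token
```

Inside a training loop, `teacher_forcing_ratio(step)` would be recomputed each batch and `choose_decoder_input` applied at every decoding step.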

Benchmark on the WMT machine translation dataset so that the library's performance can be evaluated and compared with other implementations.

enhancement
high priority
contributions welcome

Research has shown that adversarial loss is more effective than MLE training; consider developing an adversarial trainer. https://arxiv.org/abs/1704.06933 https://arxiv.org/abs/1703.04887

enhancement
contributions welcome

As configuring an experiment becomes more complicated with more features, it would be easier to read experiment configurations from a file and build the experiment.
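
A minimal sketch of what file-based configuration could look like, assuming a JSON file and a dataclass of settings. Every field name and default here is hypothetical and chosen for illustration; none of them reflects the library's actual options.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class ExperimentConfig:
    # Hypothetical settings; real experiments would define their own.
    hidden_size: int = 128
    n_layers: int = 1
    bidirectional: bool = False
    teacher_forcing_ratio: float = 0.5
    epochs: int = 6


def load_config(text):
    """Parse JSON text into an ExperimentConfig, rejecting unknown keys."""
    raw = json.loads(text)
    known = {f.name for f in fields(ExperimentConfig)}
    unknown = set(raw) - known
    if unknown:
        raise ValueError(f"unknown config keys: {sorted(unknown)}")
    return ExperimentConfig(**raw)


# In practice the text would come from a file, e.g. open(path).read().
config = load_config('{"hidden_size": 256, "bidirectional": true}')
```

Rejecting unknown keys catches typos in the config file early, and keys that are omitted fall back to the dataclass defaults.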

This paper discusses and evaluates several regularization and optimization methods and gives ablations for each technique. It'd be interesting to try some of these techniques on seq2seq. https://arxiv.org/pdf/1708.02182.pdf