Jean Senellart

Results 16 issues of Jean Senellart

I had a problem when handling a grid with a fairly small height (showing only ~10 rows) and scrolling vertically very fast through the list (I am using a touchpad): in...

option to quantize model weights to INT16 (short), which halves the t7 file size.
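
A minimal sketch of what INT16 quantization can look like, in plain Python rather than the project's own code. The symmetric scheme with a single shared scale, and the names `quantize_int16` and `dequantize_int16`, are assumptions made for illustration; the issue does not say which scheme is used. It also shows where the factor of 2 comes from when the original weights are stored as 32-bit floats.

```python
from array import array


def quantize_int16(weights):
    """Map floats to int16 using one shared scale (symmetric scheme, assumed)."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Largest magnitude maps to 32767; fall back to 1.0 for all-zero input.
    scale = max_abs / 32767 if max_abs > 0 else 1.0
    quantized = array(
        "h",  # signed 16-bit integers
        (max(-32767, min(32767, int(round(w / scale)))) for w in weights),
    )
    return quantized, scale


def dequantize_int16(quantized, scale):
    """Recover approximate float values from the int16 codes."""
    return [v * scale for v in quantized]


weights = [0.5, -1.25, 0.0, 0.75]
q, scale = quantize_int16(weights)
restored = dequantize_int16(q, scale)

# Storage comparison: 4 bytes per float32 value versus 2 bytes per int16 value.
float_bytes = len(array("f", weights).tobytes())
int16_bytes = len(q.tobytes())
```

The rounding error per weight is bounded by roughly half of `scale`, so the trade-off depends on how wide the weight range is.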

implement label smoothing as defined in [Szegedy, 2015](https://arxiv.org/pdf/1512.00567.pdf) (uniform distribution)
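
For reference, label smoothing with a uniform prior replaces the one-hot target with a mix of the one-hot vector and a uniform distribution over the K classes: each class gets epsilon / K, and the true class gets an extra 1 - epsilon. The sketch below is plain Python for illustration; the function name `smooth_labels` and the default `epsilon=0.1` are assumptions, not taken from the issue.

```python
def smooth_labels(target_index, num_classes, epsilon=0.1):
    """Return the smoothed target distribution for one example.

    q'(k) = (1 - epsilon) * [k == target] + epsilon / num_classes
    """
    uniform_mass = epsilon / num_classes
    dist = [uniform_mass] * num_classes
    dist[target_index] += 1.0 - epsilon
    return dist


# Example: 4 classes, true class is index 2.
smoothed = smooth_labels(target_index=2, num_classes=4, epsilon=0.1)
```

The resulting vector still sums to 1, so it can be used directly as the target of a cross-entropy or KL-divergence loss.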

In the [Attention Is All You Need](https://arxiv.org/pdf/1706.03762.pdf) paper, several concepts are introduced that can fit in our current attention module:

* So-called "Scaled Dot-Product Attention" - improving `dot` model (option `-global_attention...
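
As a rough illustration of the scaled dot-product idea from the paper, the sketch below computes attention for a single query vector: dot-product scores are divided by the square root of the key dimension before the softmax. It is plain Python, not the project's attention module, and the function name and list-based inputs are assumptions chosen for readability.

```python
import math


def scaled_dot_product_attention(query, keys, values):
    """Attention for one query: softmax(q . k / sqrt(d_k)) weighted sum of values."""
    d_k = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d_k)
        for key in keys
    ]
    # Numerically stable softmax over the scores.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: attention-weighted average of the value vectors.
    dim = len(values[0])
    context = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return context, weights


# Two identical keys receive equal attention weight.
context, attn = scaled_dot_product_attention(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [1.0, 0.0]],
    values=[[1.0, 2.0], [3.0, 4.0]],
)
```

Without the scaling term this reduces to plain dot-product attention; the scaling keeps the scores from growing with the vector dimension.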

variational dropout as described in [Gal et al., 2016](https://arxiv.org/pdf/1512.05287.pdf) does not give the expected results for NMT. Adding a new mode `variational_non_recurrent` for further exploration.
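
For context, the core idea of variational dropout in the cited paper is that one dropout mask is sampled per sequence and reused at every time step, instead of sampling a fresh mask at each step. The sketch below shows only that shared-mask idea in plain Python; it does not reproduce the `variational_non_recurrent` mode, whose exact behavior is not described here. The function names and the inverted-dropout scaling are assumptions.

```python
import random


def sample_mask(size, drop_prob, rng):
    """Sample one inverted-dropout mask: kept units are scaled by 1 / keep_prob."""
    keep_prob = 1.0 - drop_prob  # assumes drop_prob < 1
    return [1.0 / keep_prob if rng.random() < keep_prob else 0.0 for _ in range(size)]


def variational_dropout(sequence, drop_prob, rng):
    """Apply the same mask to every time step of a sequence of vectors."""
    mask = sample_mask(len(sequence[0]), drop_prob, rng)
    dropped = [[x * m for x, m in zip(step, mask)] for step in sequence]
    return dropped, mask


rng = random.Random(0)
sequence = [[1.0] * 8 for _ in range(3)]  # 3 time steps, 8 units each
dropped, mask = variational_dropout(sequence, drop_prob=0.5, rng=rng)
```

Because the mask is shared, a unit that is dropped at the first time step stays dropped for the whole sequence.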

add local attention models and monotonic attention

better memory optimization using the computation graph, implement first vertical memory sharing

experiment with coverage models:

- [Temporal Attention Model for Neural Machine Translation](https://arxiv.org/pdf/1608.02927.pdf)
- [Modeling Coverage for Neural Machine Translation](http://www.aclweb.org/anthology/P16-1008)