Derrick
https://arxiv.org/pdf/1901.00884v1.pdf   # 4. Conclusion 1. It is easy to construct examples where the intermediate representations have no match, even when the outputs of the network are identical. 2....
https://arxiv.org/abs/1711.00937 # Abstract - the paper proposes a model (**VQ-VAE**) that learns "discrete representations" - differs from VAEs: the encoder network outputs discrete codes (not continuous) and the prior is learnt rather than static (see the sketch below) - circumvent...
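A minimal NumPy sketch of the vector-quantization step behind the "discrete representations" note above: each continuous encoder output is snapped to its nearest codebook vector. The codebook size and embedding dimension here are illustrative assumptions, not values from the paper.

```python
import numpy as np

def quantize(z_e, codebook):
    """Map each continuous encoder output to its nearest codebook vector.

    z_e:      (batch, dim) continuous encoder outputs
    codebook: (K, dim) learned discrete embedding vectors
    returns:  discrete indices (batch,) and quantized vectors z_q (batch, dim)
    """
    # squared L2 distance from every encoder output to every codebook entry
    dists = ((z_e[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    indices = dists.argmin(axis=1)   # the discrete code for each input
    z_q = codebook[indices]          # the decoder sees these vectors
    return indices, z_q

# illustrative shapes: 4 encoder outputs, codebook of K=8 vectors, dim=16
rng = np.random.default_rng(0)
z_e = rng.normal(size=(4, 16))
codebook = rng.normal(size=(8, 16))
idx, z_q = quantize(z_e, codebook)
print(idx)   # the "discrete representation", e.g. four codebook indices
```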
# Abstract - quantum probability theory - quantum-inspired LMs have two limitations: they do not take into account the interaction among words with multiple meanings, and they lack a theoretical foundation accounting for...
https://arxiv.org/abs/1806.05635 # Abstract - `SIL (Self-Imitation Learning)` verifies that exploiting past good experiences can indirectly drive deep exploration (sketched below) - competitive with the state-of-the-art # 1. Introduction - Atari "Montezuma's Revenge" -...
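A rough sketch of the self-imitation idea as summarized above: a stored transition contributes to the update only when its observed return exceeds the current value estimate. The function names and the `beta` coefficient are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def sil_losses(log_pi_a, value, returns, beta=0.01):
    """Imitate a past action only when its observed return R exceeds
    the current value estimate V(s); otherwise it contributes nothing."""
    advantage = np.maximum(returns - value, 0.0)   # (R - V(s))_+, zero for "bad" transitions
    policy_loss = -(log_pi_a * advantage).mean()   # push the policy toward good past actions
    value_loss = 0.5 * (advantage ** 2).mean()     # pull V(s) up toward the good return
    return policy_loss + beta * value_loss

# toy numbers: two of the four stored transitions beat the current value estimate
print(sil_losses(np.log([0.2, 0.5, 0.1, 0.7]),
                 value=np.array([1.0, 2.0, 0.5, 3.0]),
                 returns=np.array([2.0, 1.5, 1.5, 3.0])))
```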
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.66.5667&rep=rep1&type=pdf # 1. Introduction - In the text domain there is an excessive number of features - To control sparsity, a frequency threshold to cut off features was traditionally used (see the sketch below) - This paper suggests that...
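A small sketch of the traditional frequency-threshold cut-off the note refers to: keep only terms that occur in enough documents. The threshold value and toy corpus are arbitrary assumptions for illustration.

```python
from collections import Counter

def cutoff_features(documents, min_count=3):
    """Keep only terms whose document frequency reaches min_count,
    the traditional way of controlling feature sparsity in text."""
    df = Counter(term for doc in documents for term in set(doc.split()))
    return {term for term, count in df.items() if count >= min_count}

docs = ["the cat sat", "the dog ran", "the cat ran", "a cat sat", "the dog sat"]
print(cutoff_features(docs))   # {'the', 'cat', 'sat'} for this toy corpus
```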
https://arxiv.org/abs/1609.04938 # 1. Abstract - the model is end-to-end - it combines a convolutional network and a recurrent network - current models achieve 25% accuracy, but the paper's model achieves 75% accuracy #...
https://arxiv.org/abs/1806.02847 google brain # Abstract - the paper's model outperforms previous state-of-the-art methods on the [Winograd Schema Challenge](https://en.wikipedia.org/wiki/Winograd_Schema_Challenge) - the model uses a large ***RNN Language Model*** # 1. Introduction - previous models...
https://arxiv.org/abs/1806.01822 Paper from DeepMind # Abstract - the paper's model is built on intuition, so it is unclear why it works well - shows state-of-the-art results on WikiText-103, Project Gutenberg, and...
https://arxiv.org/abs/1612.08083 # Abstract - proposes a gating mechanism (sketched below) - evaluated on [WikiText-103](https://einstein.ai/research/the-wikitext-long-term-dependency-language-modeling-dataset) and [Google Billion Words](https://github.com/ciprian-chelba/1-billion-word-language-modeling-benchmark) - the proposed model is very competitive with strong recurrent models on large-scale language tasks #...
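A minimal sketch of the gated linear unit the note refers to, h(X) = (X·W + b) ⊗ σ(X·V + c): a sigmoid gate decides how much of each linear feature passes through. In the paper the gate is applied to convolutional outputs; the sketch below shows the gate itself on plain linear projections, with illustrative shapes.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def glu(x, W, b, V, c):
    """Gated linear unit: elementwise product of a linear projection
    and a sigmoid gate computed from a second projection."""
    return (x @ W + b) * sigmoid(x @ V + c)

# illustrative shapes: batch of 2, input dim 8, output dim 4
rng = np.random.default_rng(0)
x = rng.normal(size=(2, 8))
W, V = rng.normal(size=(8, 4)), rng.normal(size=(8, 4))
b, c = np.zeros(4), np.zeros(4)
print(glu(x, W, b, V, c).shape)   # (2, 4)
```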
https://arxiv.org/abs/1801.10198 published as a conference paper at ICLR 2018 # Abstract - the model uses a decoder-only architecture modified from the Transformer (causal attention sketched below) - evaluation uses perplexity, ROUGE, and human evaluations #...
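A minimal sketch of what "decoder-only" means in the note above: a single causal self-attention head in which each position may attend only to itself and earlier positions. This is a generic illustration, not the paper's exact architecture; the sequence length and dimensions are arbitrary assumptions.

```python
import numpy as np

def causal_self_attention(x, Wq, Wk, Wv):
    """One causal self-attention head: future positions are masked out,
    which is what makes the model decoder-only."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    mask = np.triu(np.ones_like(scores), k=1).astype(bool)   # strictly future positions
    scores[mask] = -1e9                                       # block attention to them
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)            # row-wise softmax
    return weights @ v

# illustrative shapes: sequence of 5 tokens, model dim 16, head dim 8
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 16))
Wq, Wk, Wv = (rng.normal(size=(16, 8)) for _ in range(3))
print(causal_self_attention(x, Wq, Wk, Wv).shape)   # (5, 8)
```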