awd-lstm-lm
Does bptt length affect test perplexity?
The PyTorch LM example code and this paper (https://openreview.net/forum?id=ByJHuTgA-) both use a bptt of 35, but this repo uses 70. Does the bptt length affect test perplexity?
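
To make the question concrete, here is a minimal sketch of what I understand bptt to control: the length of the chunks the token stream is cut into, and so how many steps the hidden state is carried within one forward/backward pass. It is loosely modeled on the `get_batch` helper in the PyTorch example, but uses plain Python lists; `count_chunks` and the 141-token toy stream are my own assumptions for illustration, not code from either repo.

```python
def get_batch(source, i, bptt):
    """Return (data, target) for the chunk starting at position i.

    The target is the input shifted by one token; the last chunk may be
    shorter than bptt if the stream runs out.
    """
    seq_len = min(bptt, len(source) - 1 - i)
    data = source[i:i + seq_len]
    target = source[i + 1:i + 1 + seq_len]
    return data, target


def count_chunks(n_tokens, bptt):
    """Hypothetical helper: how many chunks one pass over the stream yields."""
    return len(range(0, n_tokens - 1, bptt))


# Toy stream of 141 token ids (an assumption, just to have numbers to compare).
tokens = list(range(141))

chunks_35 = count_chunks(len(tokens), 35)
chunks_70 = count_chunks(len(tokens), 70)

data, target = get_batch(tokens, 0, 35)
```

With bptt 35 the toy stream splits into twice as many chunks as with bptt 70, so each chunk sees half as much context inside a single pass. Whether that difference shows up in test perplexity is what I am asking.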