pytorch-dilated-rnn

Question about Parameter-size

Open kayuksel opened this issue 5 years ago • 0 comments

Hello! Thank you very much for sharing this work. In Table 2 of the paper, the GRU and the 5-layer Dilated GRU have the same parameter size. Can you please help me understand how this is possible? Would they also have the same GPU memory requirement, despite the significantly higher number of layers in the dilated version?
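
To make the question concrete, this is how I am comparing parameter counts on my side. It is a minimal sketch assuming the `DRNN` class from `drnn.py` in this repo; I may have the constructor arguments slightly wrong:

```python
import torch.nn as nn
from drnn import DRNN  # assuming the class name and module from this repo

def count_params(model: nn.Module) -> int:
    """Total number of trainable parameters."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

hidden = 256
gru = nn.GRU(input_size=hidden, hidden_size=hidden, num_layers=5)
dgru = DRNN(n_input=hidden, n_hidden=hidden, n_layers=5, cell_type='GRU')

# My expectation: dilation only changes which time steps each layer
# connects to, not the weight shapes, so the counts should match when
# the layer count and hidden size match.
print(count_params(gru), count_params(dgru))
```
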

I would also like to ask for some advice on whether a Dilated GRU would also be helpful when the sequence length is not very long, e.g. shorter than 32. Is there a way to pick the maximum number of layers to experiment with based on the sequence length? Lastly, how can I most easily modify your code so that it becomes stateless? I sketch what I mean below. Thanks again!
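
On the layer count, my current guess is that, since the dilations double per layer, layers beyond roughly log2(sequence length) would have dilations longer than the sequence itself, but I am not sure. By "stateless" I mean never carrying hidden state across forward calls. Below is a minimal sketch of the wrapper I have in mind, assuming `DRNN.forward` accepts an optional hidden state and returns `(output, hidden)` like `torch.nn.GRU` does:

```python
import math
import torch.nn as nn
from drnn import DRNN  # assuming the class name and module from this repo

class StatelessDRNN(nn.Module):
    """Wrapper that starts every forward pass from a fresh zero state."""
    def __init__(self, n_input, n_hidden, seq_len):
        super().__init__()
        # My guess for a layer-count upper bound: dilations double per
        # layer, so cap the depth at log2 of the sequence length.
        n_layers = max(1, int(math.log2(seq_len)))
        self.drnn = DRNN(n_input, n_hidden, n_layers, cell_type='GRU')

    def forward(self, x):
        # Pass no hidden state so each call starts from zeros, and
        # discard the returned state instead of storing it.
        out, _ = self.drnn(x)
        return out
```
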

kayuksel · Dec 04 '19 17:12