practical-pytorch
Question from character-level RNN classifier: why not carry the hidden state across epochs?
In the RNN classification example, which uses the characters of names to predict each name's language, the train function re-zeros the hidden state (and the gradients) every epoch. I was wondering why this is done instead of carrying over the final hidden state from the epoch before.
In this tutorial, one "epoch" is really one training iteration over a single word (name). Starting a new epoch means training the network on a new word, so the hidden state must be re-initialized before feeding the first letter of that word: the hidden states of different words are independent, and nothing from the previous name should carry over.
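For reference, here is a minimal sketch of that per-name training step. It is not the tutorial's exact code; the sizes (n_letters, n_hidden, n_categories) and the names train_one_name / init_hidden are illustrative, but the structure mirrors the tutorial: a fresh zero hidden state and zeroed gradients at the start of every name.

```python
import torch
import torch.nn as nn

# Illustrative sizes (the tutorial uses 57 letters and 18 language categories).
n_letters, n_hidden, n_categories = 57, 128, 18

class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
        self.i2o = nn.Linear(input_size + hidden_size, output_size)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, letter, hidden):
        combined = torch.cat((letter, hidden), 1)
        return self.softmax(self.i2o(combined)), self.i2h(combined)

    def init_hidden(self):
        # Fresh zero state for the first letter of a new name.
        return torch.zeros(1, self.i2h.out_features)

rnn = RNN(n_letters, n_hidden, n_categories)
criterion = nn.NLLLoss()
learning_rate = 0.005

def train_one_name(category_tensor, line_tensor):
    # Each name is an independent sequence, so no hidden state
    # (and no accumulated gradient) carries over from the previous example.
    hidden = rnn.init_hidden()
    rnn.zero_grad()
    for i in range(line_tensor.size(0)):        # step through the name letter by letter
        output, hidden = rnn(line_tensor[i], hidden)
    loss = criterion(output, category_tensor)   # classify from the final output
    loss.backward()
    for p in rnn.parameters():                  # plain SGD update, as in the tutorial
        p.data.add_(p.grad.data, alpha=-learning_rate)
    return output, loss.item()
```

If the hidden state were carried across names, the prediction for one name would be contaminated by the letters of the previous, unrelated name; resetting it is what makes each example start from the same clean initial condition.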