
Question about the character-level RNN classifier: why not carry the hidden state across epochs?

Open · labJunky opened this issue Nov 27 '19 · 1 comment

In the RNN classification example, which uses the characters of a name to predict that name's language, the train function re-zeros the hidden state (and gradients) every epoch. I was wondering why this is done instead of carrying over the final hidden state from the epoch before?

labJunky commented Nov 27 '19

In this example, one "epoch" means a run through a single word (one name). Starting a new epoch means training the network on a new word, so the hidden state must be re-initialized before the first letter of that word: the hidden states of different words are independent, and carrying the old state over would leak information from an unrelated name into the new prediction.
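For concreteness, here is a minimal sketch of a per-example training step in the style of the tutorial. The `RNN` class, the `initHidden()` helper, the use of an optimizer, and the tensor shapes are assumptions for illustration, not the tutorial's exact code.

```python
import torch
import torch.nn as nn

# Sketch only: a simple char-level RNN classifier and a per-name train step.
# Names and shapes (n_letters one-hot inputs, batch size 1) are assumed.

class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.hidden_size = hidden_size
        self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
        self.i2o = nn.Linear(input_size + hidden_size, output_size)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, letter, hidden):
        combined = torch.cat((letter, hidden), dim=1)
        hidden = self.i2h(combined)
        output = self.softmax(self.i2o(combined))
        return output, hidden

    def initHidden(self):
        # Fresh (zeroed) hidden state for the start of a new name.
        return torch.zeros(1, self.hidden_size)

def train_step(rnn, criterion, optimizer, name_tensor, category_tensor):
    # One call processes one whole name, so the hidden state is
    # re-initialized here instead of being carried over from the
    # previous name.
    hidden = rnn.initHidden()
    optimizer.zero_grad()
    for i in range(name_tensor.size(0)):        # loop over the letters
        output, hidden = rnn(name_tensor[i], hidden)
    loss = criterion(output, category_tensor)   # loss on the final output
    loss.backward()
    optimizer.step()
    return output, loss.item()
```

Within a single name the hidden state is carried from letter to letter, which is where the recurrence does its work; it is only reset at the boundary between names, because nothing about one name should condition the classification of the next.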

ZhouXing19 commented Apr 02 '20