char-rnn.pytorch
Implementation change question
What is the difference between applying the loss function after each step of the RNN and applying it once over the entire sequence? In your example you feed only one character at a time and the results are good. I tried feeding the entire sequence of 300 characters at once and the results were very bad. In my mind the two approaches are basically the same, so I can't figure out where the difference comes from. Any hint would help. Thanks a lot, and sorry for posting this as an issue, but I couldn't find the answer anywhere else.
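To make the comparison concrete, here is a minimal sketch of both approaches side by side. It is not the repository's code: the `CharRNN` class, the sizes, and the random data are all assumptions made for illustration. Under these assumptions the two losses and their gradients agree up to floating-point error, which suggests that a large gap in results more likely comes from implementation details than from the choice itself. Plausible places to check (guesses, not a diagnosis) are how the logits and targets are reshaped before the loss, whether the per-step losses are summed or averaged (a sum gives a gradient `seq_len` times larger than a mean), and how the hidden state is passed along.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes chosen for illustration only.
vocab_size, hidden_size, seq_len = 20, 16, 300


class CharRNN(nn.Module):
    """Illustrative character model: embedding -> GRU -> linear decoder."""

    def __init__(self, vocab_size, hidden_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, x, hidden=None):
        # x: (batch, steps) tensor of character indices
        output, hidden = self.gru(self.embed(x), hidden)
        return self.out(output), hidden  # logits: (batch, steps, vocab)


model = CharRNN(vocab_size, hidden_size)
criterion = nn.CrossEntropyLoss()  # default reduction is "mean"

# Random stand-in for a text chunk; targets are the inputs shifted by one.
chunk = torch.randint(0, vocab_size, (1, seq_len + 1))
inputs, targets = chunk[:, :-1], chunk[:, 1:]

# Approach A: feed one character at a time, carry the hidden state forward,
# and accumulate the per-step loss.
hidden = None
loss_a = torch.zeros(())
for t in range(seq_len):
    logits, hidden = model(inputs[:, t:t + 1], hidden)
    loss_a = loss_a + criterion(logits[:, 0, :], targets[:, t])
loss_a = loss_a / seq_len  # average over steps to match the mean below

# Approach B: feed the whole sequence at once and apply the loss once.
# CrossEntropyLoss expects (N, classes) logits and (N,) targets, so both
# tensors are flattened over the batch and time dimensions.
logits, _ = model(inputs)
loss_b = criterion(logits.reshape(-1, vocab_size), targets.reshape(-1))

# Gradients of each loss with respect to the same parameters.
params = list(model.parameters())
grads_a = torch.autograd.grad(loss_a, params)
grads_b = torch.autograd.grad(loss_b, params)

losses_match = torch.allclose(loss_a, loss_b, atol=1e-4)
grads_match = all(
    torch.allclose(ga, gb, atol=1e-4) for ga, gb in zip(grads_a, grads_b)
)
print(losses_match, grads_match)
```

If Approach A skipped the division by `seq_len` before the backward pass, its gradient would be `seq_len` times larger than Approach B's, so the same learning rate would behave very differently between the two versions.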