
Question: minibatch data is not contiguous?

Open akitakeuchi opened this issue 9 years ago • 2 comments

Hi,

Thank you for the great contribution. The program works fine with the tinyshakespeare dataset and other datasets; however, part of the "train.py" code looks quite strange to me. Lines 87-91:

```python
for i in xrange(jump * n_epochs):
    x_batch = np.array([train_data[(jump * j + i) % whole_len]
                        for j in xrange(batchsize)])
    y_batch = np.array([train_data[(jump * j + i + 1) % whole_len]
                        for j in xrange(batchsize)])
```

While "train_data" is the source character sequence, "x_batch" seems to consist of characters drawn from separate positions, that is, positions spaced "jump" characters apart. To train an RNN, the internal state must be carried over to the next input, but this minibatch construction seems to violate that input continuity. I would appreciate it if you could explain why the code works fine. Thanks.

akitakeuchi avatar Nov 09 '15 08:11 akitakeuchi

As far as I understand, a minibatch should in general contain independent examples (so that the minibatch gradient is a good estimate of the full gradient). In an RNN, consecutive characters are not independent, but if we take the minibatch entries from far-apart positions in the text, we get a good approximation of independence. So the minibatch acts like a rake whose teeth are separated by the "jump" value, and which is moved forward one character at a time. Each tooth still reads its own stretch of text contiguously from step to step, so the hidden state carried along each row of the batch remains consistent.
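To make this concrete, here is a small standalone sketch of the "rake": a toy sequence of 20 encoded characters with hypothetical values for `batchsize` and `jump` (the variable names mirror the snippet from train.py, but the numbers are made up for illustration). Row `j` of the batch always reads position `jump * j + i`, so each row walks through its own contiguous slice of the text while the rows stay `jump` characters apart.

```python
import numpy as np

# Toy stand-ins (hypothetical values, not the repo's defaults).
train_data = np.arange(20)      # pretend these are 20 encoded characters
batchsize = 4
whole_len = len(train_data)
jump = whole_len // batchsize   # teeth of the rake start jump chars apart -> 5

# At step i, tooth j reads position (jump * j + i); the target is the next char.
for i in range(2):              # just the first two steps, for illustration
    x_batch = np.array([train_data[(jump * j + i) % whole_len]
                        for j in range(batchsize)])
    y_batch = np.array([train_data[(jump * j + i + 1) % whole_len]
                        for j in range(batchsize)])
    print(x_batch, y_batch)
# step i=0: x = [0 5 10 15], y = [1 6 11 16]
# step i=1: x = [1 6 11 16], y = [2 7 12 17]
```

Note that from step i to step i+1, every row advances by exactly one character, so within each row the character stream (and hence the hidden state) stays contiguous; only the rows of the batch come from distant parts of the text.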

benob avatar Nov 13 '15 14:11 benob

Thank you for the comment. I got the point.

akitakeuchi avatar Nov 15 '15 14:11 akitakeuchi