Error when running

Open daquang opened this issue 10 years ago • 5 comments

I get the following error when I run the IMDB example:

Traceback (most recent call last):
  File "imdb_birnn.py", line 77, in <module>
    model.add(BatchNormalization((24 * maxseqlen,)))
  File "/home/dxquang/anaconda/lib/python2.7/site-packages/keras/layers/containers.py", line 40, in add
    layer.init_updates()
  File "/home/dxquang/anaconda/lib/python2.7/site-packages/keras/layers/normalization.py", line 38, in init_updates
    X = self.get_input(train=True)
  File "/home/dxquang/anaconda/lib/python2.7/site-packages/keras/layers/core.py", line 43, in get_input
    return self.previous.get_output(train=train)
  File "/home/dxquang/anaconda/lib/python2.7/site-packages/keras/layers/core.py", line 296, in get_output
    X = self.get_input(train)
  File "/home/dxquang/anaconda/lib/python2.7/site-packages/keras/layers/core.py", line 43, in get_input
    return self.previous.get_output(train=train)
  File "/home/dxquang/bidirectional_RNN/birnn.py", line 187, in get_output
    forward = self.get_forward_output(train)
  File "/home/dxquang/bidirectional_RNN/birnn.py", line 143, in get_forward_output
    X = X.dimshuffle((1,0,2))
  File "/home/dxquang/anaconda/lib/python2.7/site-packages/theano/tensor/var.py", line 341, in dimshuffle
    pattern)
  File "/home/dxquang/anaconda/lib/python2.7/site-packages/theano/tensor/elemwise.py", line 141, in __init__
    (i, j, len(input_broadcastable)))
ValueError: new_order[2] is 2, but the input only has 2 axes.

daquang avatar Sep 25 '15 07:09 daquang

I see: the default of return_sequences for the bidirectional RNN is set to False. To fix that, I replaced

model.add(BiDirectionLSTM(word_vec_len, 50, output_mode='concat'))

with

model.add(BiDirectionLSTM(word_vec_len, 50, output_mode='concat'), return_sequences=True)

https://github.com/hycis/bidirectional_RNN/blob/master/imdb_birnn.py#L72
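
For anyone else hitting this traceback, here is a minimal NumPy analogue (a sketch, not the actual Theano/Keras code) of why the dimshuffle call fails when the previous layer returns a 2D tensor:

import numpy as np

# With return_sequences=False, the first BiDirectionLSTM emits a 2D tensor
# of shape (samples, features) instead of the 3D tensor
# (samples, timesteps, features) that the next stacked layer expects.
X_2d = np.zeros((32, 100))

# birnn.py's get_forward_output does X.dimshuffle((1, 0, 2)), a transpose
# over three axes; applied to a 2-axis tensor, it fails just like this:
try:
    np.transpose(X_2d, (1, 0, 2))
except ValueError as err:
    # NumPy's analogue of Theano's
    # "new_order[2] is 2, but the input only has 2 axes."
    print(err)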

hycis avatar Sep 25 '15 08:09 hycis

I believe you still have some errors. In your newest version, you have these lines:

model.add(BiDirectionLSTM(word_vec_len, 50, output_mode='concat'), return_sequences=True)
model.add(BiDirectionLSTM(100, 24, output_mode='sum'), return_sequences=True)

After changing these two lines as follows, the code works as intended:

model.add(BiDirectionLSTM(word_vec_len, 50, output_mode='concat', return_sequences=True))
model.add(BiDirectionLSTM(100, 24, output_mode='sum', return_sequences=True))
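
For context, a sketch of how these layers could sit in the example script. The surrounding layers and the hyperparameter values here are assumptions, apart from the BatchNormalization((24 * maxseqlen,)) line, which appears verbatim in the traceback above:

from keras.models import Sequential
from keras.layers.core import Flatten
from keras.layers.embeddings import Embedding
from keras.layers.normalization import BatchNormalization
from birnn import BiDirectionLSTM  # from this repository

max_features, word_vec_len, maxseqlen = 20000, 128, 100  # assumed example values

model = Sequential()
model.add(Embedding(max_features, word_vec_len))
model.add(BiDirectionLSTM(word_vec_len, 50, output_mode='concat', return_sequences=True))
model.add(BiDirectionLSTM(100, 24, output_mode='sum', return_sequences=True))
model.add(Flatten())  # (samples, maxseqlen, 24) -> (samples, 24 * maxseqlen)
model.add(BatchNormalization((24 * maxseqlen,)))

With return_sequences=True on both bidirectional layers, the output stays 3D all the way to Flatten, which is exactly what the BatchNormalization input shape requires.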

daquang avatar Sep 25 '15 22:09 daquang

Yep, you are right. Thanks for pointing that out.

hycis avatar Sep 26 '15 02:09 hycis

I am new to bidirectional LSTM, sorry if this is too trivial.

I have a doubt about the following lines:

# --- Stacked up BiDirectionLSTM layers ---
model.add(BiDirectionLSTM(word_vec_len, 50, output_mode='concat', return_sequences=True))
model.add(BiDirectionLSTM(100, 24, output_mode='sum', return_sequences=True))

If this is a stacked LSTM, shouldn't the output size of the first layer (50) equal the input size of the second layer (100)? It would be nice if you could help me understand that part.

shwetgarg avatar Oct 19 '15 15:10 shwetgarg

@shwetgarg Because it's bidirectional, there is one output from the forward pass and one from the backward pass. We can either 'concat' the two outputs, which gives twice the vector length, or simply 'sum' them, which keeps the same length. For the first LSTM I use output_mode='concat', which is why the size doubles from 50 to 100.
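
A small NumPy sketch of the shape arithmetic (toy shapes chosen purely for illustration):

import numpy as np

# Toy outputs of one bidirectional layer: 2 sequences, 5 timesteps,
# hidden size 50 per direction.
forward = np.zeros((2, 5, 50))   # forward-pass output
backward = np.zeros((2, 5, 50))  # backward-pass output

concat = np.concatenate([forward, backward], axis=-1)
summed = forward + backward

print(concat.shape)  # (2, 5, 100) -> the next layer must take input size 100
print(summed.shape)  # (2, 5, 50)  -> the next layer input size stays at 50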

hycis avatar Oct 21 '15 02:10 hycis