visual-qa
trainLSTM_1.py unreshapeable error
The environment has been tested by running get_started.sh; however, when I run python trainLSTM_1.py, something goes wrong:
Training started...
Traceback (most recent call last):
File "trainLSTM_1.py", line 126, in
The traceback says something is wrong with the reshape... can anyone help me?
I also got this error log... have you fixed it?
I think I've figured out what's going on.
Line 66 in trainLSTM_1.py:
language_model.add(LSTM(output_dim = args.num_hidden_units_lstm, return_sequences=False, input_shape=(max_len, word_vec_dim)))
The LSTM is defined with an input shape whose first dimension is max_len, which has the value 30, so the number of LSTM timesteps is fixed at 30. Keep this in mind.
line 112:
X_q_batch = get_questions_tensor_timeseries(qu_batch, nlp, timesteps)
This time, timesteps is a varying number, since different batches may contain questions of different lengths. This does not match the LSTM's fixed input shape, which causes the reshape error.
If you want to understand it thoroughly, check the definition of get_questions_tensor_timeseries in features.py; I think you'll see what I mean.
Correct me if I'm wrong!
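To make the mismatch concrete, here is a minimal numpy sketch of the failing reshape. The values 30 and 300 are assumed (max_len = 30 as above, and 300-dimensional word vectors); the batch size of 10 is arbitrary.

```python
import numpy as np

max_len, word_vec_dim = 30, 300  # assumed: max_len = 30; 300-d word vectors
batch_size = 10                  # arbitrary illustrative batch size

# A batch whose longest question has exactly max_len tokens fits the LSTM's
# fixed input_shape=(max_len, word_vec_dim):
flat = np.zeros(batch_size * max_len * word_vec_dim)
ok = flat.reshape(batch_size, max_len, word_vec_dim)  # works

# But get_questions_tensor_timeseries sizes the tensor by the longest question
# in *this* batch. If that is only 12 tokens, the data cannot be viewed as
# (batch_size, max_len, word_vec_dim) -- the element counts differ:
short = np.zeros((batch_size, 12, word_vec_dim))
try:
    short.reshape(batch_size, max_len, word_vec_dim)
except ValueError as e:
    print("reshape failed:", e)
```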
Has a mechanism to fix this been found?
Based on @NateLol's description, I modified line 108 to set timesteps equal to 30. Training appears to be working now.
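For anyone else hitting this: pinning timesteps to 30 works because every question's word-vector sequence then gets zero-padded to the same fixed length, matching the LSTM's input shape. A sketch of that padding logic is below; pad_to_timesteps is a hypothetical helper for illustration, not a function from the repo.

```python
import numpy as np

def pad_to_timesteps(batch_vectors, timesteps, word_vec_dim):
    """Zero-pad each question's word-vector sequence to a fixed length.

    batch_vectors: list of (n_words, word_vec_dim) arrays with n_words <= timesteps.
    Returns an array of shape (batch_size, timesteps, word_vec_dim), which
    matches an LSTM compiled with input_shape=(timesteps, word_vec_dim).
    """
    out = np.zeros((len(batch_vectors), timesteps, word_vec_dim))
    for i, vecs in enumerate(batch_vectors):
        out[i, :len(vecs), :] = vecs  # remaining rows stay zero (padding)
    return out

# Example: questions of 5 and 12 words, padded to 30 timesteps of 300-d vectors.
batch = [np.ones((5, 300)), np.ones((12, 300))]
print(pad_to_timesteps(batch, 30, 300).shape)  # (2, 30, 300)
```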
Hey @aniemerg,
Coming back to this code after a long time, I may have a better solution! Change the language_model definition so the LSTM() parameters become LSTM(output_dim = args.num_hidden_units_lstm, return_sequences=False, input_dim=word_vec_dim). Instead of compiling the model with a fixed number of timesteps, specify only its input_dim rather than input_shape.
That way, sequences of varying length can be processed! However, the training error does not seem to decrease over time. Any thoughts on that?
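The reason specifying only input_dim is enough: an LSTM's weight matrices depend only on the input dimension and the hidden size, never on the number of timesteps, so the same weights can unroll over sequences of any length. The sketch below is a bare numpy illustration of this (not the Keras internals); word_vec_dim = 300 and num_hidden = 64 are illustrative values, not necessarily the repo's defaults.

```python
import numpy as np

word_vec_dim, num_hidden = 300, 64  # illustrative sizes

rng = np.random.default_rng(0)
W = rng.standard_normal((word_vec_dim, 4 * num_hidden)) * 0.01  # input weights
U = rng.standard_normal((num_hidden, 4 * num_hidden)) * 0.01    # recurrent weights
b = np.zeros(4 * num_hidden)                                    # biases

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def run_lstm(x):
    """Run the LSTM over x of shape (timesteps, word_vec_dim); return last h.

    Note: no weight shape mentions the number of timesteps, so x may have
    any length -- exactly why input_dim alone suffices to build the layer.
    """
    h = np.zeros(num_hidden)
    c = np.zeros(num_hidden)
    for x_t in x:
        z = x_t @ W + h @ U + b            # all four gates at once
        i, f, o, g = np.split(z, 4)        # input, forget, output, candidate
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h

# The same weights handle a 12-step and a 30-step question alike.
print(run_lstm(rng.standard_normal((12, word_vec_dim))).shape)  # (64,)
print(run_lstm(rng.standard_normal((30, word_vec_dim))).shape)  # (64,)
```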