ValueError in train.py

Open JaneLou opened this issue 7 years ago • 5 comments

    work@lab-server03:~/ljz/show-attend-and-tell-master$ python train.py
    image_idxs <type 'numpy.ndarray'> (399998,) int32
    file_names <type 'numpy.ndarray'> (82783,) <U55
    word_to_idx <type 'dict'> 23110
    features <type 'numpy.ndarray'> (82783, 196, 512) float32
    captions <type 'numpy.ndarray'> (399998, 17) int32
    Elapse time: 198.26
    image_idxs <type 'numpy.ndarray'> (19589,) int32
    file_names <type 'numpy.ndarray'> (4052,) <U51
    features <type 'numpy.ndarray'> (4052, 196, 512) float32
    captions <type 'numpy.ndarray'> (19589, 17) int32
    Elapse time: 3.67
    Traceback (most recent call last):
      File "train.py", line 25, in <module>
        main()
      File "train.py", line 22, in main
        solver.train()
      File "/home/work/ljz/show-attend-and-tell-master/core/solver.py", line 86, in train
        train_op = optimizer.apply_gradients(grads_and_vars=grads_and_vars)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/optimizer.py", line 412, in apply_gradients
        self._create_slots(var_list)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/adam.py", line 119, in _create_slots
        self._zeros_slot(v, "m", self._name)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/optimizer.py", line 656, in _zeros_slot
        named_slots[var] = slot_creator.create_zeros_slot(var, op_name)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/slot_creator.py", line 123, in create_zeros_slot
        colocate_with_primary=colocate_with_primary)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/slot_creator.py", line 101, in create_slot
        return _create_slot_var(primary, val, '')
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/slot_creator.py", line 55, in _create_slot_var
        slot = variable_scope.get_variable(scope, initializer=val, trainable=False)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 988, in get_variable
        custom_getter=custom_getter)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 890, in get_variable
        custom_getter=custom_getter)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 348, in get_variable
        validate_shape=validate_shape)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 333, in _true_getter
        caching_device=caching_device, validate_shape=validate_shape)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 657, in _get_single_variable
        "VarScope?" % name)
    ValueError: Variable conv_featuresbatch_norm/beta/Adam/ does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope?

JaneLou avatar Apr 22 '17 03:04 JaneLou

Did you change the line "self.optimizer = tf.train.AdamOptimizer" in __init__?

arieling avatar Apr 23 '17 01:04 arieling

@arieling No, I didn't. Since my TensorFlow version is 1.1.0, I only replaced some basic functions. Now the error is as follows:

    work@lab-server03:~/ljz/show-attend-and-tell-master$ python train.py
    image_idxs <type 'numpy.ndarray'> (399998,) int32
    file_names <type 'numpy.ndarray'> (82783,) <U55
    word_to_idx <type 'dict'> 23110
    features <type 'numpy.ndarray'> (82783, 196, 512) float32
    captions <type 'numpy.ndarray'> (399998, 17) int32
    Elapse time: 13.43
    image_idxs <type 'numpy.ndarray'> (19589,) int32
    file_names <type 'numpy.ndarray'> (4052,) <U51
    features <type 'numpy.ndarray'> (4052, 196, 512) float32
    captions <type 'numpy.ndarray'> (19589, 17) int32
    Elapse time: 0.65
    Traceback (most recent call last):
      File "train.py", line 25, in <module>
        main()
      File "train.py", line 22, in main
        solver.train()
      File "/home/work/ljz/show-attend-and-tell-master/core/solver.py", line 81, in train
        _, _, generated_captions = self.model.build_sampler(max_len=20)
      File "/home/work/ljz/show-attend-and-tell-master/core/model.py", line 216, in build_sampler
        _, (c, h) = lstm_cell(inputs=tf.concat(axis=1, values=[x, context]), state=[c, h])
      File "/home/work/.local/lib/python2.7/site-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py", line 235, in __call__
        with _checked_scope(self, scope or "basic_lstm_cell", reuse=self._reuse):
      File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
        return self.gen.next()
      File "/home/work/.local/lib/python2.7/site-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py", line 93, in _checked_scope
        "the argument reuse=True." % (scope_name, type(cell).__name__))
    ValueError: Attempt to have a second RNNCell use the weights of a variable scope that already has weights: 'lstm/basic_lstm_cell'; and the cell was not constructed as BasicLSTMCell(..., reuse=True). To share the weights of an RNNCell, simply reuse it in your second calculation, or create a new one with the argument reuse=True.
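
The error message itself points at the TF 1.x fix: when the sampler builds a second BasicLSTMCell over the same 'lstm' variable scope, that cell has to be constructed with reuse=True so it shares the existing weights instead of trying to create them again. A minimal standalone sketch of that pattern, not this repo's code; the hidden size, batch size, and placeholder input below are illustrative:

    import tensorflow as tf  # TF 1.x contrib API

    H = 512     # hidden size, illustrative only
    batch = 32  # batch size, illustrative only
    x = tf.placeholder(tf.float32, [batch, H])

    with tf.variable_scope('lstm'):
        # First use (training graph): creates the LSTM weights.
        train_cell = tf.contrib.rnn.BasicLSTMCell(num_units=H)
        state = train_cell.zero_state(batch, tf.float32)
        _, state = train_cell(x, state)

    with tf.variable_scope('lstm'):
        # Second use (sampling graph): reuse=True makes this cell share
        # the weights created above instead of creating new ones.
        sample_cell = tf.contrib.rnn.BasicLSTMCell(num_units=H, reuse=True)
        _, state = sample_cell(x, state)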

JaneLou avatar Apr 23 '17 04:04 JaneLou

@JaneLou You should use TensorFlow 0.11 to run the code in this repo.

yunjey avatar Apr 23 '17 10:04 yunjey

In core/solver.py, try changing

        tf.get_variable_scope().reuse_variables()
        _, _, generated_captions = self.model.build_sampler(max_len=20)

        with tf.name_scope('optimizer'):
            optimizer = self.optimizer(learning_rate=self.learning_rate)
            grads = tf.gradients(loss, tf.trainable_variables())
            grads_and_vars = list(zip(grads, tf.trainable_variables()))
            train_op = optimizer.apply_gradients(grads_and_vars=grads_and_vars)

to

        with tf.variable_scope(tf.get_variable_scope()) as scope:
            with tf.name_scope('optimizer'):
                tf.get_variable_scope().reuse_variables()
                _, _, generated_captions = self.model.build_sampler(max_len=20)
                optimizer = self.optimizer(learning_rate=self.learning_rate)
                grads = tf.gradients(loss, tf.trainable_variables())
                grads_and_vars = list(zip(grads, tf.trainable_variables()))
        train_op = optimizer.apply_gradients(grads_and_vars=grads_and_vars)

This can help fix the error "ValueError: Variable conv_featuresbatch_norm/beta/Adam/ does not exist". The original code calls tf.get_variable_scope().reuse_variables() on the top-level scope, so when apply_gradients later tries to create the Adam slot variables with tf.get_variable(), the reused scope refuses to create them. Wrapping the sampler build in its own variable scope confines the reuse flag, and apply_gradients stays outside that scope so the slot variables can still be created. I am testing it with TensorFlow 1.0.0.

jiecaoyu avatar May 10 '17 14:05 jiecaoyu

@jiecaoyu Thanks a lot! I finally referred to the file core/new_solve.py from https://github.com/chychen/caption_generation_with_visual_attention and it works for me!

JaneLou avatar May 11 '17 01:05 JaneLou