keras-image-captioning

After first epoch of training, it raises ValueError

Open HankyuJang opened this issue 6 years ago • 5 comments

I am trying to run the code on a GPU, using an AWS EC2 machine that has CUDA 9 preinstalled. I made two small changes in inference.py to account for a different version of Keras:

(1) pickle_safe=False -> use_multiprocessing=True (2) max_q_size -> max_queue_size
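For reference, the two renames above can be written as a small keyword mapping. This is only a sketch: `RENAMED_KWARGS` and `to_keras2_kwargs` are hypothetical names, not part of Keras or of this repository, and the sketch assumes the rename changes the keyword's name only, not the meaning of its value. Under that assumption, `pickle_safe=False` would correspond to `use_multiprocessing=False`, whereas change (1) also flips the value to `True`.

```python
# Older generator keyword -> newer keyword.
# Assumption: only the names changed; the values keep their meaning.
RENAMED_KWARGS = {
    "pickle_safe": "use_multiprocessing",
    "max_q_size": "max_queue_size",
}


def to_keras2_kwargs(kwargs):
    """Return a copy of kwargs with the legacy generator keywords renamed."""
    return {RENAMED_KWARGS.get(key, key): value for key, value in kwargs.items()}


# Hypothetical legacy call arguments, for illustration only.
legacy = {"pickle_safe": False, "max_q_size": 10, "workers": 1}
print(to_keras2_kwargs(legacy))
```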

Training seems to run fine for the first epoch, but then it raises a ValueError. I am not sure what to do here. Do you have any suggestions?

2018-04-13 00:52:18 | Training model8 is starting..
/home/ubuntu/keras-image-captioning/keras_image_captioning/training.py:109: UserWarning: Update your `fit_generator` call to the Keras 2 API: `fit_generator(verbose=1, generator=<generator..., workers=1, validation_data=<generator..., steps_per_epoch=938, epochs=33, callbacks=[<keras_im..., max_queue_size=10, validation_steps=157)`
  verbose=self._verbose)
Epoch 1/33
938/938 [==============================] - 523s 557ms/step - loss: 2.5325 - categorical_accuracy_wvt: 0.2447 - val_loss: 2.1476 - val_categorical_accuracy_wvt: 0.3012
  0%|     
Traceback (most recent call last):
  File "/home/ubuntu/anaconda2/envs/tensorflow-2.7/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/home/ubuntu/anaconda2/envs/tensorflow-2.7/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/ubuntu/keras-image-captioning/keras_image_captioning/training.py", line 283, in <module>
    fire.Fire(main)
  File "/home/ubuntu/anaconda2/envs/tensorflow-2.7/lib/python2.7/site-packages/fire/core.py", line 127, in Fire
    component_trace = _Fire(component, args, context, name)
  File "/home/ubuntu/anaconda2/envs/tensorflow-2.7/lib/python2.7/site-packages/fire/core.py", line 366, in _Fire
    component, remaining_args)
  File "/home/ubuntu/anaconda2/envs/tensorflow-2.7/lib/python2.7/site-packages/fire/core.py", line 542, in _CallCallable
    result = fn(*varargs, **kwargs)
  File "/home/ubuntu/keras-image-captioning/keras_image_captioning/training.py", line 276, in main
    training.run()
  File "/home/ubuntu/keras-image-captioning/keras_image_captioning/training.py", line 109, in run
    verbose=self._verbose)
  File "/home/ubuntu/anaconda2/envs/tensorflow-2.7/lib/python2.7/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/ubuntu/anaconda2/envs/tensorflow-2.7/lib/python2.7/site-packages/keras/engine/training.py", line 2262, in fit_generator
    callbacks.on_epoch_end(epoch, epoch_logs)
  File "/home/ubuntu/anaconda2/envs/tensorflow-2.7/lib/python2.7/site-packages/keras/callbacks.py", line 77, in on_epoch_end
    callback.on_epoch_end(epoch, logs)
  File "keras_image_captioning/callbacks.py", line 55, in on_epoch_end
    in self._inference.evaluate_training_set().items()})
  File "keras_image_captioning/inference.py", line 48, in evaluate_training_set
    return self._evaluate(self.predict_training_set(include_datum=True),
  File "keras_image_captioning/inference.py", line 35, in predict_training_set
    include_datum)
  File "keras_image_captioning/inference.py", line 79, in _predict
    X, y, datum_batch = generator_output
ValueError: need more than 2 values to unpack
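The traceback ends where `_predict` unpacks each generator batch into three names (`X, y, datum_batch`), and the message says the batch held only two elements. A guard along the following lines would at least make that mismatch explicit. It is a minimal sketch: `unpack_batch` is a hypothetical helper that does not exist in the repository, and it reports the problem rather than fixing whatever causes the generator to yield two elements.

```python
def unpack_batch(generator_output):
    """Split one generator batch into (X, y, datum_batch).

    Raises a descriptive ValueError when the batch does not have
    exactly three elements.
    """
    try:
        size = len(generator_output)
    except TypeError:
        # The object has no length at all (e.g. None).
        size = None
    if size != 3:
        raise ValueError(
            "expected 3 elements (X, y, datum_batch), got %r" % (size,))
    X, y, datum_batch = generator_output
    return X, y, datum_batch
```

Calling `unpack_batch(generator_output)` in place of the bare tuple assignment would keep the same behaviour for well-formed batches.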

HankyuJang, Apr 13 '18 01:04

Have you installed the exact version of each library listed in https://github.com/danieljl/keras-image-captioning/blob/master/requirements.txt? Keras is known to break APIs even in minor version updates.

EDIT: Oh, you have a different version of Keras. I suggest installing exactly the versions of Keras and the other dependencies pinned in requirements.txt.
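One way to check for version drift is to compare the `name==version` pins with what is actually installed. The sketch below is an illustration, not part of the repository: `parse_pins`, `find_mismatches` and `check_requirements` are hypothetical names, only `==` pins are handled, and `check_requirements` relies on `importlib.metadata`, which needs Python 3.8 or later (the traceback above shows Python 2.7, where comparing against `pip freeze` output would be the alternative).

```python
def parse_pins(lines):
    """Collect 'name==version' pins from requirements-style lines."""
    pins = {}
    for line in lines:
        # Drop comments and environment markers, then surrounding whitespace.
        line = line.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins, installed_version):
    """Return {name: (pinned, installed)} for every pin that differs.

    installed_version is a callable mapping a package name to its
    installed version string, or None when it is not installed.
    """
    mismatches = {}
    for name, wanted in pins.items():
        found = installed_version(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


def check_requirements(path="requirements.txt"):
    """Compare the pins in a requirements file with the current environment."""
    from importlib import metadata  # Python 3.8+

    def lookup(name):
        try:
            return metadata.version(name)
        except metadata.PackageNotFoundError:
            return None

    with open(path) as handle:
        return find_mismatches(parse_pins(handle), lookup)
```

An empty result from `check_requirements()` would mean every pinned package is installed at the pinned version.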

danieljl, Apr 13 '18 10:04

I see. One thing to confirm: the code runs on the CPU by default, right? I installed the exact version of each library listed in requirements.txt, but I found that it installs tensorflow, not tensorflow-gpu. Each training epoch seemed to take a very long time, so I was trying to run it on the GPU.

HankyuJang, Apr 13 '18 14:04

The code can also run on a GPU. Just uninstall tensorflow and install the same version of tensorflow-gpu.

requirements.txt lists tensorflow rather than tensorflow-gpu because I built TensorFlow from source.
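After swapping the packages, it can help to confirm that TensorFlow actually sees a GPU. The following is a minimal sketch, assuming a TensorFlow 1.x install where `tensorflow.python.client.device_lib` is importable; `gpu_names` and `local_gpu_names` are hypothetical helper names.

```python
def gpu_names(devices):
    """Return the names of the devices whose device_type is 'GPU'."""
    return [device.name for device in devices if device.device_type == "GPU"]


def local_gpu_names():
    """List the GPUs visible to TensorFlow on this machine.

    Imports TensorFlow lazily so this module can be loaded without it.
    """
    from tensorflow.python.client import device_lib
    return gpu_names(device_lib.list_local_devices())
```

If `local_gpu_names()` returns an empty list, training will fall back to the CPU even with tensorflow-gpu installed.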

danieljl, Apr 14 '18 02:04

I am facing the same issue. requirements.txt installs tensorflow 1.1.0. When I try to install the same version of tensorflow-gpu, as you suggested, I get this error: DistributionNotFound: No matching distribution found for 1.1.0.

Is there any workaround for running this code on a GPU?

aakashgupta96, May 15 '18 20:05

Have you found any way to resolve this issue?

mememimis, Sep 22 '18 21:09