PSPNet-tensorflow

Training crashes on Cityscapes

Open kshitijagrwl opened this issue 6 years ago • 2 comments

I'm training PSPNet using the provided train.py script. I've tried running it on a GTX 1080 and a Titan X, and it always crashes after about 500 steps. Log below:

step 590         loss = 0.266, (0.723 sec/step)
Traceback (most recent call last):
  File "train.py", line 219, in <module>
    main()
  File "train.py", line 210, in main
    loss_value, _ = sess.run([reduced_loss, train_op], feed_dict=feed_dict)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 889, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1120, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1317, in _do_run
    options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1336, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: FIFOQueue '_1_create_inputs/batch/fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
         [[Node: create_inputs/batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_UINT8], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](create_inputs/batch/fifo_queue, create_inputs/batch/n)]]

Caused by op u'create_inputs/batch', defined at:
  File "train.py", line 219, in <module>
    main()
  File "train.py", line 121, in main
    image_batch, label_batch = reader.dequeue(args.batch_size)
  File "/home/ml/codes/PSPNet-tensorflow/image_reader.py", line 116, in dequeue
    num_elements)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/input.py", line 927, in batch
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/input.py", line 722, in _batch
    dequeued = queue.dequeue_many(batch_size, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/data_flow_ops.py", line 464, in dequeue_many
    self._queue_ref, n=n, component_types=self._dtypes, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 2418, in _queue_dequeue_many_v2
    component_types=component_types, timeout_ms=timeout_ms, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1470, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

OutOfRangeError (see above for traceback): FIFOQueue '_1_create_inputs/batch/fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
         [[Node: create_inputs/batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_UINT8], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](create_inputs/batch/fifo_queue, create_inputs/batch/n)]]

kshitijagrwl commented Apr 17 '18 09:04

Maybe you need to check your feed data around step 500; I guess its size is not the same as before, which is why the crash occurs.
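A minimal sketch of such a check, assuming the list-file format this repo uses (one "image_path label_path" pair per line, relative to the Cityscapes data directory); DATA_DIR and LIST_PATH below are placeholders for your own setup:

import os

DATA_DIR = "/path/to/cityscapes"              # assumption: your Cityscapes root
LIST_PATH = "list/cityscapes_train_list.txt"  # assumption: the list file you train from

with open(LIST_PATH) as f:
    for lineno, line in enumerate(f, start=1):
        parts = line.split()
        if len(parts) != 2:
            # empty or malformed line: the reader cannot build an image/label pair from it
            print("line %d is empty or malformed: %r" % (lineno, line))
            continue
        for rel_path in parts:
            full_path = os.path.join(DATA_DIR, rel_path.lstrip("/"))
            if not os.path.isfile(full_path):
                print("line %d: missing file %s" % (lineno, full_path))

If this prints anything, the input queue will run dry at some point during training, which matches the OutOfRangeError above.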

Strand2013 commented Sep 25 '18 08:09

I was stuck with the same error too, but after a closer look at the error and some debugging I found the solution. Check the cityscapes_train_list.txt file in the list folder and make sure it does not contain any extra/empty lines. The reader basically tries to take the empty line as an input but cannot find the required image, since no path is given on that line, hence the error "has insufficient elements (requested 1, current size 0)". It is a simple logical error and a limitation of the code. A sketch for cleaning the list file is below. @RaceSu @kshitijagrwl
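For reference, a small sketch of how you could drop blank lines from the list file (the path is the one shipped in this repo's list folder; adjust it if yours differs):

# Rewrite the train list in place, keeping only non-empty lines.
LIST_PATH = "list/cityscapes_train_list.txt"  # adjust to your list file

with open(LIST_PATH) as f:
    lines = [line for line in f if line.strip()]

with open(LIST_PATH, "w") as f:
    f.writelines(lines)

After cleaning the file, restart training so the input queue is rebuilt from the fixed list.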

narendoraiswamy commented Oct 05 '18 13:10