seq2seq

Parallel reading problem

Open Zeina-T opened this issue 7 years ago • 5 comments

I'm trying to run the small NMT model using the toy_reverse data. I have run all the unit tests (including the TensorFlow parallel reader test) and everything looks OK. However, when I run training I get the following error (I tried different models and different data and got the same error). Setup: TF v1.0, CUDA 8.0, Ubuntu 16.04.

INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.FailedPreconditionError'>, / [[Node: train_input_fn/parallel_read/ReaderReadV2 = ReaderReadV2[_device="/job:localhost/replica:0/task:0/cpu:0"](train_input_fn/parallel_read/TextLineReaderV2, train_input_fn/parallel_read/filenames)]]

Caused by op u'train_input_fn/parallel_read/ReaderReadV2', defined at:
  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/zeina/tensorflow/grad/seq2seq/bin/train.py", line 277, in <module>
    tf.app.run()
  File "/home/zeina/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "/home/zeina/tensorflow/grad/seq2seq/bin/train.py", line 272, in main
    schedule=FLAGS.schedule)
  File "/home/zeina/tensorflow/local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/learn_runner.py", line 111, in run
    return _execute_schedule(experiment, schedule)
  File "/home/zeina/tensorflow/local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/learn_runner.py", line 46, in _execute_schedule
    return task()
  File "seq2seq/contrib/experiment.py", line 104, in continuous_train_and_eval
    monitors=self._train_monitors)
  File "/home/zeina/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 281, in new_func
    return func(*args, **kwargs)
  File "/home/zeina/tensorflow/local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 430, in fit
    loss = self._train_model(input_fn=input_fn, hooks=hooks)
  File "/home/zeina/tensorflow/local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 925, in _train_model
    features, labels = input_fn()
  File "seq2seq/training/utils.py", line 260, in input_fn
    data_provider = pipeline.make_data_provider()
  File "seq2seq/data/input_pipeline.py", line 180, in make_data_provider
    **kwargs)
  File "seq2seq/data/parallel_data_provider.py", line 125, in __init__
    seed=seed)
  File "/home/zeina/tensorflow/local/lib/python2.7/site-packages/tensorflow/contrib/slim/python/slim/data/parallel_reader.py", line 234, in parallel_read
    reader_kwargs=reader_kwargs).read(filename_queue)
  File "/home/zeina/tensorflow/local/lib/python2.7/site-packages/tensorflow/contrib/slim/python/slim/data/parallel_reader.py", line 132, in read
    enqueue_ops.append(self._common_queue.enqueue(reader.read(queue)))
  File "/home/zeina/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/ops/io_ops.py", line 193, in read
    return gen_io_ops._reader_read_v2(self._reader_ref, queue_ref, name=name)
  File "/home/zeina/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_io_ops.py", line 411, in _reader_read_v2
    queue_handle=queue_handle, name=name)
  File "/home/zeina/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 768, in apply_op
    op_def=op_def)
  File "/home/zeina/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2336, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/zeina/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1228, in __init__
    self._traceback = _extract_stack()

FailedPreconditionError (see above for traceback): / [[Node: train_input_fn/parallel_read/ReaderReadV2 = ReaderReadV2[_device="/job:localhost/replica:0/task:0/cpu:0"](train_input_fn/parallel_read/TextLineReaderV2, train_input_fn/parallel_read/filenames)]]

I actually get the same error several times in a single run; only the node names at the end change. I hope someone can point me to what may be causing this.

Zeina-T, May 06 '17 12:05

I am getting the same error. Did you find a solution to this?

p-singh, Jul 05 '17 16:07

No. I even tried it on an AWS machine with TensorFlow pre-installed, and it didn't work there either!

Zeina-T, Jul 09 '17 17:07

I wonder whether there is a mistake in this part of your command:

  --input_pipeline_train "
    class: ParallelTextInputPipeline
    params:
      source_files:
        - $TRAIN_SOURCES
      target_files:
        - $TRAIN_TARGETS" \
  --input_pipeline_dev "
    class: ParallelTextInputPipeline
    params:
      source_files:
        - $DEV_SOURCES
      target_files:
        - $DEV_TARGETS"

Be careful not to use tab characters (\t) here, since YAML indentation must be made of spaces. Also check the - before $TRAIN_SOURCES: it may be mistyped.
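For anyone who wants to check this mechanically, here is a minimal sketch in Python that scans the YAML text passed to --input_pipeline_train for the two problems mentioned above: tabs in the indentation and a list marker that is not a plain ASCII hyphen. It uses only the standard library. The function name check_pipeline_yaml and the sample file paths are made up for illustration and are not part of seq2seq.

```python
import unicodedata


def check_pipeline_yaml(text):
    """Return a list of (line_number, message) for suspicious lines.

    Hypothetical helper, not part of seq2seq: it only looks for tabs in
    the indentation and for malformed list markers.
    """
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Leading whitespace of the line (spaces and/or tabs).
        indent = line[:len(line) - len(line.lstrip())]
        if "\t" in indent:
            problems.append((lineno, "tab character in indentation"))
        stripped = line.strip()
        if not stripped:
            continue
        first = stripped[0]
        # Unicode category "Pd" is dash punctuation (hyphen, en dash, ...).
        if first != "-" and unicodedata.category(first) == "Pd":
            problems.append(
                (lineno, "list marker is U+%04X, not ASCII '-'" % ord(first)))
        elif first == "-" and len(stripped) > 1 and stripped[1] not in " -":
            problems.append((lineno, "no space after '-'"))
    return problems


# Sample inputs; the file paths are placeholders.
good = (
    "class: ParallelTextInputPipeline\n"
    "params:\n"
    "  source_files:\n"
    "    - train/sources.txt\n"
)
bad = (
    "params:\n"
    "\tsource_files:\n"               # tab used for indentation
    "    \u2013 train/sources.txt\n"  # en dash instead of '-'
)

for lineno, message in check_pipeline_yaml(bad):
    print("line %d: %s" % (lineno, message))
```

Running the check on the string after the shell has expanded $TRAIN_SOURCES (for example by echoing it into a file first) is probably the most useful, because that is the text the YAML parser actually sees.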

dadashkarimi, Sep 06 '17 15:09

I got the same error and solved it with @javiddadashkarimi's suggestion: I just used the correct '-' in the YAML file. Thanks!

chao-su, Dec 22 '17 09:12

@Zeina-T

Hi, I think you have not initialised the global variables after creating the graph. Before running any operation on the graph, you can call: sess.run(tf.global_variables_initializer())

As long as you don't do this, the weights and biases in your network graph won't be initialised, so you get a FailedPreconditionError.

I have had the same kind of problem a few times, and this worked for me. :)

Hope it will help you.

Biswajit2902, Mar 18 '18 09:03