Is the evolved transformer code the final graph, or the whole procedure to find the best graph?

Open guotong1988 opened this issue 5 years ago • 6 comments

I'm new to neural architecture search. Thank you.

guotong1988 avatar Mar 25 '19 02:03 guotong1988

@guotong1988 I am facing this issue when using evolved transformer model.

tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.
  (0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[{{node evolved_transformer/parallel_0_5/evolved_transformer/evolved_transformer/body/encoder/layer_0/conv_branches/standard_conv_3x1/conv1d}}]]
     [[evolved_transformer/parallel_0_5/evolved_transformer/evolved_transformer/body/decoder/layer_4/second_attend_to_encoder/multihead_attention/dot_product_attention/Max/_12719]]
  (1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[{{node evolved_transformer/parallel_0_5/evolved_transformer/evolved_transformer/body/encoder/layer_0/conv_branches/standard_conv_3x1/conv1d}}]]
0 successful operations.
0 derived errors ignored.

Have you faced this issue?

minump avatar Aug 05 '19 20:08 minump

@minump This is a TensorFlow version problem; you can google it.

guotong1988 avatar Aug 06 '19 00:08 guotong1988

@minump By the way, you should format your post. It is hard to read.

guotong1988 avatar Aug 06 '19 00:08 guotong1988

@guotong1988 Thanks for the update. I was able to run other models with the same setup (TensorFlow, CUDA, cuDNN). I faced this issue only with evolved_transformer, so I thought the issue was with either the model or my implementation. Sorry for the copy-paste of the error.

minump avatar Aug 06 '19 00:08 minump

evolved_transformer has the conv op; your other models do not.

guotong1988 avatar Aug 06 '19 00:08 guotong1988
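The point about the conv op can be checked directly from the error text, since TensorFlow names the failing node inside `{{node ...}}`. Below is a minimal sketch that pulls that node name out of the message and tests whether it looks like a convolution. The helper names `failing_nodes` and `is_conv_node` are hypothetical, and the "last path component starts with conv" rule is only a heuristic assumed for illustration.

```python
import re

# Error text abridged from the report above; only the node name matters here.
ERROR_TEXT = (
    "Unknown: Failed to get convolution algorithm. "
    "[[{{node evolved_transformer/parallel_0_5/evolved_transformer/"
    "evolved_transformer/body/encoder/layer_0/conv_branches/"
    "standard_conv_3x1/conv1d}}]]"
)


def failing_nodes(error_text):
    """Return the graph node names wrapped in {{node ...}} in the error text."""
    return re.findall(r"\{\{node ([^}]+)\}\}", error_text)


def is_conv_node(node_name):
    """Heuristic (an assumption): the last path component starts with 'conv'."""
    return node_name.rsplit("/", 1)[-1].startswith("conv")


nodes = failing_nodes(ERROR_TEXT)
for name in nodes:
    print(name, "-> conv op" if is_conv_node(name) else "-> other op")
```

Here the failing node is the encoder's `standard_conv_3x1/conv1d`, which is consistent with a model without convolutions running fine on the same setup.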

I am facing this issue with the evolved transformer model when I decode with it:

2 root error(s) found.
  (0) Not found: Key evolved_transformer/body/decoder/layer_0/first_attend_to_encoder/multihead_attention/k/kernel not found in checkpoint
     [[node save/RestoreV2 (defined at /lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py:629) ]]
  (1) Not found: Key evolved_transformer/body/decoder/layer_0/first_attend_to_encoder/multihead_attention/k/kernel not found in checkpoint
     [[node save/RestoreV2 (defined at /lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py:629) ]]
     [[save/RestoreV2/_1417]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'save/RestoreV2':
  File "/bin/t2t-decoder", line 20, in <module>
    tf.app.run()
  File "/lib/python3.7/site-packages/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/lib/python3.7/site-packages/absl/app.py", line 300, in run
    _run_main(main, args)
  File "/lib/python3.7/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/bin/t2t-decoder", line 12, in main
    t2t_decoder.main(argv)
  File "/lib/python3.7/site-packages/tensor2tensor/bin/t2t_decoder.py", line 214, in main
    decode(estimator, hp, decode_hp)
  File "/lib/python3.7/site-packages/tensor2tensor/bin/t2t_decoder.py", line 99, in decode
    checkpoint_path=FLAGS.checkpoint_path)
  File "/lib/python3.7/site-packages/tensor2tensor/utils/decoding.py", line 477, in decode_from_file
    for elapsed_time, result in timer(result_iter):
  File "/lib/python3.7/site-packages/tensor2tensor/utils/decoding.py", line 471, in timer
    item = next(gen)
  File "/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 629, in predict
    hooks=all_hooks) as mon_sess:
  File "/lib/python3.7/site-packages/tensorflow/python/training/monitored_session.py", line 1038, in __init__
    stop_grace_period_secs=stop_grace_period_secs)
  File "/lib/python3.7/site-packages/tensorflow/python/training/monitored_session.py", line 749, in __init__
    self._sess = _RecoverableSession(self._coordinated_creator)
  File "/lib/python3.7/site-packages/tensorflow/python/training/monitored_session.py", line 1231, in __init__
    _WrappedSession.__init__(self, self._create_session())
  File "/lib/python3.7/site-packages/tensorflow/python/training/monitored_session.py", line 1236, in _create_session
    return self._sess_creator.create_session()
  File "/lib/python3.7/site-packages/tensorflow/python/training/monitored_session.py", line 902, in create_session
    self.tf_sess = self._session_creator.create_session()
  File "/lib/python3.7/site-packages/tensorflow/python/training/monitored_session.py", line 660, in create_session
    self._scaffold.finalize()
  File "/lib/python3.7/site-packages/tensorflow/python/training/monitored_session.py", line 235, in finalize
    self._saver = training_saver._get_saver_or_default()  # pylint: disable=protected-access
  File "/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 607, in _get_saver_or_default
    saver = Saver(sharded=True, allow_empty=True)
  File "/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 836, in __init__
    self.build()
  File "/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 848, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 886, in _build
    build_restore=build_restore)
  File "/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 510, in _build_internal
    restore_sequentially, reshape)
  File "/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 389, in _AddShardedRestoreOps
    name="restore_shard"))
  File "/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 336, in _AddRestoreOps
    restore_sequentially)
  File "/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 583, in bulk_restore
    return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
  File "/lib/python3.7/site-packages/tensorflow/python/ops/gen_io_ops.py", line 1524, in restore_v2
    name=name)
  File "/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py", line 744, in _apply_op_helper
    attrs=attr_protos, op_def=op_def)
  File "/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 3485, in _create_op_internal
    op_def=op_def)
  File "/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 1949, in __init__
    self._traceback = tf_stack.extract_stack()

Nanamumuhan avatar Nov 12 '20 15:11 Nanamumuhan
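A "Key ... not found in checkpoint" error means the graph built at decode time asks for a variable that the checkpoint file does not contain. One way to narrow it down is to list the variable names in the checkpoint and diff them against what the graph expects. The sketch below is only an illustration: `missing_keys` and `checkpoint_var_names` are hypothetical helpers, the checkpoint-side name is made up, and the guess that the checkpoint came from a different model or hparams set is an assumption, not something confirmed in this thread. `tf.train.list_variables` is the TensorFlow call that returns `(name, shape)` pairs for a checkpoint.

```python
def missing_keys(graph_vars, checkpoint_vars):
    """Return the variables the graph wants to restore but the checkpoint lacks."""
    return sorted(set(graph_vars) - set(checkpoint_vars))


def checkpoint_var_names(checkpoint_path):
    """List the variable names stored in a checkpoint (requires TensorFlow)."""
    import tensorflow as tf  # imported lazily so missing_keys works without TF

    return [name for name, _shape in tf.train.list_variables(checkpoint_path)]


# The graph-side key is taken from the error above.
graph_vars = [
    "evolved_transformer/body/decoder/layer_0/first_attend_to_encoder/"
    "multihead_attention/k/kernel",
]

# Hypothetical checkpoint contents, e.g. written by a different model.
checkpoint_vars = [
    "transformer/body/decoder/layer_0/encdec_attention/"
    "multihead_attention/k/kernel",
]

for key in missing_keys(graph_vars, checkpoint_vars):
    print("missing from checkpoint:", key)
```

With a real checkpoint, `checkpoint_var_names(path)` would replace the hand-written `checkpoint_vars` list. If every missing key differs from the checkpoint only by its scope prefix or layer names, that hints the model or hparams set used for decoding differs from the one used for training.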