
Error while using use_bfloat16 in run_classifier.py

Open ericwtlin opened this issue 5 years ago • 2 comments

Hi, when I run run_classifier.py with use_bfloat16=True, I get the following exception:

  File "/home/wutlin/anaconda3/envs/py27/lib/python2.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1112, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "run_classifier.py", line 626, in model_fn
    FLAGS, features, n_class, is_training)
  File "/home/wutlin/workspace/XLNet/function_builder.py", line 152, in get_classification_loss
    input_mask=inp_mask)
  File "/home/wutlin/workspace/XLNet/xlnet.py", line 222, in __init__
    ) = modeling.transformer_xl(**tfm_args)
  File "/home/wutlin/workspace/XLNet/modeling.py", line 497, in transformer_xl
    data_mask = tf.concat([mems_mask, data_mask], 1)
  File "/home/wutlin/anaconda3/envs/py27/lib/python2.7/site-packages/tensorflow/python/util/dispatch.py", line 180, in wrapper
    return target(*args, **kwargs)
  File "/home/wutlin/anaconda3/envs/py27/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.py", line 1256, in concat
    return gen_array_ops.concat_v2(values=values, axis=axis, name=name)
  File "/home/wutlin/anaconda3/envs/py27/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 1149, in concat_v2
    "ConcatV2", values=values, axis=axis, name=name)
  File "/home/wutlin/anaconda3/envs/py27/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 483, in _apply_op_helper
    raise TypeError("%s that don't all match." % prefix)
TypeError: Tensors in list passed to 'values' of 'ConcatV2' Op have types [bfloat16, float32] that don't all match.

Could you help take a look?

ericwtlin avatar Jun 28 '19 05:06 ericwtlin

AFAIK, bfloat16 is meant to be used on TPUs.

kimiyoung avatar Jun 28 '19 06:06 kimiyoung

I ran run_squad.py on a TPU, but the same error occurred.

  File "/home/guozhiyu0914/anaconda3/envs/py23/lib/python2.7/site-packages/tensorflow/contrib/tpu/python/tpu/tpu.py", line 890, in split_compile_and_shard
    name=name)
  File "/home/guozhiyu0914/anaconda3/envs/py23/lib/python2.7/site-packages/tensorflow/contrib/tpu/python/tpu/tpu.py", line 689, in split_compile_and_replicate
    outputs = computation(*computation_inputs)
  File "/home/guozhiyu0914/anaconda3/envs/py23/lib/python2.7/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2886, in multi_tpu_train_steps_on_single_shard
    [_INITIAL_LOSS])
  File "/home/guozhiyu0914/anaconda3/envs/py23/lib/python2.7/site-packages/tensorflow/contrib/tpu/python/tpu/training_loop.py", line 208, in repeat
    cond, body_wrapper, inputs=inputs, infeed_queue=infeed_queue, name=name)
  File "/home/guozhiyu0914/anaconda3/envs/py23/lib/python2.7/site-packages/tensorflow/contrib/tpu/python/tpu/training_loop.py", line 170, in while_loop
    condition_wrapper, body_wrapper, inputs, name="", parallel_iterations=1)
  File "/home/guozhiyu0914/anaconda3/envs/py23/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 3556, in while_loop
    return_same_structure)
  File "/home/guozhiyu0914/anaconda3/envs/py23/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 3087, in BuildLoop
    pred, body, original_loop_vars, loop_vars, shape_invariants)
  File "/home/guozhiyu0914/anaconda3/envs/py23/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 3022, in _BuildLoop
    body_result = body(packed_vars_for_body)
  File "/home/guozhiyu0914/anaconda3/envs/py23/lib/python2.7/site-packages/tensorflow/contrib/tpu/python/tpu/training_loop.py", line 121, in body_wrapper
    outputs = body((inputs + dequeue_ops))
  File "/home/guozhiyu0914/anaconda3/envs/py23/lib/python2.7/site-packages/tensorflow/contrib/tpu/python/tpu/training_loop.py", line 204, in body_wrapper
    return [i + 1] + _convert_to_list(body(*args))
  File "/home/guozhiyu0914/anaconda3/envs/py23/lib/python2.7/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 1359, in train_step
    self._call_model_fn(features, labels))
  File "/home/guozhiyu0914/anaconda3/envs/py23/lib/python2.7/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 1593, in _call_model_fn
    estimator_spec = self._model_fn(features=features, **kwargs)
  File "run_squad.py", line 1026, in model_fn
    outputs = function_builder.get_qa_outputs(FLAGS, features, is_training)
  File "/home/guozhiyu0914/xlnet/function_builder.py", line 230, in get_qa_outputs
    input_mask=inp_mask)
  File "/home/guozhiyu0914/xlnet/xlnet.py", line 222, in __init__
    ) = modeling.transformer_xl(**tfm_args)
  File "/home/guozhiyu0914/xlnet/modeling.py", line 499, in transformer_xl
    data_mask = tf.concat([mems_mask, data_mask], 1)
  File "/home/guozhiyu0914/anaconda3/envs/py23/lib/python2.7/site-packages/tensorflow/python/util/dispatch.py", line 180, in wrapper
    return target(*args, **kwargs)
  File "/home/guozhiyu0914/anaconda3/envs/py23/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.py", line 1256, in concat
    return gen_array_ops.concat_v2(values=values, axis=axis, name=name)
  File "/home/guozhiyu0914/anaconda3/envs/py23/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 1149, in concat_v2
    "ConcatV2", values=values, axis=axis, name=name)
  File "/home/guozhiyu0914/anaconda3/envs/py23/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 483, in _apply_op_helper
    raise TypeError("%s that don't all match." % prefix)
TypeError: Tensors in list passed to 'values' of 'ConcatV2' Op have types [bfloat16, float32] that don't all match.
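
The failing call is tf.concat([mems_mask, data_mask], 1) in modeling.py, where one mask is bfloat16 and the other float32. As a possible workaround (a minimal sketch, not a confirmed fix from the maintainers), both masks could be cast to a common dtype before concatenating. The helper name concat_masks and the tensor shapes below are hypothetical and chosen only for illustration. Which of the two masks is float32 in the real code is not shown by the traceback, so the sketch simply casts data_mask to the dtype of mems_mask.

```python
import tensorflow as tf


def concat_masks(mems_mask, data_mask, axis=1):
    """Hypothetical helper: align dtypes, then concatenate along `axis`."""
    # tf.concat requires every input tensor to share one dtype, so cast
    # data_mask to whatever dtype mems_mask already has.
    data_mask = tf.cast(data_mask, mems_mask.dtype)
    return tf.concat([mems_mask, data_mask], axis)


# Illustrative shapes only; these are not the shapes used in XLNet.
mems_mask = tf.zeros([1, 2, 3], dtype=tf.bfloat16)
data_mask = tf.ones([1, 4, 3], dtype=tf.float32)

merged = concat_masks(mems_mask, data_mask)
print(merged.dtype, merged.shape)
```

Casting in the other direction (mems_mask to float32) would also satisfy tf.concat; which one is appropriate depends on the dtype the rest of the model expects.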

guozhiyu avatar Sep 24 '19 01:09 guozhiyu