TensorFlow-Object-Detection-API-Tutorial-Train-Multiple-Objects-Windows-10

INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, 2 root error(s) found.

Open nguyenanhtuan1008 opened this issue 5 years ago • 1 comments

I am trying to train a model based on faster_rcnn_inception_v2_coco_2018_01_28 with the command python _5train.py --logtostderr --train_dir=Exhibition_model_faster_rcnn_inception_v2_pets/ --pipeline_config_path=Exhibition_training/faster_rcnn_inception_v2_pets.config. I have already tried datasets with 300x300 images, 720x1280 images, and images of random sizes, but with no luck. What did I do wrong?

2020-02-12 12:06:56.281203: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2020-02-12 12:06:57.178206: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-02-12 12:06:58.398383: W tensorflow/stream_executor/cuda/redzone_allocator.cc:312] Internal: Invoking ptxas not supported on Windows
Relying on driver to perform ptx compilation. This message will be only logged once.
INFO:tensorflow:Recording summary at step 0.
I0212 12:06:59.752084 30928 supervisor.py:1050] Recording summary at step 0.
INFO:tensorflow:global_step/sec: 0
I0212 12:07:01.083346 25436 supervisor.py:1099] global_step/sec: 0
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, 2 root error(s) found.
  (0) Invalid argument: LossTensor is inf or nan. : Tensor had NaN values
     [[node CheckNumerics (defined at C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]
  (1) Invalid argument: LossTensor is inf or nan. : Tensor had NaN values
     [[node CheckNumerics (defined at C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]
     [[gradients/SecondStageBoxPredictor/ClassPredictor/BiasAdd_grad/BiasAddGrad/_2124]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'CheckNumerics':
  File "_5train.py", line 185, in <module>
    tf.app.run()
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\platform\app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\absl\app.py", line 299, in run
    _run_main(main, args)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\absl\app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\util\deprecation.py", line 324, in new_func
    return func(*args, **kwargs)
  File "_5train.py", line 181, in main
    graph_hook_fn=graph_rewriter_fn)
  File "E:\acuity\tuan_experiment\tensorflow\models\research\object_detection\legacy\trainer.py", line 323, in train
    total_loss = tf.check_numerics(total_loss, 'LossTensor is inf or nan.')
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\ops\gen_array_ops.py", line 1011, in check_numerics
    "CheckNumerics", tensor=tensor, message=message, name=name)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\framework\op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\framework\ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\framework\ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\framework\ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()

I0212 12:07:04.600844 27168 coordinator.py:224] Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, 2 root error(s) found.
  (0) Invalid argument: LossTensor is inf or nan. : Tensor had NaN values
     [[node CheckNumerics (defined at C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]
  (1) Invalid argument: LossTensor is inf or nan. : Tensor had NaN values
     [[node CheckNumerics (defined at C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]
     [[gradients/SecondStageBoxPredictor/ClassPredictor/BiasAdd_grad/BiasAddGrad/_2124]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'CheckNumerics':
  File "_5train.py", line 185, in <module>
    tf.app.run()
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\platform\app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\absl\app.py", line 299, in run
    _run_main(main, args)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\absl\app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\util\deprecation.py", line 324, in new_func
    return func(*args, **kwargs)
  File "_5train.py", line 181, in main
    graph_hook_fn=graph_rewriter_fn)
  File "E:\acuity\tuan_experiment\tensorflow\models\research\object_detection\legacy\trainer.py", line 323, in train
    total_loss = tf.check_numerics(total_loss, 'LossTensor is inf or nan.')
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\ops\gen_array_ops.py", line 1011, in check_numerics
    "CheckNumerics", tensor=tensor, message=message, name=name)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\framework\op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\framework\ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\framework\ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\framework\ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()

Traceback (most recent call last):
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\client\session.py", line 1365, in _do_call
    return fn(*args)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\client\session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\client\session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: LossTensor is inf or nan. : Tensor had NaN values
     [[{{node CheckNumerics}}]]
  (1) Invalid argument: LossTensor is inf or nan. : Tensor had NaN values
     [[{{node CheckNumerics}}]]
     [[gradients/SecondStageBoxPredictor/ClassPredictor/BiasAdd_grad/BiasAddGrad/_2124]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "_5train.py", line 185, in <module>
    tf.app.run()
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\platform\app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\absl\app.py", line 299, in run
    _run_main(main, args)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\absl\app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\util\deprecation.py", line 324, in new_func
    return func(*args, **kwargs)
  File "_5train.py", line 181, in main
    graph_hook_fn=graph_rewriter_fn)
  File "E:\acuity\tuan_experiment\tensorflow\models\research\object_detection\legacy\trainer.py", line 417, in train
    saver=saver)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\contrib\slim\python\slim\learning.py", line 775, in train
    train_step_kwargs)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\contrib\slim\python\slim\learning.py", line 490, in train_step
    run_metadata=run_metadata)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\client\session.py", line 956, in run
    run_metadata_ptr)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\client\session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\client\session.py", line 1359, in _do_run
    run_metadata)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\client\session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: LossTensor is inf or nan. : Tensor had NaN values
     [[node CheckNumerics (defined at C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]
  (1) Invalid argument: LossTensor is inf or nan. : Tensor had NaN values
     [[node CheckNumerics (defined at C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]
     [[gradients/SecondStageBoxPredictor/ClassPredictor/BiasAdd_grad/BiasAddGrad/_2124]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'CheckNumerics':
  File "_5train.py", line 185, in <module>
    tf.app.run()
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\platform\app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\absl\app.py", line 299, in run
    _run_main(main, args)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\absl\app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\util\deprecation.py", line 324, in new_func
    return func(*args, **kwargs)
  File "_5train.py", line 181, in main
    graph_hook_fn=graph_rewriter_fn)
  File "E:\acuity\tuan_experiment\tensorflow\models\research\object_detection\legacy\trainer.py", line 323, in train
    total_loss = tf.check_numerics(total_loss, 'LossTensor is inf or nan.')
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\ops\gen_array_ops.py", line 1011, in check_numerics
    "CheckNumerics", tensor=tensor, message=message, name=name)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\framework\op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\framework\ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\framework\ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "C:\Users\OsawaYuta\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\framework\ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()
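
For context on the op named in the trace: the frame at trainer.py line 323 wraps total_loss in tf.check_numerics, which fails with the given message as soon as the tensor contains a NaN or Inf. The sketch below is a pure-Python analogue of that behaviour, not TensorFlow code; the function name check_numerics_like and the use of ValueError are illustrative assumptions.

```python
import math


def check_numerics_like(values, message):
    """Hypothetical stand-in for tf.check_numerics on a plain list of floats.

    Returns the values unchanged when every entry is finite, otherwise
    raises with a message shaped like the one in the log above.
    """
    for v in values:
        if math.isnan(v):
            raise ValueError(f"{message} : Tensor had NaN values")
        if math.isinf(v):
            raise ValueError(f"{message} : Tensor had Inf values")
    return values


# Example: a finite loss passes through, a NaN loss raises.
check_numerics_like([0.73], "LossTensor is inf or nan.")
try:
    check_numerics_like([float("nan")], "LossTensor is inf or nan.")
except ValueError as err:
    print(err)
```

In other words, the error is raised by a guard on the loss value; it reports that the loss became NaN, not why it did.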

nguyenanhtuan1008 avatar Feb 12 '20 03:02 nguyenanhtuan1008

I have the same problem. Any ideas on how to solve it?

xuanzhiliu avatar Nov 04 '22 05:11 xuanzhiliu
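
Neither post identifies a cause. One frequently reported source of NaN losses when training detection models is invalid annotations, for example boxes with zero area or coordinates outside the image. That is an assumption here, not a confirmed diagnosis of this issue. A minimal sketch for scanning annotations before generating TFRecords follows; the function name find_bad_boxes and the row keys (filename, width, height, xmin, ymin, xmax, ymax) are hypothetical and would need to match however the labels are actually stored.

```python
def find_bad_boxes(rows):
    """Return (filename, reason) pairs for boxes that look degenerate.

    `rows` is an iterable of dicts with the hypothetical keys
    filename, width, height, xmin, ymin, xmax, ymax (pixel units).
    """
    bad = []
    for row in rows:
        width, height = int(row["width"]), int(row["height"])
        xmin, ymin = float(row["xmin"]), float(row["ymin"])
        xmax, ymax = float(row["xmax"]), float(row["ymax"])
        if xmin >= xmax or ymin >= ymax:
            bad.append((row["filename"], "zero or negative area"))
        elif xmin < 0 or ymin < 0 or xmax > width or ymax > height:
            bad.append((row["filename"], "box outside image bounds"))
    return bad


# Made-up sample rows for illustration only.
sample = [
    {"filename": "a.jpg", "width": 300, "height": 300,
     "xmin": 10, "ymin": 20, "xmax": 100, "ymax": 200},
    {"filename": "b.jpg", "width": 300, "height": 300,
     "xmin": 50, "ymin": 50, "xmax": 50, "ymax": 120},
    {"filename": "c.jpg", "width": 300, "height": 300,
     "xmin": 10, "ymin": 10, "xmax": 320, "ymax": 100},
]
for filename, reason in find_bad_boxes(sample):
    print(filename, reason)
```

If such a scan comes back clean, the annotations are probably not the culprit and other settings in the pipeline config would be the next thing to examine.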