text-detection-ctpn

Issue in custom dataset training

Open: cxf712 opened this issue 6 years ago • 1 comment

When I trained on my own dataset, I encountered the following problem:

tensorflow.python.framework.errors_impl.InvalidArgumentError: Found Inf or NaN global norm. : Tensor had NaN values
  [[node VerifyFinite/CheckNumerics (defined at ../lib/fast_rcnn/train.py:107) = CheckNumericsT=DT_FLOAT, message="Found Inf or NaN global norm.", _device="/job:localhost/replica:0/task:0/device:GPU:0"]]
  [[{{node Adam/update/_136}} = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2359_Adam/update", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

Caused by op 'VerifyFinite/CheckNumerics', defined at:
  File "train_net.py", line 40, in <module>
    restore=bool(int(cfg.TRAIN.restore)))
  File "../lib/fast_rcnn/train.py", line 225, in train_net
    sw.train_model(sess, max_iters, restore=restore)
  File "../lib/fast_rcnn/train.py", line 107, in train_model
    grads, norm = tf.clip_by_global_norm(tf.gradients(total_loss, tvars), 10.0)
  File "/root/anaconda3/envs/tensorflow-1.8/lib/python3.6/site-packages/tensorflow/python/ops/clip_ops.py", line 265, in clip_by_global_norm
    "Found Inf or NaN global norm.")
  File "/root/anaconda3/envs/tensorflow-1.8/lib/python3.6/site-packages/tensorflow/python/ops/numerics.py", line 47, in verify_tensor_all_finite
    verify_input = array_ops.check_numerics(t, message=msg)
  File "/root/anaconda3/envs/tensorflow-1.8/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 817, in check_numerics
    "CheckNumerics", tensor=tensor, message=message, name=name)
  File "/root/anaconda3/envs/tensorflow-1.8/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/root/anaconda3/envs/tensorflow-1.8/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/root/anaconda3/envs/tensorflow-1.8/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
    op_def=op_def)
  File "/root/anaconda3/envs/tensorflow-1.8/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Found Inf or NaN global norm. : Tensor had NaN values
  [[node VerifyFinite/CheckNumerics (defined at ../lib/fast_rcnn/train.py:107) = CheckNumericsT=DT_FLOAT, message="Found Inf or NaN global norm.", _device="/job:localhost/replica:0/task:0/device:GPU:0"]]
  [[{{node Adam/update/_136}} = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2359_Adam/update", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

cxf712, Jan 08 '19 08:01

Have you solved this problem?

Mrlei-go, Jun 25 '21 05:06
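
For anyone hitting the same message: according to the traceback, the error is raised by the finiteness check inside tf.clip_by_global_norm, which computes a single norm over all gradients. One NaN or Inf value in any gradient is therefore enough to make that norm non-finite. The pure-Python sketch below only illustrates this behavior. It is not the repository's code, and the gradient names and values are made-up placeholders.

```python
import math

def global_norm(grads):
    """Square root of the sum of squares over every value in every gradient."""
    return math.sqrt(sum(g * g for grad in grads for g in grad))

def find_non_finite(named_grads):
    """Return the names of gradients that contain a NaN or Inf value."""
    return [name for name, grad in named_grads.items()
            if not all(math.isfinite(g) for g in grad)]

# Hypothetical gradients: names and numbers are assumptions for illustration,
# not real variables from text-detection-ctpn.
named_grads = {
    "conv/weights": [0.5, -1.2, 0.3],
    "rpn/weights": [0.1, float("nan")],  # a single bad value
}

norm = global_norm(named_grads.values())
bad = find_non_finite(named_grads)

print("global norm is finite:", math.isfinite(norm))
print("gradients with non-finite values:", bad)
```

This shows only why one bad value is sufficient to trigger the check. It does not say what produced the non-finite value in this particular training run.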