FastMaskRCNN How to load resnet-101? (poor result based on resnet-50)

I trained the model with ResNet-50 for 200k iterations, but the results are very poor. Should we use ResNet-101 instead, as in the original Mask R-CNN paper?

Oct 12 '17 13:10 onlytailei

I think 200k iterations is too few to see good results. I assume you trained with batch size 1. The Mask R-CNN paper trained for 160k iterations with an effective batch size of 16, so with batch size 1 we should train for at least 16 × 160k = 2560k iterations. Could you share how the loss decreased, or how the accuracy increased, over the 200k iterations?
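The iteration estimate above can be sketched as a short calculation. This is only a rough illustration: the helper name `equivalent_iterations` is hypothetical, and it assumes that matching the total number of images seen is a fair way to compare schedules. In practice the learning rate usually needs adjusting as well.

```python
def equivalent_iterations(ref_iterations, ref_batch_size, batch_size):
    """Iterations needed at `batch_size` to see as many images as the reference schedule."""
    total_images = ref_iterations * ref_batch_size
    # Ceiling division, so the smaller-batch run sees at least as many images.
    return -(-total_images // batch_size)


# Reference schedule quoted in the post: 160k iterations, effective batch size 16.
needed = equivalent_iterations(160000, 16, 1)
print(needed)                    # 2560000
print(200000 / float(needed))    # 0.078125
```

By this measure, a 200k-iteration run at batch size 1 covers under 8% of the reference schedule.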

Oct 13 '17 05:10 insikk

Hi, it looks like you were able to complete training successfully. I am trying to start training, but I am getting this error:

Caused by op u'pyramid_1/AssignGTBoxes/Where_3', defined at:
  File "train/train.py", line 339, in <module>
    train()
  File "train/train.py", line 193, in train
    loss_weights=[0.2, 0.2, 1.0, 0.2, 1.0])
  File "train/../libs/nets/pyramid_network.py", line 580, in build
    is_training=is_training, gt_boxes=gt_boxes)
  File "train/../libs/nets/pyramid_network.py", line 263, in build_heads
    assign_boxes(rois, [rois, batch_inds], [2, 3, 4, 5])
  File "train/../libs/layers/wrapper.py", line 172, in assign_boxes
    inds = tf.where(tf.equal(assigned_layers, l))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/array_ops.py", line 2365, in where
    return gen_array_ops.where(input=condition, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 4053, in where
    result = _op_def_lib.apply_op("Where", input=input, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InternalError (see above for traceback): WhereOp: Could not launch cub::DeviceReduce::Sum to count number of true indices. temp_storage_bytes: 1, status: invalid device function
  [[Node: pyramid_1/AssignGTBoxes/Where_3 = Where_device="/job:localhost/replica:0/task:0/gpu:0"]]
  [[Node: pyramid_1/fully_connected_3/BiasAdd/_2753 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_27365_pyramid_1/fully_connected_3/BiasAdd", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]]
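A CUDA status of `invalid device function` often means that a GPU kernel was not compiled for the architecture of the card it is running on, so a reasonable first check is the GPU's compute capability. The sketch below is only a diagnostic aid, not a confirmed fix for this issue. It assumes TensorFlow 1.x, where `device_lib.list_local_devices()` is available, and it assumes the device description contains a `compute capability: X.Y` field, which may not hold for every version. Both helper names are hypothetical.

```python
import re


def parse_compute_capability(desc):
    """Return (major, minor) from a device description string, or None if absent."""
    match = re.search(r"compute capability:\s*(\d+)\.(\d+)", desc)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def local_gpu_capabilities():
    """Map each local GPU device name to its parsed compute capability."""
    # Imported lazily so the parser above can be used without TensorFlow installed.
    from tensorflow.python.client import device_lib
    return {
        dev.name: parse_compute_capability(dev.physical_device_desc)
        for dev in device_lib.list_local_devices()
        if dev.device_type == "GPU"
    }


# Example (requires TensorFlow and a visible GPU):
# print(local_gpu_capabilities())
```

If the reported capability differs from the architecture that TensorFlow or the project's own CUDA extensions were built for, rebuilding for the right architecture would be the next thing to try.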

Feb 08 '18 07:02 satya2550
