About num_deformable_group?
Hi, I have ported this paper's method from MXNet to Caffe, and it gives a 7% gain in mAP@0.5 when num_deformable_group = 1 (offset channels = 18). But it performs poorly when num_deformable_group = 4 (offset channels = 72). Did you experiment with the impact of different num_deformable_group values? I could not find any information about num_deformable_group in the paper. Thanks a lot.
Did you run the same experiment on MXNet? Does the same phenomenon appear there? @louyanyang
@Franciszzj I ran this experiment on MXNet, but I modified the symbols. During training, the RCNNAcc was abnormal. I don't know whether my symbols are wrong or not.
@louyanyang Could you share some logs?
@Franciszzj Here is my config:
'MXNET_VERSION': 'mxnet',
'SCALES': [(600, 1000)],
'TEST': {'BATCH_IMAGES': 1, 'CXX_PROPOSAL': False, 'HAS_RPN': True, 'NMS': 0.3, 'PROPOSAL_MIN_SIZE': 0, 'PROPOSAL_NMS_THRESH': 0.7, 'PROPOSAL_POST_NMS_TOP_N': 2000, 'PROPOSAL_PRE_NMS_TOP_N': 20000, 'RPN_MIN_SIZE': 0, 'RPN_NMS_THRESH': 0.7, 'RPN_POST_NMS_TOP_N': 300, 'RPN_PRE_NMS_TOP_N': 6000, 'max_per_image': 300, 'test_epoch': 7},
'TRAIN': {'ALTERNATE': {'RCNN_BATCH_IMAGES': 0, 'RPN_BATCH_IMAGES': 0, 'rfcn1_epoch': 0, 'rfcn1_lr': 0, 'rfcn1_lr_step': '', 'rfcn2_epoch': 0, 'rfcn2_lr': 0, 'rfcn2_lr_step': '', 'rpn1_epoch': 0, 'rpn1_lr': 0, 'rpn1_lr_step': '', 'rpn2_epoch': 0, 'rpn2_lr': 0, 'rpn2_lr_step': '', 'rpn3_epoch': 0, 'rpn3_lr': 0, 'rpn3_lr_step': ''},
          'ASPECT_GROUPING': True, 'BATCH_IMAGES': 1, 'BATCH_ROIS': -1, 'BATCH_ROIS_OHEM': 128, 'BBOX_MEANS': [0.0, 0.0, 0.0, 0.0], 'BBOX_NORMALIZATION_PRECOMPUTED': True, 'BBOX_REGRESSION_THRESH': 0.5, 'BBOX_STDS': [0.1, 0.1, 0.2, 0.2], 'BBOX_WEIGHTS': array([ 1., 1., 1., 1.]), 'BG_THRESH_HI': 0.5, 'BG_THRESH_LO': 0.0, 'CXX_PROPOSAL': False, 'ENABLE_OHEM': True, 'END2END': True, 'FG_FRACTION': 0.25, 'FG_THRESH': 0.5, 'FLIP': True, 'RESUME': False,
          'RPN_BATCH_SIZE': 256, 'RPN_BBOX_WEIGHTS': [1.0, 1.0, 1.0, 1.0], 'RPN_CLOBBER_POSITIVES': False, 'RPN_FG_FRACTION': 0.5, 'RPN_MIN_SIZE': 0, 'RPN_NEGATIVE_OVERLAP': 0.3, 'RPN_NMS_THRESH': 0.7, 'RPN_POSITIVE_OVERLAP': 0.7, 'RPN_POSITIVE_WEIGHT': -1.0, 'RPN_POST_NMS_TOP_N': 300, 'RPN_PRE_NMS_TOP_N': 6000, 'SHUFFLE': True,
          'begin_epoch': 0, 'end_epoch': 7, 'lr': 0.0005, 'lr_factor': 0.1, 'lr_step': '4.83', 'model_prefix': 'rfcn_voc', 'momentum': 0.9, 'warmup': True, 'warmup_lr': 5e-05, 'warmup_step': 1000, 'wd': 0.0005},
'dataset': {'NUM_CLASSES': 21, 'dataset': 'PascalVOC', 'dataset_path': '/home/data/VOCdevkit0712', 'image_set': '2007_trainval', 'proposal': 'rpn', 'root_path': '/home/data', 'test_image_set': '2007_test'},
'default': {'frequent': 100, 'kvstore': 'device'},
'gpus': '0,1,2,3',
'network': {'ANCHOR_RATIOS': [0.5, 1, 2], 'ANCHOR_SCALES': [8, 16, 32], 'FIXED_PARAMS': ['conv1', 'bn_conv1', 'res2', 'bn2', 'gamma', 'beta'], 'FIXED_PARAMS_SHARED': ['conv1', 'bn_conv1', 'res2', 'bn2', 'res3', 'bn3', 'res4', 'bn4', 'gamma', 'beta'], 'IMAGE_STRIDE': 0, 'NUM_ANCHORS': 9, 'PIXEL_MEANS': array([ 103.06, 115.9 , 123.15]), 'RCNN_FEAT_STRIDE': 16, 'RPN_FEAT_STRIDE': 16, 'pretrained': './model/pretrained_model/resnet_v1_18cut', 'pretrained_epoch': 0},
'output_path': './output/rfcn_dcn/voc',
'symbol': 'resnet_v1_18cut_rfcn_dcn'
And the training log:
2017-08-12 10:57:29,403 Epoch[0] Batch [7800] Speed: 18.71 samples/sec Train-RPNAcc=0.931299, RPNLogLoss=0.206756, RPNL1Loss=0.060934, RCNNAcc=0.755570, RCNNLogLoss=2.975048, RCNNL1Loss=0.559008,
2017-08-12 10:57:53,883 Epoch[0] Batch [7900] Speed: 16.34 samples/sec Train-RPNAcc=0.931406, RPNLogLoss=0.206176, RPNL1Loss=0.060848, RCNNAcc=0.755273, RCNNLogLoss=2.975192, RCNNL1Loss=0.559908,
2017-08-12 10:58:18,746 Epoch[0] Batch [8000] Speed: 16.09 samples/sec Train-RPNAcc=0.931538, RPNLogLoss=0.205546, RPNL1Loss=0.060706, RCNNAcc=0.755418, RCNNLogLoss=2.975352, RCNNL1Loss=0.559955,
2017-08-12 10:58:39,768 Epoch[0] Batch [8100] Speed: 19.03 samples/sec Train-RPNAcc=0.931549, RPNLogLoss=0.205172, RPNL1Loss=0.060746, RCNNAcc=0.755255, RCNNLogLoss=2.975434, RCNNL1Loss=0.561019,
2017-08-12 10:59:06,217 Epoch[0] Batch [8200] Speed: 15.12 samples/sec Train-RPNAcc=0.931624, RPNLogLoss=0.204680, RPNL1Loss=0.060701, RCNNAcc=0.754972, RCNNLogLoss=2.975548, RCNNL1Loss=0.562207,
2017-08-12 10:59:25,351 Epoch[0] Train-RPNAcc=0.931653
2017-08-12 10:59:25,351 Epoch[0] Train-RPNLogLoss=0.204439
2017-08-12 10:59:25,352 Epoch[0] Train-RPNL1Loss=0.060678
2017-08-12 10:59:25,352 Epoch[0] Train-RCNNAcc=0.754880
2017-08-12 10:59:25,352 Epoch[0] Train-RCNNLogLoss=2.975763
2017-08-12 10:59:25,352 Epoch[0] Train-RCNNL1Loss=0.562753
2017-08-12 11:28:20,037 Epoch[1] Batch [7800] Speed: 17.18 samples/sec Train-RPNAcc=0.939234, RPNLogLoss=0.160226, RPNL1Loss=0.055216, RCNNAcc=0.732916, RCNNLogLoss=3.022581, RCNNL1Loss=0.638416,
2017-08-12 11:28:43,553 Epoch[1] Batch [7900] Speed: 17.01 samples/sec Train-RPNAcc=0.939206, RPNLogLoss=0.160265, RPNL1Loss=0.055325, RCNNAcc=0.733160, RCNNLogLoss=3.022687, RCNNL1Loss=0.637717,
2017-08-12 11:29:04,124 Epoch[1] Batch [8000] Speed: 19.45 samples/sec Train-RPNAcc=0.939239, RPNLogLoss=0.160168, RPNL1Loss=0.055259, RCNNAcc=0.733067, RCNNLogLoss=3.022723, RCNNL1Loss=0.637984,
2017-08-12 11:29:27,943 Epoch[1] Batch [8100] Speed: 16.79 samples/sec Train-RPNAcc=0.939324, RPNLogLoss=0.160049, RPNL1Loss=0.055227, RCNNAcc=0.732937, RCNNLogLoss=3.022761, RCNNL1Loss=0.638359,
2017-08-12 11:29:51,717 Epoch[1] Batch [8200] Speed: 16.83 samples/sec Train-RPNAcc=0.939405, RPNLogLoss=0.159860, RPNL1Loss=0.055173, RCNNAcc=0.732822, RCNNLogLoss=3.022826, RCNNL1Loss=0.638635,
2017-08-12 11:30:07,155 Epoch[1] Train-RPNAcc=0.939454
2017-08-12 11:30:07,156 Epoch[1] Train-RPNLogLoss=0.159763
2017-08-12 11:30:07,156 Epoch[1] Train-RPNL1Loss=0.055085
2017-08-12 11:30:07,156 Epoch[1] Train-RCNNAcc=0.732718
2017-08-12 11:30:07,156 Epoch[1] Train-RCNNLogLoss=3.022856
2017-08-12 11:30:07,156 Epoch[1] Train-RCNNL1Loss=0.639103
2017-08-12 11:58:18,838 Epoch[2] Batch [7600] Speed: 19.79 samples/sec Train-RPNAcc=0.942471, RPNLogLoss=0.151330, RPNL1Loss=0.053603, RCNNAcc=0.728250, RCNNLogLoss=3.031135, RCNNL1Loss=0.651917,
2017-08-12 11:58:38,545 Epoch[2] Batch [7700] Speed: 20.30 samples/sec Train-RPNAcc=0.942523, RPNLogLoss=0.151235, RPNL1Loss=0.053530, RCNNAcc=0.728238, RCNNLogLoss=3.031140, RCNNL1Loss=0.652170,
2017-08-12 11:58:58,649 Epoch[2] Batch [7800] Speed: 19.90 samples/sec Train-RPNAcc=0.942593, RPNLogLoss=0.151132, RPNL1Loss=0.053535, RCNNAcc=0.728009, RCNNLogLoss=3.031160, RCNNL1Loss=0.652626,
2017-08-12 11:59:19,213 Epoch[2] Batch [7900] Speed: 19.45 samples/sec Train-RPNAcc=0.942632, RPNLogLoss=0.151045, RPNL1Loss=0.053466, RCNNAcc=0.728039, RCNNLogLoss=3.031153, RCNNL1Loss=0.652379,
2017-08-12 11:59:38,988 Epoch[2] Batch [8000] Speed: 20.23 samples/sec Train-RPNAcc=0.942654, RPNLogLoss=0.150982, RPNL1Loss=0.053447, RCNNAcc=0.727878, RCNNLogLoss=3.031105, RCNNL1Loss=0.652638,
2017-08-12 11:59:58,650 Epoch[2] Batch [8100] Speed: 20.34 samples/sec Train-RPNAcc=0.942707, RPNLogLoss=0.150888, RPNL1Loss=0.053432, RCNNAcc=0.727840, RCNNLogLoss=3.031116, RCNNL1Loss=0.652655,
2017-08-12 12:00:21,606 Epoch[2] Batch [8200] Speed: 17.42 samples/sec Train-RPNAcc=0.942731, RPNLogLoss=0.150829, RPNL1Loss=0.053482, RCNNAcc=0.727794, RCNNLogLoss=3.031144, RCNNL1Loss=0.652613,
2017-08-12 12:00:39,040 Epoch[2] Train-RPNAcc=0.942761
2017-08-12 12:00:39,040 Epoch[2] Train-RPNLogLoss=0.150745
2017-08-12 12:00:39,040 Epoch[2] Train-RPNL1Loss=0.053476
2017-08-12 12:00:39,040 Epoch[2] Train-RCNNAcc=0.727663
2017-08-12 12:00:39,040 Epoch[2] Train-RCNNLogLoss=3.031165
2017-08-12 12:00:39,040 Epoch[2] Train-RCNNL1Loss=0.652926
2017-08-12 13:26:52,488 Epoch[5] Batch [7700] Speed: 17.69 samples/sec Train-RPNAcc=0.949390, RPNLogLoss=0.132628, RPNL1Loss=0.048967, RCNNAcc=0.715907, RCNNLogLoss=3.033916, RCNNL1Loss=0.668825,
2017-08-12 13:27:14,880 Epoch[5] Batch [7800] Speed: 17.86 samples/sec Train-RPNAcc=0.949345, RPNLogLoss=0.132680, RPNL1Loss=0.048992, RCNNAcc=0.715927, RCNNLogLoss=3.033940, RCNNL1Loss=0.668962,
2017-08-12 13:27:35,643 Epoch[5] Batch [7900] Speed: 19.27 samples/sec Train-RPNAcc=0.949325, RPNLogLoss=0.132733, RPNL1Loss=0.049038, RCNNAcc=0.716060, RCNNLogLoss=3.033987, RCNNL1Loss=0.668699,
2017-08-12 13:27:58,896 Epoch[5] Batch [8000] Speed: 17.20 samples/sec Train-RPNAcc=0.949319, RPNLogLoss=0.132781, RPNL1Loss=0.049042, RCNNAcc=0.716242, RCNNLogLoss=3.033997, RCNNL1Loss=0.668439,
2017-08-12 13:28:22,145 Epoch[5] Batch [8100] Speed: 17.20 samples/sec Train-RPNAcc=0.949335, RPNLogLoss=0.132731, RPNL1Loss=0.049081, RCNNAcc=0.716347, RCNNLogLoss=3.034016, RCNNL1Loss=0.668182,
2017-08-12 13:28:42,667 Epoch[5] Batch [8200] Speed: 19.49 samples/sec Train-RPNAcc=0.949316, RPNLogLoss=0.132768, RPNL1Loss=0.049059, RCNNAcc=0.716371, RCNNLogLoss=3.034017, RCNNL1Loss=0.668282,
2017-08-12 13:28:57,680 Epoch[5] Train-RPNAcc=0.949330
2017-08-12 13:28:57,680 Epoch[5] Train-RPNLogLoss=0.132732
2017-08-12 13:28:57,680 Epoch[5] Train-RPNL1Loss=0.049075
2017-08-12 13:28:57,680 Epoch[5] Train-RCNNAcc=0.716455
2017-08-12 13:28:57,680 Epoch[5] Train-RCNNLogLoss=3.034010
2017-08-12 13:28:57,680 Epoch[5] Train-RCNNL1Loss=0.668071
2017-08-12 13:56:58,643 Epoch[6] Batch [7700] Speed: 18.38 samples/sec Train-RPNAcc=0.949658, RPNLogLoss=0.131949, RPNL1Loss=0.049008, RCNNAcc=0.719020, RCNNLogLoss=3.034508, RCNNL1Loss=0.667635,
2017-08-12 13:57:20,066 Epoch[6] Batch [7800] Speed: 18.67 samples/sec Train-RPNAcc=0.949613, RPNLogLoss=0.132009, RPNL1Loss=0.049037, RCNNAcc=0.719066, RCNNLogLoss=3.034498, RCNNL1Loss=0.667675,
2017-08-12 13:57:41,680 Epoch[6] Batch [7900] Speed: 18.51 samples/sec Train-RPNAcc=0.949648, RPNLogLoss=0.131989, RPNL1Loss=0.049021, RCNNAcc=0.719084, RCNNLogLoss=3.034518, RCNNL1Loss=0.667912,
2017-08-12 13:58:01,598 Epoch[6] Batch [8000] Speed: 20.08 samples/sec Train-RPNAcc=0.949654, RPNLogLoss=0.131970, RPNL1Loss=0.049005, RCNNAcc=0.719224, RCNNLogLoss=3.034519, RCNNL1Loss=0.667619,
2017-08-12 13:58:22,971 Epoch[6] Batch [8100] Speed: 18.72 samples/sec Train-RPNAcc=0.949646, RPNLogLoss=0.131959, RPNL1Loss=0.048923, RCNNAcc=0.719403, RCNNLogLoss=3.034529, RCNNL1Loss=0.667227,
2017-08-12 13:58:44,812 Epoch[6] Batch [8200] Speed: 18.31 samples/sec Train-RPNAcc=0.949662, RPNLogLoss=0.131902, RPNL1Loss=0.048957, RCNNAcc=0.719327, RCNNLogLoss=3.034544, RCNNL1Loss=0.667533,
2017-08-12 13:58:59,598 Epoch[6] Train-RPNAcc=0.949654
2017-08-12 13:58:59,598 Epoch[6] Train-RPNLogLoss=0.131898
2017-08-12 13:58:59,598 Epoch[6] Train-RPNL1Loss=0.048909
2017-08-12 13:58:59,598 Epoch[6] Train-RCNNAcc=0.719371
2017-08-12 13:58:59,598 Epoch[6] Train-RCNNLogLoss=3.034529
2017-08-12 13:58:59,598 Epoch[6] Train-RCNNL1Loss=0.667335
It looks like the RCNN branch is still adjusting to something. Train for more epochs; the accuracy may rise. @louyanyang
@Franciszzj Thanks for your reply.
@louyanyang What does the hyper-parameter num_deformable_group mean? Does it mean different channels of the input get different offsets? According to the paper, the offsets should be shared across channels, with the number of offset filters = 2 × kernel_width × kernel_height?
@wenqingchu offset channels = num_deformable_group * 2 * kernel_height * kernel_width
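A minimal sketch of how that formula plays out, assuming the contrib DeformableConvolution operator (the layer names, input shape, and num_filter below are illustrative, not taken from this repo's symbols):

```python
import mxnet as mx

data = mx.symbol.Variable(name='data')  # e.g. a (N, 256, H, W) feature map

kernel = (3, 3)
num_deformable_group = 4

# offset channels = num_deformable_group * 2 * kernel_height * kernel_width
offset_channels = num_deformable_group * 2 * kernel[0] * kernel[1]  # 4 * 2 * 3 * 3 = 72

# A plain conv predicts one (dy, dx) pair per kernel position, per deformable group.
offset = mx.symbol.Convolution(name='conv_offset', data=data,
                               num_filter=offset_channels,
                               kernel=kernel, pad=(1, 1), stride=(1, 1))

# The deformable conv splits its input channels into num_deformable_group groups;
# each group is sampled with its own set of offsets.
conv = mx.contrib.symbol.DeformableConvolution(name='conv_dcn', data=data,
                                               offset=offset, num_filter=512,
                                               kernel=kernel, pad=(1, 1), stride=(1, 1),
                                               num_deformable_group=num_deformable_group,
                                               no_bias=True)
```

So with num_deformable_group = 1 the offset branch has 1 × 2 × 3 × 3 = 18 output channels and all input channels share a single offset field, while num_deformable_group = 4 gives 72, matching the two settings discussed above.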
@louyanyang Could you share your Caffe implementation? Thanks!
@louyanyang What about your results after training for more epochs? Does num_deformable_group matter much?
@Fragrance307 So sorry, I can't share the code due to company rules.
Training for more epochs is still bad, and num_deformable_group matters a lot in my Caffe implementation.
@louyanyang Got it, thanks!
@YuwenXiong The paper does not mention the deformable_group parameter. Have you done any experiments on it?
@louyanyang Sorry, I still don't understand the meaning of num_deformable_group... In which cases is num_deformable_group > 1 needed?