
Problem in RCNN.ROI_SAMPLE_JIT=False

yinjunbo opened this issue 5 years ago • 13 comments

Thanks a lot for your fabulous work! Your code works well when ROI_SAMPLE_JIT=True, but a problem occurs in /lib/net/rcnn_net.py when ROI_SAMPLE_JIT=False, which seems related to tensor sizes:

  xyz_input = pts_input[..., 0:self.rcnn_input_channel].transpose(1, 2).unsqueeze(dim=3)
  xyz_feature = self.xyz_up_layer(xyz_input)
  rpn_feature = pts_input[..., self.rcnn_input_channel:].transpose(1, 2).unsqueeze(dim=3)
  merged_feature = torch.cat((xyz_feature, rpn_feature), dim=1)
  merged_feature = self.merge_down_layer(merged_feature)
  l_xyz, l_features = [xyz], [merged_feature.squeeze(dim=3)]

In my case (with the default settings you recommend), the processed xyz_input is a [4, 512, 64, 1, 5] tensor, which can't be processed by the shared MLP (actually a 1x1 conv with weight [128, 5, 1, 1]): RuntimeError: Expected 4-dimensional input for 4-dimensional weight [128, 5, 1, 1], but got 5-dimensional input of size [4, 512, 64, 1, 5] instead
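For reference, the mismatch can be reproduced in isolation. The shapes below are taken from the error message (4 scenes × 512 RoIs, 64 points, 5 channels), and flattening the two leading dimensions before the transpose is one way to produce the 4-D input that nn.Conv2d expects; this is a sketch, not the repository's actual fix:

```python
import torch
import torch.nn as nn

# Shapes from the error report: 4 scenes x 512 RoIs, 64 points with 5 channels each.
pts = torch.randn(4, 512, 64, 5)

conv = nn.Conv2d(5, 128, kernel_size=1)   # weight shape [128, 5, 1, 1]

# Merge the scene and RoI dimensions so Conv2d sees a 4-D tensor.
flat = pts.view(-1, 64, 5)                       # [2048, 64, 5]
xyz_input = flat.transpose(1, 2).unsqueeze(3)    # [2048, 5, 64, 1]
out = conv(xyz_input)
print(out.shape)  # torch.Size([2048, 128, 64, 1])
```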

yinjunbo avatar May 21 '19 12:05 yinjunbo

Have you followed all the instructions and kept the original code? I have checked the code and didn't find this problem. Maybe you could check the shapes by comparing them with the results from ROI_SAMPLE_JIT=True.

sshaoshuai avatar May 23 '19 05:05 sshaoshuai

I wonder about the accuracy difference between models trained with ROI_SAMPLE_JIT=True and ROI_SAMPLE_JIT=False. Also, was the pre-trained model you provide actually obtained with ROI_SAMPLE_JIT=False? In my case, the best result of models trained with ROI_SAMPLE_JIT=True on the validation set is around 87.31 (E), 77.68 (M), 77.07 (H).

yinjunbo avatar May 23 '19 08:05 yinjunbo

No, the released code is clearer than the original code since I re-implemented it to make it easier to understand, but the two have exactly the same network. The pretrained model was trained by saving the RoIs and features to disk, which were then used to train the RCNN stage. I think the main difference is that I shuffle all the RoIs when training the RCNN stage (this needs 50 GB+ of CPU memory to load all the saved features, and many people don't have that much, so I re-implemented it as in the released code to benefit more people). I trained several networks before releasing the code, and the best moderate 3D AP is about 78.4 with ROI_SAMPLE_JIT=False.

sshaoshuai avatar May 23 '19 08:05 sshaoshuai

I've tried training with ROI_SAMPLE_JIT=False twice, exactly following all the instructions; however, every time it reports the following: RuntimeError: Expected 4-dimensional input for 4-dimensional weight [128, 5, 1, 1], but got 5-dimensional input of size [4, 512, 64, 1, 5] instead. All the settings are exactly the same as with ROI_SAMPLE_JIT=True, except for ROI_SAMPLE_JIT=False, so is there something I missed?

yinjunbo avatar May 26 '19 12:05 yinjunbo

Hi shaoshuai, I actually faced the same problem. After checking the code, I found that it is caused by the dimensionality of the inputs when ROI_SAMPLE_JIT=False. This can easily be solved by merging the batch dimension with the feature dimension.

Cheers,

dingfuzhou avatar Jun 10 '19 13:06 dingfuzhou

Hi shaoshuai, thank you for the nice paper and code. I faced the same problem, and I am just wondering if it is possible for us to learn more details about your implementation (e.g., how you shuffled all the RoIs when training the RCNN stage). Those details could help us further improve our models. Thanks a lot in advance!

wei-OHK avatar Jun 18 '19 11:06 wei-OHK

Sorry for the late reply. The RPN stage first saves 300 proposals for each scene (3712 × 4 scenes in total). All the proposals are merged into the training set of the RCNN stage. When training the RCNN stage, each training batch is randomly fetched from this new training set (300 × 3712 × 4 proposals) with positive/negative samples.
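The scheme described above can be sketched roughly as follows; the pool is built from the stated counts, while the batch size and the feature loading are illustrative stand-ins:

```python
import numpy as np

# Rough sketch of the described offline RCNN training: every scene contributes
# 300 saved proposals, all merged into one shuffled pool (sizes per the comment).
num_scenes, props_per_scene = 3712 * 4, 300
pool = np.arange(num_scenes * props_per_scene)   # stand-ins for saved RoI records

rng = np.random.default_rng(0)
rng.shuffle(pool)   # shuffle across ALL scenes, not just within one scene

batch_size = 256    # illustrative; each batch mixes positive/negative samples
first_batch = pool[:batch_size]
# ... load the saved features/labels for these RoI indices and train the RCNN stage
print(len(pool), len(first_batch))  # 4454400 256
```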

sshaoshuai avatar Jul 22 '19 04:07 sshaoshuai

Hi, could you say a bit more about your result with ROI_SAMPLE_JIT=True on the validation set of around 87.31 (E), 77.68 (M), 77.07 (H)? How did you get that? I can only get 85 (E), 75 (M) with the default settings and ROI_SAMPLE_JIT=True, and 87 (E), 76 (M) when I set ROI_SAMPLE_JIT=False.

tsbiosky avatar Jul 26 '19 19:07 tsbiosky

All the settings are the same as in the instructions. I test the trained model at different epochs (steps), not only the last one.


yinjunbo avatar Jul 27 '19 07:07 yinjunbo

Hi, has anyone solved this problem? I ran into the same situation without changing anything.

timetomakesense avatar Oct 25 '19 10:10 timetomakesense

I met the same problem as well...

zhaoyl18 avatar Dec 20 '19 01:12 zhaoyl18

This might be due to an input-shape inconsistency between the two settings (ROI_SAMPLE_JIT on/off). Specifically, see the last few lines of proposal_target_layer.py: forward(), which merge the batch_size and num_points dimensions. But this is only done when ROI_SAMPLE_JIT=True, as indicated at line 123 of rcnn_net.py.

I have got the code running with ROI_SAMPLE_JIT=False with the following fixes:

  1. Add reshape operations in train_functions.py: get_rcnn_loss(). Change lines 125-129 to:

  reg_valid_mask = ret_dict['reg_valid_mask'].view(-1)
  roi_boxes3d = ret_dict['roi_boxes3d'].view(-1, 7)
  roi_size = roi_boxes3d[:, 3:6]
  gt_boxes3d_ct = ret_dict['gt_of_rois'].view(-1, 7)

  2. Add reshape operations to pts_input in rcnn_net.py: forward(). Insert the following two lines after line 163:

  pts_input = pts_input.view(-1, cfg.RCNN.NUM_POINTS, pts_input.shape[-1])
  target_dict['pts_input'] = pts_input
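Both fixes follow the same pattern: collapse the leading (batch, num_rois) dimensions into one before handing per-RoI tensors to the RCNN head. A minimal illustration with made-up shapes (the channel count and RoI count here are arbitrary, not the repo's actual values):

```python
import torch

# Illustrative shapes only: 4 scenes x 64 RoIs, 512 points, 133 channels.
batch, num_rois, num_points, channels = 4, 64, 512, 133
pts_input = torch.randn(batch, num_rois, num_points, channels)
gt_of_rois = torch.randn(batch, num_rois, 7)

# Flatten (batch, num_rois) -> batch * num_rois so every per-RoI tensor agrees.
pts_input = pts_input.view(-1, num_points, pts_input.shape[-1])  # [256, 512, 133]
gt_boxes3d_ct = gt_of_rois.view(-1, 7)                           # [256, 7]
print(pts_input.shape, gt_boxes3d_ct.shape)
```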

chengdazhi avatar Feb 11 '20 23:02 chengdazhi

1. In get_rcnn_training_sample_batch() of kitti_rcnn_dataset.py, before sample_info = {}, add:

  pts_features = np.concatenate((pts_input_ct[:, :, 3:], pts_features), axis=2)
  pts_input_ct = pts_input_ct[:, :, 0:3]

2. In model_fn() of train_functions.py, under the condition if not cfg.RCNN.ROI_SAMPLE_JIT:, add:

  input_data['pts_input'] = pts_input
  input_data['pts_input'] = input_data['pts_input'].view(-1, cfg.RCNN.NUM_POINTS, input_data['pts_input'].shape[3])
  input_data['pts_features'] = input_data['pts_features'].view(-1, cfg.RCNN.NUM_POINTS, input_data['pts_features'].shape[3])
  input_data['cls_label'] = input_data['cls_label'].view(-1)
  input_data['reg_valid_mask'] = input_data['reg_valid_mask'].view(-1)
  input_data['gt_boxes3d_ct'] = input_data['gt_boxes3d_ct'].view(-1, 7)
  input_data['roi_boxes3d'] = input_data['roi_boxes3d'].view(-1, 7)

xjjs avatar Feb 25 '20 04:02 xjjs