simple-faster-rcnn-pytorch
About RPN
Thank you for your contribution. Can I extract the RPN module separately and remove the Faster RCNN classification steps?
I think it would be OK if you only want to use it in a typical scenario (e.g. binary classification).
OK, thanks.
I want to detect only one category. What code needs to be changed?
I think the easiest way is to rewrite your dataset. Keep RoIHead and RPN unmodified.
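For the single-category case, a minimal sketch of such a dataset rewrite might look like the following. The wrapper class is hypothetical, assuming a base dataset that returns (img, bbox, label, ...) tuples:

```python
import numpy as np
from torch.utils.data import Dataset

class SingleClassDataset(Dataset):
    """Hypothetical wrapper: maps every ground-truth label to class 0,
    so RPN and RoIHead stay unmodified and only ever see one category."""

    def __init__(self, base_dataset):
        self.base = base_dataset  # any dataset returning (img, bbox, label, ...)

    def __len__(self):
        return len(self.base)

    def __getitem__(self, i):
        img, bbox, label, *rest = self.base[i]
        label = np.zeros_like(label)  # collapse all categories into class 0
        return (img, bbox, label, *rest)
```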
OK, thanks. I recently bought your book 《深度学习框架PyTorch入门与实践》 and have been studying it, so this project is a good chance to practice. I'll have to ask you more questions when I run into problems.
:smile:
I want to split out the RPN and only do binary classification, but execution fails at the lines below. Could you help me analyze this? Thanks.
mean = t.Tensor(self.loc_normalize_mean).cuda().repeat(self.n_class)[None]
std = t.Tensor(self.loc_normalize_std).cuda().repeat(self.n_class)[None]
roi_cls_loc = (roi_cls_loc * std + mean)
mean and std are both 1x8, while roi_cls_loc is 1x16650x4.
mean and std should be 1x4.
But after I changed the code as follows, the line roi = roi.view(-1, 1, 4).expand_as(roi_cls_loc)
reports this error: RuntimeError: The expanded size of the tensor (8325) must match the existing size (300) at non-singleton dimension 0.
mean = t.Tensor(self.loc_normalize_mean).cuda().repeat(self.n_class-1)[None]
std = t.Tensor(self.loc_normalize_std).cuda().repeat(self.n_class-1)[None]
roi_cls_loc = (roi_cls_loc * std + mean)
roi_cls_loc = roi_cls_loc.view(-1, self.n_class, 4)
roi = roi.view(-1, 1, 4).expand_as(roi_cls_loc)
You need to print the shapes of roi_cls_loc and roi and analyze them.
roi is 300x4 and roi_cls_loc is 8325x2x4.
roi should be 300x1x4 and roi_cls_loc should be 300x2x4.
Is something wrong somewhere else?
OK, I'll take another look. Thanks.
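For reference, here is a standalone shape check that reproduces the intended broadcasting. It is a sketch with made-up tensors (n_class=2, 300 RoIs, CPU only):

```python
import torch as t

n_class = 2                                  # background + 1 foreground class
loc_normalize_mean = (0., 0., 0., 0.)
loc_normalize_std = (0.1, 0.1, 0.2, 0.2)

roi_cls_loc = t.randn(300, n_class * 4)      # head output: one loc per class per RoI
roi = t.randn(300, 4)                        # RoIs produced by the RPN

mean = t.Tensor(loc_normalize_mean).repeat(n_class)[None]  # (1, 8)
std = t.Tensor(loc_normalize_std).repeat(n_class)[None]    # (1, 8)

roi_cls_loc = roi_cls_loc * std + mean                     # broadcasts over 300 rows
roi_cls_loc = roi_cls_loc.view(-1, n_class, 4)             # (300, 2, 4)
roi = roi.view(-1, 1, 4).expand_as(roi_cls_loc)            # (300, 2, 4)
print(roi_cls_loc.shape, roi.shape)
```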
@chenyuntc Do you mean that if we have binary classification, we can keep the RPN but chop off the RoIHead (since, conditional on a box containing an object, it will always be class 0)? To improve speed.
If it's only binary classification, you could use the RPN without the RoIHead. But it's better to keep the RoIHead, since RoIPooling crops the corresponding area from the input image. An anchor calculated in the RPN has only one corresponding pixel in the feature map, which means it may have too narrow a receptive field.
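To illustrate why the RoIHead sees more context, here is a minimal sketch using torchvision.ops.roi_pool (not the repo's own RoI layer; the feature-map shape and box are made up):

```python
import torch
from torchvision.ops import roi_pool

feat = torch.randn(1, 512, 50, 50)  # feature map at 1/16 input resolution
# one RoI in input-image coordinates: (batch_index, x1, y1, x2, y2)
rois = torch.tensor([[0., 64., 64., 320., 320.]])
# roi_pool crops the matching feature-map area and pools it to a fixed 7x7 grid,
# so the head sees the whole region, not a single feature-map location
pooled = roi_pool(feat, rois, output_size=(7, 7), spatial_scale=1.0 / 16)
print(pooled.shape)  # torch.Size([1, 512, 7, 7])
```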
@chenyuntc Interesting, thank you. Just to experiment, I have removed the head, and the loss is now just:
losses = [rpn_loc_loss, rpn_cls_loss]
However, an issue arises with the scores from the RPN:
def try_rpn_prediction(img_id):
    # Put the model in eval mode
    trainer.faster_rcnn.eval()
    # Forward pass through the RPN only
    _rpn_locs, rpn_scores, rois, _roi_indices, _anchors, roi_order = trainer.faster_rcnn(img_id)
    rpn_scores = at.tonumpy(F.softmax(at.tovariable(np.squeeze(rpn_scores.data)), dim=1))
    rpn_scores = np.squeeze(rpn_scores)[:, 1]  # foreground class
    # Return the RPN scores only for relevant ROIs (i.e. bounding boxes)
    return rois, rpn_scores[roi_order]
The roi_order variable is one I added to the ProposalCreator class to keep track of which scores apply to which RoI:
def __call__(self, loc, score, anchor, img_size, scale=1.):
    if self.parent_model.training:
        n_pre_nms = self.n_train_pre_nms
        n_post_nms = self.n_train_post_nms
    else:
        n_pre_nms = self.n_test_pre_nms
        n_post_nms = self.n_test_post_nms

    # Convert anchors into proposals via bbox transformations.
    roi = loc2bbox(anchor, loc)

    # Clip predicted boxes to the image.
    roi[:, slice(0, 4, 2)] = np.clip(
        roi[:, slice(0, 4, 2)], 0, img_size[0])
    roi[:, slice(1, 4, 2)] = np.clip(
        roi[:, slice(1, 4, 2)], 0, img_size[1])

    # Remove predicted boxes with either height or width < threshold.
    min_size = self.min_size * scale
    hs = roi[:, 2] - roi[:, 0]
    ws = roi[:, 3] - roi[:, 1]
    keep = np.where((hs >= min_size) & (ws >= min_size))[0]
    roi = roi[keep, :]
    score = score[keep]
    roi_index = keep

    # Sort all (proposal, score) pairs by score from highest to lowest.
    # Take top pre_nms_topN (e.g. 6000).
    order = score.ravel().argsort()[::-1]
    if n_pre_nms > 0:
        order = order[:n_pre_nms]
    roi = roi[order, :]
    # Edit: apply the same ordering to roi_index
    roi_index = roi_index[order]

    # Apply NMS (e.g. threshold = 0.7).
    # Take after_nms_topN (e.g. 300).
    # unNOTE: something is wrong here!
    # TODO: remove cuda.to_gpu
    keep = non_maximum_suppression(
        cp.ascontiguousarray(cp.asarray(roi)),
        thresh=self.nms_thresh)
    if n_post_nms > 0:
        keep = keep[:n_post_nms]
    roi = roi[keep]
    # Edit: apply the same selection to roi_index
    roi_index = roi_index[keep]
    return roi, roi_index
However, this still doesn't quite match up. The rois are supposed to be sorted from most likely to least likely, yet when I index by this roi_order variable I still notice some entries further down with higher probability (although very few).
Related to your last point: do you mean the classification accuracy for one class would be improved for big objects if the RoIHead is included?
Thanks very much!
Sorry, just to ask further about your comment:
But it's better to keep the RoIHead, since RoIPooling crops the corresponding area from the input image. An anchor calculated in the RPN has only one corresponding pixel in the feature map, which means it may have too narrow a receptive field.
I don't quite understand that. I am using ResNet50 with your code, which has a stride of 224/7 = 32, so one pixel on the feature map covers 32 pixels in the image. If I use just the RPN, can I compensate by adding bigger/smaller anchor sizes?
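For what it's worth, anchor sizes are controlled where the anchor base is generated. Here is a sketch in the style of the repo's generate_anchor_base, with an extra small scale added; the exact values are illustrative, not a recommendation:

```python
import numpy as np

def generate_anchor_base(base_size=32, ratios=(0.5, 1, 2),
                         anchor_scales=(2, 4, 8, 16)):
    # base_size = feature stride (32 for the ResNet50 setup above);
    # the added scale 2 yields 64-pixel anchors for smaller objects.
    anchor_base = np.zeros((len(ratios) * len(anchor_scales), 4),
                           dtype=np.float32)
    for i, ratio in enumerate(ratios):
        for j, scale in enumerate(anchor_scales):
            h = base_size * scale * np.sqrt(ratio)
            w = base_size * scale * np.sqrt(1. / ratio)
            idx = i * len(anchor_scales) + j
            anchor_base[idx] = [-h / 2., -w / 2., h / 2., w / 2.]
    return anchor_base
```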
@chenyuntc Sorry to triple-post, but I just wanted to bring up this quote from Kaiming:
More importantly, RPN as a stand-alone pedestrian detector achieves an MR of 14.9% (Table 1). This result is competitive and is better than all but two state-of-the-art competitors on the Caltech dataset (Fig. 4). We note that unlike RoI pooling that may suffer from small regions, RPN is essentially based on fixed-size sliding windows (in a fully convolutional fashion) and thus avoids collapsing bins. RPN predicts small objects by using small anchors
It seems that in their case, for one class, RPN-only was even better?
The author is missing F.softmax for the RPN scores, so what the author actually uses for ranking in the RPN is raw logits, not the expected probabilities.
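If so, normalizing before ranking would look roughly like this. This is a sketch assuming rpn_scores are raw logits of shape (N, H*W*A, 2) with the foreground class at index 1:

```python
import torch.nn.functional as F

def foreground_probs(rpn_scores):
    # rpn_scores: (N, H*W*A, 2) raw logits from the RPN classifier.
    # Softmax over the (bg, fg) pair turns them into probabilities,
    # so ranking proposals by score matches ranking by P(foreground).
    probs = F.softmax(rpn_scores, dim=2)
    return probs[:, :, 1]  # (N, H*W*A) foreground probabilities
```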
As a two-stage model, Faster RCNN can train the RPN and the RoI head individually using roi_cls_locs, roi_scores, rpn_locs, and rpn_scores. But in your code, you only use roi_cls_locs and roi_scores to train the RoI head, and don't train the RPN. Am I right?
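For reference, jointly training both stages would sum all four terms, roughly like this. This is a sketch; the variable names follow this thread, not necessarily the repo's trainer:

```python
# Hypothetical joint objective: both RPN and RoI head terms contribute,
# so one backward pass updates the extractor, the RPN, and the head together.
losses = [rpn_loc_loss, rpn_cls_loss, roi_loc_loss, roi_cls_loss]
total_loss = sum(losses)
total_loss.backward()
```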