simple-faster-rcnn-pytorch

About RPN

Open ChuckGithub opened this issue 7 years ago • 20 comments

Thank you for your contribution. Can I extract the RPN module on its own, removing the Faster R-CNN classification steps?

ChuckGithub avatar Jan 27 '18 07:01 ChuckGithub

I think it would be OK if you only want to use it in a typical scenario (e.g. binary classification).

chenyuntc avatar Jan 27 '18 08:01 chenyuntc

OK, thanks.

ChuckGithub avatar Jan 27 '18 08:01 ChuckGithub

I want to detect a single category; what code needs to be changed?

ChuckGithub avatar Jan 27 '18 11:01 ChuckGithub

I think the easiest way is to rewrite your dataset. Keep RoIHead and RPN unmodified.
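
For example, a minimal sketch of such a wrapper (SingleClassDataset is a hypothetical name; VOCBboxDataset and its get_example signature are taken from this repo's data/voc_dataset.py):

import numpy as np
from data.voc_dataset import VOCBboxDataset  # this repo's VOC dataset

class SingleClassDataset:
    """Hypothetical wrapper: keep the boxes, collapse every label to class 0."""
    def __init__(self, data_dir, split='trainval'):
        self.db = VOCBboxDataset(data_dir, split=split)

    def __len__(self):
        return len(self.db)

    def get_example(self, i):
        img, bbox, label, difficult = self.db.get_example(i)
        # One foreground class: every object becomes label 0,
        # so n_class = 2 once the head adds background.
        label = np.zeros_like(label)
        return img, bbox, label, difficult

RPN and RoIHead then stay untouched; only the number of foreground classes passed to the model (n_fg_class in this repo) changes to 1.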

chenyuntc avatar Jan 27 '18 12:01 chenyuntc

OK, thanks. I recently bought your book <<深度学习框架PyTorch入门与实践>> and have been studying it; this project is a good way to practice. I'll have more questions for you when I run into problems.

ChuckGithub avatar Jan 27 '18 12:01 ChuckGithub

😄

chenyuntc avatar Jan 27 '18 12:01 chenyuntc

I want to separate out the RPN and do binary classification only, but execution fails at the lines below. Could you help me analyze it? Thanks.

mean = t.Tensor(self.loc_normalize_mean).cuda().repeat(self.n_class)[None]
std = t.Tensor(self.loc_normalize_std).cuda().repeat(self.n_class)[None]
roi_cls_loc = (roi_cls_loc * std + mean)

mean and std are both 1x8, and roi_cls_loc is 1x16650x4.

ChuckGithub avatar Jan 29 '18 09:01 ChuckGithub

mean and std should be 1x4.

chenyuntc avatar Jan 29 '18 10:01 chenyuntc

But after I changed the code to the following, the line roi = roi.view(-1, 1, 4).expand_as(roi_cls_loc) raises: RuntimeError: The expanded size of the tensor (8325) must match the existing size (300) at non-singleton dimension 0.

mean = t.Tensor(self.loc_normalize_mean).cuda().repeat(self.n_class-1)[None]
std = t.Tensor(self.loc_normalize_std).cuda().repeat(self.n_class-1)[None]
roi_cls_loc = (roi_cls_loc * std + mean)
roi_cls_loc = roi_cls_loc.view(-1, self.n_class, 4)
roi = roi.view(-1, 1, 4).expand_as(roi_cls_loc)

ChuckGithub avatar Jan 29 '18 11:01 ChuckGithub

You need to print the shapes of roi_cls_loc and roi and analyze them.

chenyuntc avatar Jan 29 '18 14:01 chenyuntc

roi is 300x4 and roi_cls_loc is 8325x2x4.

ChuckGithub avatar Jan 29 '18 14:01 ChuckGithub

roi should be 300x1x4 and roi_cls_loc should be 300x2x4.

Is something wrong elsewhere?
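
For reference, a minimal shape check of the denormalization with n_class = 2 (background plus one foreground class), using random stand-in tensors and the repo's default normalization values:

import torch as t

n_class = 2                                  # background + 1 foreground class
roi_cls_loc = t.randn(300, n_class * 4)      # head output: 4 loc values per class per RoI
roi = t.randn(300, 4)                        # 300 proposals from the RPN

loc_normalize_mean = (0., 0., 0., 0.)
loc_normalize_std = (0.1, 0.1, 0.2, 0.2)
mean = t.Tensor(loc_normalize_mean).repeat(n_class)[None]  # (1, 8)
std = t.Tensor(loc_normalize_std).repeat(n_class)[None]    # (1, 8)
roi_cls_loc = roi_cls_loc * std + mean

roi_cls_loc = roi_cls_loc.view(-1, n_class, 4)   # (300, 2, 4)
roi = roi.view(-1, 1, 4).expand_as(roi_cls_loc)  # (300, 2, 4): expand succeeds

Note that 8325x2x4 is exactly 16650x4 reshaped, so a roi_cls_loc of that shape looks like the RPN's per-anchor locs rather than the head's per-RoI output.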

chenyuntc avatar Jan 30 '18 01:01 chenyuntc

好的,我再看看,谢谢了

ChuckGithub avatar Jan 30 '18 06:01 ChuckGithub

@chenyuntc Do you mean that for binary classification we keep the RPN but can chop off the ROIHead (since, conditional on a box containing an object, it will always be class 0)? To improve speed.

ilkarman avatar Mar 16 '18 02:03 ilkarman

If it's only binary classification, you could use the RPN without the ROIHead. But it's better to keep the ROIHead, since RoIPooling crops the corresponding area from the feature map. An anchor in the RPN is predicted from only one pixel of the feature map, which means it may have too narrow a receptive field.
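
For illustration only (this repo ships its own RoI pooling module; torchvision.ops.roi_pool is used here as a stand-in with the same effect):

import torch
from torchvision.ops import roi_pool

features = torch.randn(1, 512, 50, 50)  # backbone feature map at stride 16
# Boxes are (batch_index, x1, y1, x2, y2) in input-image coordinates.
boxes = torch.tensor([[0., 64., 64., 320., 320.]])
# Pools a 7x7 grid over the whole proposal region of the feature map,
# so the head aggregates the entire object, not a single feature-map cell.
pooled = roi_pool(features, boxes, output_size=(7, 7), spatial_scale=1 / 16)
print(pooled.shape)  # torch.Size([1, 512, 7, 7])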

chenyuntc avatar Mar 16 '18 14:03 chenyuntc

@chenyuntc Interesting, thank you. Just to play, I have removed the head, and now the loss is just:

losses = [rpn_loc_loss, rpn_cls_loss]

However, the issue comes with the scores for RPN:

import numpy as np
import torch.nn.functional as F
from utils import array_tool as at  # this repo's numpy/torch conversion helpers

def try_rpn_prediction(img_id):
    # Switch the model to eval mode
    trainer.faster_rcnn.eval()
    # Forward pass through the RPN only (forward modified to also return the
    # RPN outputs plus the roi_order bookkeeping added below)
    _rpn_locs, rpn_scores, rois, _roi_indices, _anchors, roi_order = trainer.faster_rcnn(img_id)
    # Softmax over (background, foreground) to turn raw scores into probabilities
    rpn_scores = at.tonumpy(F.softmax(at.tovariable(np.squeeze(rpn_scores.data)), dim=1))
    rpn_scores = rpn_scores[:, 1]  # foreground probability
    # Return the RPN scores only for the ROIs that survived filtering and NMS
    return rois, rpn_scores[roi_order]

The roi_order variable is one I add to the ProposalCreator class to keep track of which score applies to which ROI:

    def __call__(self, loc, score, anchor, img_size, scale=1.):
        if self.parent_model.training:
            n_pre_nms = self.n_train_pre_nms
            n_post_nms = self.n_train_post_nms
        else:
            n_pre_nms = self.n_test_pre_nms
            n_post_nms = self.n_test_post_nms
        # Convert anchors into proposals via bbox transformations.
        roi = loc2bbox(anchor, loc)
        # Clip predicted boxes to image.
        roi[:, slice(0, 4, 2)] = np.clip(
            roi[:, slice(0, 4, 2)], 0, img_size[0])
        roi[:, slice(1, 4, 2)] = np.clip(
            roi[:, slice(1, 4, 2)], 0, img_size[1])

        # Remove predicted boxes with either height or width < threshold.
        min_size = self.min_size * scale
        hs = roi[:, 2] - roi[:, 0]
        ws = roi[:, 3] - roi[:, 1]
        keep = np.where((hs >= min_size) & (ws >= min_size))[0]
        
        roi = roi[keep, :]
        score = score[keep]
        roi_index = keep

        # Sort all (proposal, score) pairs by score from highest to lowest.
        # Take top pre_nms_topN (e.g. 6000).
        order = score.ravel().argsort()[::-1]
        if n_pre_nms > 0:
            order = order[:n_pre_nms]

        roi = roi[order, :]
        # Edit: apply to roi_index
        roi_index = roi_index[order]
        
        # Apply nms (e.g. threshold = 0.7).
        # Take after_nms_topN (e.g. 300).

        # unNOTE: something is wrong here!
        # TODO: remove cuda.to_gpu
        keep = non_maximum_suppression(
            cp.ascontiguousarray(cp.asarray(roi)),
            thresh=self.nms_thresh)
        if n_post_nms > 0:
            keep = keep[:n_post_nms]
            
        roi = roi[keep]
        # Edit: apply to roi_index
        roi_index = roi_index[keep]
        return roi, roi_index

However, this still doesn't quite match up. The rois are all sorted from most likely to least, yet when I index by this order variable I still notice some later entries with higher probability (although very few).

Regarding your last point: do you mean that the classification accuracy for one class would be improved for big objects if the ROIHead is included?

Thanks very much!

ilkarman avatar Mar 19 '18 19:03 ilkarman

Sorry just to ask further about your comment:

But it's better to keep the ROIHead, since RoIPooling crops the corresponding area from the feature map. An anchor in the RPN is predicted from only one pixel of the feature map, which means it may have too narrow a receptive field.

I don't quite understand that. I am using ResNet-50 with your code, which has a stride of 224/7 = 32, so one pixel on the feature map covers 32 pixels in the image. If I use just the RPN, can I compensate by adding bigger/smaller anchor sizes?
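
For context, the anchor geometry in this repo is set by generate_anchor_base in model/utils/bbox_tools.py; a sketch of widening the scale range (the specific numbers are only an example):

from model.utils.bbox_tools import generate_anchor_base  # from this repo

# Repo defaults: base_size=16, ratios=[0.5, 1, 2], anchor_scales=[8, 16, 32],
# i.e. anchor sides of roughly 128/256/512 px at feature stride 16.
anchor_base = generate_anchor_base(
    base_size=32,                 # ResNet-50 feature stride in this setup
    ratios=[0.5, 1, 2],
    anchor_scales=[2, 4, 8, 16],  # example values: adds smaller and larger anchors
)
print(anchor_base.shape)  # (12, 4): one (y1, x1, y2, x2) box per ratio/scale pair

RegionProposalNetwork takes the same ratios/anchor_scales arguments, so they would need to change there as well.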

ilkarman avatar Mar 19 '18 22:03 ilkarman

@chenyuntc Sorry to triple post, but I just wanted to bring up this quote from Kaiming He:

More importantly, RPN as a stand-alone pedestrian detector achieves an MR of 14.9% (Table 1). This result is competitive and is better than all but two state-of-the-art competitors on the Caltech dataset (Fig. 4). We note that unlike RoI pooling that may suffer from small regions, RPN is essentially based on fixed-size sliding windows (in a fully convolutional fashion) and thus avoids collapsing bins. RPN predicts small objects by using small anchors.

It seems that in their one-class case, RPN-only was even better?

ilkarman avatar Mar 29 '18 15:03 ilkarman

The author is missing F.softmax on the RPN scores, so what is actually used for ranking in the RPN is raw logits, not the expected probabilities.
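
A minimal sketch of the fix being described (the shapes and variable names mirror the RPN forward pass; whether this is already present in region_proposal_network.py should be checked against the current code):

import torch
import torch.nn.functional as F

# Hypothetical RPN output: batch n, feature map hh x ww, n_anchor anchors per cell.
n, hh, ww, n_anchor = 1, 37, 50, 9
rpn_scores = torch.randn(n, hh * ww * n_anchor, 2)  # raw logits, not probabilities

# Softmax over the (background, foreground) axis before ranking, so that
# ProposalCreator sorts proposals by probability rather than by raw logits.
rpn_softmax_scores = F.softmax(rpn_scores.view(n, hh, ww, n_anchor, 2), dim=4)
rpn_fg_scores = rpn_softmax_scores[:, :, :, :, 1].contiguous().view(n, -1)
print(rpn_fg_scores.shape)  # torch.Size([1, 16650]): matches the anchor count above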

FortiLeiZhang avatar Aug 31 '18 01:08 FortiLeiZhang

As a two-stage model, Faster R-CNN can train the RPN and the ROI head individually using roi_cls_locs, roi_scores, rpn_locs, and rpn_scores. But in your code, you only use roi_cls_locs and roi_scores to train the ROI head, and don't train the RPN. Am I right?
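
For comparison, a sketch of how a two-stage trainer typically combines all four losses into one backward pass (whether this matches trainer.py here needs to be checked against the code; the loss values below are placeholder scalars):

import torch

# Assume the four losses are scalar tensors computed earlier in the step.
rpn_loc_loss = torch.tensor(0.3, requires_grad=True)
rpn_cls_loss = torch.tensor(0.5, requires_grad=True)
roi_loc_loss = torch.tensor(0.4, requires_grad=True)
roi_cls_loss = torch.tensor(0.6, requires_grad=True)

# Summing all four means one optimizer step trains RPN and ROI head jointly.
total_loss = rpn_loc_loss + rpn_cls_loss + roi_loc_loss + roi_cls_loss
total_loss.backward()  # gradients flow to both stages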

YangYangGirl avatar Jun 17 '19 19:06 YangYangGirl