faster-rcnn.pytorch
Problem with function bbox_overlaps_batch(anchors, gt_boxes)
Hello,
I'm studying this code and have a question about the function bbox_overlaps_batch in https://github.com/jwyang/faster-rcnn.pytorch/blob/master/lib/model/rpn/bbox_transform.py. The gt_boxes parameter has size (b, K, 5), but the value of K is actually dynamic: it differs from image to image and may even be zero. So how does the function handle gt_boxes as a batch?
Thanks.

I understand the confusion there. It took me a while to figure it out as well, but I believe it works because the gt_boxes used as input are pre-processed in the roibatchLoader so that they have fixed dimensions (through zero-padding).
e.g. shape = [batch_size, K, 5]
where K = cfg.MAX_NUM_GT_BOXES, so it is set in the config file.
The roibatchLoader file might shed some more light on this for you:
https://github.com/jwyang/faster-rcnn.pytorch/blob/master/lib/roi_data_layer/roibatchLoader.py
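To make the padding idea concrete, here is a minimal sketch of how a variable number of boxes per image can be brought to a fixed shape. It is not the repository's code: the helper pad_gt_boxes and the constant MAX_NUM_GT_BOXES = 30 are assumptions for illustration, and NumPy is used instead of PyTorch to keep it short.

```python
import numpy as np

# Assumed value, standing in for cfg.MAX_NUM_GT_BOXES.
MAX_NUM_GT_BOXES = 30

def pad_gt_boxes(gt_boxes, max_num_boxes=MAX_NUM_GT_BOXES):
    """Pad (or truncate) n rows of [x1, y1, x2, y2, cls] to shape (max_num_boxes, 5).

    Hypothetical helper; returns the padded array and the number of real boxes.
    """
    gt_boxes = np.asarray(gt_boxes, dtype=np.float32)
    if gt_boxes.size == 0:
        # An image without annotations becomes an empty (0, 5) array.
        gt_boxes = gt_boxes.reshape(0, 5)
    num_boxes = min(gt_boxes.shape[0], max_num_boxes)
    padded = np.zeros((max_num_boxes, 5), dtype=np.float32)
    padded[:num_boxes] = gt_boxes[:num_boxes]
    return padded, num_boxes

# Two images with different numbers of objects (one has none) still stack into one batch.
img_a = [[10, 20, 50, 80, 1], [30, 30, 60, 90, 2]]
img_b = []
padded_a, n_a = pad_gt_boxes(img_a)
padded_b, n_b = pad_gt_boxes(img_b)
batch = np.stack([padded_a, padded_b])
print(batch.shape, n_a, n_b)
```

Returning num_boxes alongside the padded array lets later stages tell real boxes from padding rows.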
@AlexanderHustinx Hi. Do you have any idea why the author forces the number of gt boxes to be 30? It seems that padded gt boxes with (x1, y1, x2, y2) = (0, 0, 0, 0) are also used to compute the RPN loss. Thanks.
Hey @chenchr,
It's been a while since I've looked through this code. But I can see they do pass num_boxes to the RPN too.
Since they pass num_boxes together with gt_boxes when calculating the RPN anchor targets during training, I assume they only use the gt_boxes that aren't empty (i.e. not [0,0,0,0,0]).
If I recall correctly, you can set the max number of gt boxes. I think the authors chose e.g. 30 as the max because in PASCAL and COCO it is very unlikely to see more than 30 objects in a single image.
Additionally, I believe they might be selecting the gt_boxes randomly from all the objects in the image. That way the order isn't fixed, and if an image has more than the max number of gt boxes, a different subset may be selected in the next epoch, which could reduce overfitting.
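The random selection described above could look roughly like the sketch below: shuffle the boxes, then keep at most the maximum number. This illustrates the idea only and is not confirmed to match the repository; the helper name sample_gt_boxes and the seeded generator are assumptions.

```python
import numpy as np

def sample_gt_boxes(gt_boxes, max_num_boxes, rng):
    """Shuffle the rows of an (n, 5) array and keep at most max_num_boxes of them.

    Hypothetical helper: each call may keep a different subset when n > max_num_boxes.
    """
    gt_boxes = np.asarray(gt_boxes, dtype=np.float32)
    order = rng.permutation(gt_boxes.shape[0])
    return gt_boxes[order[:max_num_boxes]]

rng = np.random.default_rng(0)  # seeded only to make the sketch reproducible
# Five made-up boxes; the class id in the last column doubles as a row id here.
boxes = np.array([[i, i, i + 10, i + 10, i] for i in range(5)], dtype=np.float32)
subset = sample_gt_boxes(boxes, max_num_boxes=3, rng=rng)
print(subset.shape)
```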
A little caveat: I haven't looked at this code much recently and don't recall everything entirely. So maybe not everything is 100% correct.
@AlexanderHustinx Thank you very much for your detailed reply. I found that when the box loss is computed in the RPN or R-CNN, the ground-truth boxes with (0, 0, 0, 0) are masked out, so they should not affect the loss computation.
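As a rough illustration of that masking, the sketch below computes IoU between anchors and zero-padded gt boxes and then zeroes the columns that belong to padding rows. It assumes the "+1 pixel" width/height convention, so an all-zero row has width and height 1. The function name iou_with_padding_mask is made up for this example, and the code is a simplified single-image version, not the repository's bbox_overlaps_batch.

```python
import numpy as np

def iou_with_padding_mask(anchors, gt_boxes):
    """anchors: (N, 4); gt_boxes: (K, 5) with zero rows as padding. Returns (N, K) IoU."""
    anchors = np.asarray(anchors, dtype=np.float32)
    gt = np.asarray(gt_boxes, dtype=np.float32)[:, :4]

    # Widths and heights with the +1 pixel convention (assumption).
    a_w = anchors[:, 2] - anchors[:, 0] + 1
    a_h = anchors[:, 3] - anchors[:, 1] + 1
    g_w = gt[:, 2] - gt[:, 0] + 1
    g_h = gt[:, 3] - gt[:, 1] + 1
    a_area = (a_w * a_h)[:, None]  # (N, 1)
    g_area = (g_w * g_h)[None, :]  # (1, K)

    # Pairwise intersection via broadcasting to (N, K).
    iw = np.minimum(anchors[:, None, 2], gt[None, :, 2]) - np.maximum(anchors[:, None, 0], gt[None, :, 0]) + 1
    ih = np.minimum(anchors[:, None, 3], gt[None, :, 3]) - np.maximum(anchors[:, None, 1], gt[None, :, 1]) + 1
    inter = np.clip(iw, 0, None) * np.clip(ih, 0, None)
    iou = inter / (a_area + g_area - inter)

    # Padding rows have width == height == 1 under this convention; zero their columns
    # so they can never be selected as the best-matching gt box.
    is_padding = (g_w == 1) & (g_h == 1)
    iou[:, is_padding] = 0.0
    return iou

anchors = [[0, 0, 9, 9], [0, 0, 4, 4]]
gt_boxes = [[0, 0, 9, 9, 1], [0, 0, 0, 0, 0]]  # second row is padding
iou = iou_with_padding_mask(anchors, gt_boxes)
print(iou)
```

Without the mask, the padding row would overlap any anchor that covers pixel (0, 0) by one pixel and produce a small but nonzero IoU, which is why the explicit zeroing matters.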
@chenchr I did not understand why some overlaps were masked with 0 while others were masked with -1. Can you explain why this is so?