
Ground-truth bounding boxes included in RPN proposals?

Open florisdf opened this issue 2 years ago • 3 comments

Hi,

I am training a Faster-RCNN-like architecture and wanted to log the detection losses on the validation dataset. To obtain these losses, I passed in the targets of the validation data to the forward() method of GeneralizedRCNN. When investigating the results, however, I noticed that each ground-truth bounding box had a near-exact match with a predicted bounding box. When passing targets=None, these bounding boxes were gone. Is this the intended behavior? If so, what is the rationale behind it?

I figured that this line is where the predicted and ground-truth boxes get mixed:

https://github.com/pytorch/vision/blob/095cabb76cdcd4763bad629481d189b91e3df42c/torchvision/models/detection/roi_heads.py#L646

It seems to have something to do with balancing positive/negative sampling. Yet, I don't see how a model would improve from boxes that it did not generate itself.
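To make the effect concrete, here is a minimal, dependency-free sketch of what mixing ground-truth boxes into the proposals does. It is an illustration, not the torchvision code: the helper names `append_gt_to_proposals` and `iou` and the example boxes are made up, and boxes are plain `(x1, y1, x2, y2)` tuples rather than tensors. Once a ground-truth box is in the proposal list, it matches itself with an IoU of exactly 1, which would explain the near-exact matches described above.

```python
# Illustrative sketch only: plain tuples stand in for box tensors,
# and both helper names are hypothetical.

def append_gt_to_proposals(proposals, gt_boxes):
    """Concatenate, image by image, the RPN proposals with the ground-truth boxes."""
    return [props + gts for props, gts in zip(proposals, gt_boxes)]


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


# One image with two imperfect RPN proposals and one ground-truth box.
proposals = [[(0.0, 0.0, 10.0, 10.0), (50.0, 50.0, 60.0, 60.0)]]
gt_boxes = [[(2.0, 2.0, 12.0, 12.0)]]

mixed = append_gt_to_proposals(proposals, gt_boxes)
gt = gt_boxes[0][0]

best_before = max(iou(p, gt) for p in proposals[0])  # best RPN-only overlap
best_after = max(iou(p, gt) for p in mixed[0])       # overlap once GT is appended

print(f"best IoU without GT: {best_before:.3f}, with GT appended: {best_after:.3f}")
```

In this toy case neither RPN proposal reaches an IoU of 0.5 with the ground truth, so without the appended box the image would contribute no positive sample at that threshold.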

florisdf, Apr 04 '22 18:04

Facing the same issue. Using this line at test time gives an mAP of ~100%. However, using it at train time but not at test time results in poor convergence (I can't get the mAP even up to 10%). Should we be returning the original proposals during train time as well?

java-abhinav07, Aug 02 '22 06:08

> Facing the same issue. Using this line during test time gives an mAP of ~100%, however including this during train time and not during test results in poor convergence (unable to get mAP even up to 10%). Should we be returning the original proposals during train time as well?

After some experiments, it turns out that training performance is not affected by the add_gt_proposals method, and everything works fine as long as it is not used during inference. Performance was largely determined by the use of a pretrained RPN and FPN.

java-abhinav07, Aug 03 '22 10:08

See https://github.com/facebookresearch/maskrcnn-benchmark/issues/570#issuecomment-473218934. Basically, it's a hack to train the classifier without the box regressor receiving gradients.
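One way to read this, sketched under assumptions: in the usual Faster R-CNN box parameterization, a proposal that coincides with its ground-truth box has all-zero regression targets, so it serves as a clean positive for the classifier while asking the regressor only for zero offsets. The function name `encode_deltas` is hypothetical, and the weights `(10, 10, 5, 5)` are assumed here as a commonly used setting, not taken from this thread. This is my illustration, not a restatement of the linked comment.

```python
import math


def encode_deltas(proposal, gt, weights=(10.0, 10.0, 5.0, 5.0)):
    """Regression target (dx, dy, dw, dh) that maps `proposal` onto `gt`.

    Boxes are (x1, y1, x2, y2) tuples; `weights` is an assumed scaling.
    """
    pw, ph = proposal[2] - proposal[0], proposal[3] - proposal[1]
    px, py = proposal[0] + 0.5 * pw, proposal[1] + 0.5 * ph
    gw, gh = gt[2] - gt[0], gt[3] - gt[1]
    gx, gy = gt[0] + 0.5 * gw, gt[1] + 0.5 * gh
    wx, wy, ww, wh = weights
    return (
        wx * (gx - px) / pw,       # horizontal centre shift, relative to width
        wy * (gy - py) / ph,       # vertical centre shift, relative to height
        ww * math.log(gw / pw),    # log width ratio
        wh * math.log(gh / ph),    # log height ratio
    )


gt = (2.0, 2.0, 12.0, 12.0)
rpn_proposal = (0.0, 0.0, 10.0, 10.0)

target_for_gt_proposal = encode_deltas(gt, gt)             # proposal == ground truth
target_for_rpn_proposal = encode_deltas(rpn_proposal, gt)  # ordinary, shifted proposal

print(target_for_gt_proposal)
print(target_for_rpn_proposal)
```

The sketch only shows that the target is zero for an appended ground-truth box. Whether the regressor receives a gradient from it still depends on the loss and on what the regressor currently predicts.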

crazyboy9103, Jan 28 '24 05:01