Ground-truth bounding boxes included in RPN proposals?
Hi,
I am training a Faster R-CNN-like architecture and wanted to log the detection losses on the validation dataset. To obtain these losses, I passed the targets of the validation data to the forward() method of GeneralizedRCNN. When investigating the results, however, I noticed that each ground-truth bounding box had a near-exact match among the predicted bounding boxes. When passing targets=None, these bounding boxes were gone. Is this the intended behavior? If so, what is the rationale behind it?
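For context, here is a minimal sketch of the setup being described, using torchvision's stock Faster R-CNN (the model/weights choice is illustrative, not the actual code in question): in train mode with targets the model returns a loss dict, while in eval mode with targets=None it returns detections.

```python
import torch
import torchvision

# Illustrative model; the torchvision >= 0.13 weights API is assumed here.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")

images = [torch.rand(3, 300, 400)]
targets = [{
    "boxes": torch.tensor([[50.0, 50.0, 150.0, 150.0]]),  # (x1, y1, x2, y2)
    "labels": torch.tensor([1]),
}]

# Train mode + targets: forward() returns the detection losses, which is
# how one would log validation losses as described above.
model.train()
with torch.no_grad():
    loss_dict = model(images, targets)
print({k: float(v) for k, v in loss_dict.items()})

# Eval mode + targets=None: forward() returns detections, and no
# ground-truth boxes are mixed into the proposals.
model.eval()
with torch.no_grad():
    detections = model(images)
```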
I figured out that this is the line where the predicted and ground-truth boxes get mixed:
https://github.com/pytorch/vision/blob/095cabb76cdcd4763bad629481d189b91e3df42c/torchvision/models/detection/roi_heads.py#L646
It seems to have something to do with balancing positive/negative sampling. Yet, I don't see how a model would improve from boxes that it did not generate itself.
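For reference, the linked helper essentially just concatenates each image's ground-truth boxes onto its proposal list before the positive/negative sampling step. A paraphrase of the torchvision code (exact names may differ between versions):

```python
import torch
from typing import List
from torch import Tensor

# Paraphrase of RoIHeads.add_gt_proposals: ground-truth boxes are appended
# to the RPN proposals of each image, so the subsequent subsampling always
# has perfectly localized positive examples to draw from.
def add_gt_proposals(proposals: List[Tensor], gt_boxes: List[Tensor]) -> List[Tensor]:
    return [
        torch.cat((proposal, gt_box))
        for proposal, gt_box in zip(proposals, gt_boxes)
    ]
```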
Facing the same issue. Using this line at test time gives an mAP of ~100%; however, including it at train time but not at test time results in poor convergence (unable to get mAP even up to 10%). Should we be returning the original proposals at train time as well?
Having done some experiments, it turns out that training performance is not impacted by the add_gt_proposals method, and everything works fine as long as it is not used during inference. Performance was largely impacted by the use of a pretrained RPN and FPN.
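To make that concrete, in the torchvision source the ground-truth mixing is reached only through select_training_samples, which is guarded by the training flag, so eval-mode inference never goes down that path. A simplified sketch of the control flow (paraphrased from the linked roi_heads.py):

```python
import torch

# Simplified control flow of RoIHeads.forward: add_gt_proposals is only
# reached on the training branch, which is why inference is unaffected as
# long as the model is in eval mode.
class RoIHeadsSketch(torch.nn.Module):
    def select_training_samples(self, proposals, targets):
        gt_boxes = [t["boxes"] for t in targets]
        # append ground-truth boxes to each image's proposals (matching and
        # positive/negative subsampling are omitted in this sketch)
        return [torch.cat((p, gt)) for p, gt in zip(proposals, gt_boxes)]

    def forward(self, proposals, targets=None):
        if self.training:
            # only the training branch ever mixes in the ground-truth boxes
            proposals = self.select_training_samples(proposals, targets)
        # ... box head classification and regression follow either way ...
        return proposals
```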
See https://github.com/facebookresearch/maskrcnn-benchmark/issues/570#issuecomment-473218934. Basically, it's a hack to train the classifier without the box regressor obtaining the gradients.
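For what it's worth, with the standard R-CNN box parameterization, a proposal that coincides with its matched ground-truth box encodes to all-zero regression targets. A small self-contained check (encode_deltas is a hypothetical helper written for this post; torchvision's BoxCoder.encode computes the same quantities, up to per-coordinate weights):

```python
import torch

# When a ground-truth box is matched to itself as a proposal, the standard
# (dx, dy, dw, dh) regression targets are exactly zero.
def encode_deltas(proposal: torch.Tensor, gt: torch.Tensor) -> torch.Tensor:
    pw, ph = proposal[2] - proposal[0], proposal[3] - proposal[1]
    px, py = proposal[0] + 0.5 * pw, proposal[1] + 0.5 * ph
    gw, gh = gt[2] - gt[0], gt[3] - gt[1]
    gx, gy = gt[0] + 0.5 * gw, gt[1] + 0.5 * gh
    return torch.stack([(gx - px) / pw, (gy - py) / ph,
                        torch.log(gw / pw), torch.log(gh / ph)])

box = torch.tensor([50.0, 50.0, 150.0, 150.0])  # (x1, y1, x2, y2)
print(encode_deltas(box, box))  # tensor([0., 0., 0., 0.])
```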