faster_rcnn_pytorch

Does the RoI pooling layer only support a batch size of 1?

Open zfchenUnique opened this issue 6 years ago • 5 comments

From the source code (roi_pooling_cuda.c) and my naive experiments, it seems that the RoI pooling layer only supports a batch size of one. Does anyone know why?

zfchenUnique avatar Oct 04 '17 02:10 zfchenUnique

It supports batch_size > 1. You can comment out the if statement in roi_pooling_cuda.c and rebuild it.

longcw avatar Oct 04 '17 02:10 longcw

@longcw @JeffCHEN2017 Does this project support a batch_size larger than 1?

brisker avatar Oct 09 '17 02:10 brisker

Yes, it does, as long as you comment out the relevant code in roi_pooling_cuda.c. Please read the code for details. I have been using it this way for a while, and so far so good.

zfchenUnique avatar Nov 05 '17 05:11 zfchenUnique

@JeffCHEN2017, @longcw, can you please elaborate on how you managed to train with a batch size larger than 1? As you suggested, I re-ran roi_pooling_cuda.c after commenting out the relevant lines. Then I set IMS_PER_BATCH: 4 in experiments/cfgs/faster_rcnn_end2end.yml. When I started training, I got the following:

  File "train.py", line 115, in <module>
    blobs = data_layer.forward()
  File "/home/sam/Projects/detection/frcnn.pytorch/faster_rcnn/roi_data_layer/layer.py", line 74, in forward
    blobs = self._get_next_minibatch()
  File "/home/sam/Projects/detection/frcnn.pytorch/faster_rcnn/roi_data_layer/layer.py", line 70, in _get_next_minibatch
    return get_minibatch(minibatch_db, self._num_classes)
  File "/home/sam/Projects/detection/frcnn.pytorch/faster_rcnn/roi_data_layer/minibatch.py", line 39, in get_minibatch
    assert len(im_scales) == 1, "Single batch only"
AssertionError: Single batch only

Then I commented out the relevant assert lines and got another error:

  File "train.py", line 123, in <module>
    net(im_data, im_info, gt_boxes, gt_ishard, dontcare_areas)
  File "/home/sam/.virtualenvs/cv/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/sam/Projects/detection/frcnn.pytorch/faster_rcnn/faster_rcnn.py", line 215, in forward
    features, rois = self.rpn(im_data, im_info, gt_boxes, gt_ishard, dontcare_areas)
  File "/home/sam/.virtualenvs/cv/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/sam/Projects/detection/frcnn.pytorch/faster_rcnn/faster_rcnn.py", line 71, in forward
    cfg_key, self._feat_stride, self.anchor_scales)
  File "/home/sam/Projects/detection/frcnn.pytorch/faster_rcnn/faster_rcnn.py", line 122, in proposal_layer
    x = proposal_layer_py(rpn_cls_prob_reshape, rpn_bbox_pred, im_info, cfg_key, _feat_stride, anchor_scales)
  File "/home/sam/Projects/detection/frcnn.pytorch/faster_rcnn/rpn_msr/proposal_layer.py", line 131, in proposal_layer
    proposals = bbox_transform_inv(anchors, bbox_deltas)
  File "/home/sam/Projects/detection/frcnn.pytorch/faster_rcnn/fast_rcnn/bbox_transform.py", line 59, in bbox_transform_inv
    pred_ctr_x = dx * widths[:, np.newaxis] + ctr_x[:, np.newaxis]
ValueError: operands could not be broadcast together with shapes (74592,1) (18648,1) 

It seems like it can't load the boxes? Have you experienced anything like this? If so, how did you manage to solve it?

samet-akcay avatar Nov 15 '17 12:11 samet-akcay
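A note on the shapes in the ValueError above: 74592 is exactly 4 × 18648, which matches IMS_PER_BATCH: 4. That suggests the box deltas cover all four images while the anchors were generated for a single image. The sketch below only reproduces that shape arithmetic with NumPy. The helper name `decode_ctr_x`, the dummy anchor values, and the idea of tiling the anchors once per image are assumptions made for illustration, not the repository's code, and the real proposal layer would still need per-image handling (clipping, NMS) after decoding.

```python
import numpy as np

def decode_ctr_x(anchors, deltas):
    """Centre-x part of a box decoding step, mirroring the failing line.

    Hypothetical helper for illustration only.
    """
    widths = anchors[:, 2] - anchors[:, 0] + 1.0
    ctr_x = anchors[:, 0] + 0.5 * widths
    dx = deltas[:, 0::4]
    return dx * widths[:, np.newaxis] + ctr_x[:, np.newaxis]

num_anchors = 18648   # anchors for one image (from the error message)
batch_size = 4        # IMS_PER_BATCH

# Dummy anchors: every box is [0, 0, 15, 15].
anchors = np.zeros((num_anchors, 4), dtype=np.float32)
anchors[:, 2:] = 15.0

# Deltas for the whole batch: 4 * 18648 = 74592 rows.
deltas = np.zeros((batch_size * num_anchors, 4), dtype=np.float32)

# Single-image anchors against batched deltas: same failure as the traceback.
try:
    decode_ctr_x(anchors, deltas)
    mismatch = False
except ValueError as err:
    mismatch = True
    print(err)

# Assumption: deltas are ordered image by image, so the anchors can be
# repeated once per image to line up with them.
tiled_anchors = np.tile(anchors, (batch_size, 1))
pred_ctr_x = decode_ctr_x(tiled_anchors, deltas)
print(pred_ctr_x.shape)
```

Whether tiling is the right fix depends on how the RPN outputs are reshaped for a batch, so the ordering of the deltas should be checked before relying on it.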

FWIW, there is also a subtle bug in the CUDA backward code for RoI pooling that manifests only when the batch size is > 1:

This line https://github.com/longcw/faster_rcnn_pytorch/blob/4fda7a4b89cf71fc3905bd484b1dc82dbc6150d1/faster_rcnn/roi_pooling/src/cuda/roi_pooling_kernel.cu#L170 should end with == (c * height + h) * width + w instead of == index, or else gradients will be propagated only into the first element of the batch. I discovered this issue while using the RoI pooling layer for another project.

lopuhin avatar Dec 12 '17 19:12 lopuhin
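To see why comparing against `index` would only work for the first image, here is a small NumPy model of a max-pooling backward pass. It is a simplified sketch, not the kernel itself: the function `roi_pool_backward`, the NCHW layout, and the assumption that the stored argmax is an offset within a single image (which is what the proposed fix implies) are all made up for illustration.

```python
import numpy as np

def roi_pool_backward(top_diff, argmax, roi_batch_inds, shape, use_fix):
    """Naive loop model of an RoI max-pooling backward pass.

    top_diff, argmax: arrays of shape (num_rois, C, PH, PW)
    shape: (N, C, H, W) of the bottom feature map
    use_fix: compare argmax with the per-image offset instead of `index`
    """
    N, C, H, W = shape
    bottom_diff = np.zeros(shape)
    for n in range(N):
        for c in range(C):
            for h in range(H):
                for w in range(W):
                    # Flat index over the whole batch (includes n).
                    index = ((n * C + c) * H + h) * W + w
                    # Offset within one image (does not include n).
                    target = (c * H + h) * W + w if use_fix else index
                    for r, roi_n in enumerate(roi_batch_inds):
                        if roi_n != n:
                            continue
                        hits = argmax[r, c] == target
                        bottom_diff[n, c, h, w] += top_diff[r, c][hits].sum()
    return bottom_diff

# Two images, one channel, 2x2 feature maps, one RoI per image, 1x1 pooling.
shape = (2, 1, 2, 2)
roi_batch_inds = [0, 1]
argmax = np.array([3, 2]).reshape(2, 1, 1, 1)  # per-image offsets of the maxima
top_diff = np.ones((2, 1, 1, 1))

buggy = roi_pool_backward(top_diff, argmax, roi_batch_inds, shape, use_fix=False)
fixed = roi_pool_backward(top_diff, argmax, roi_batch_inds, shape, use_fix=True)

print(buggy[1].sum(), fixed[1].sum())  # prints "0.0 1.0"
```

For the first image (n = 0) the two expressions are equal, so both variants pass the gradient through. For the second image `index` starts at C * H * W, a per-image offset can never match it, and the gradient for that image stays zero in the unfixed variant.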