SparseR-CNN

About the loss format

Open FL77N opened this issue 4 years ago • 4 comments

Hi, thanks for your great work! I have a question: in the paper, the box format used when computing the loss is (cx, cy, w, h), but your code seems to use (x1, y1, x2, y2) instead. E.g.:

#   The target boxes are expected in format (center_x, center_y, w, h), normalized by the image size.

assert 'pred_boxes' in outputs
idx = self._get_src_permutation_idx(indices)
src_boxes = outputs['pred_boxes'][idx]
target_boxes = torch.cat([t['boxes_xyxy'][i] for t, (_, i) in zip(targets, indices)], dim=0)

losses = {}
loss_giou = 1 - torch.diag(box_ops.generalized_box_iou(src_boxes, target_boxes))
losses['loss_giou'] = loss_giou.sum() / num_boxes

image_size = torch.cat([v["image_size_xyxy_tgt"] for v in targets])
src_boxes_ = src_boxes / image_size
target_boxes_ = target_boxes / image_size

loss_bbox = F.l1_loss(src_boxes_, target_boxes_, reduction='none')
losses['loss_bbox'] = loss_bbox.sum() / num_boxes

return losses

FL77N avatar Apr 27 '21 09:04 FL77N

Hi~ Actually, (cx, cy, w, h) and (x1, y1, x2, y2) can be converted into each other with box_ops.box_cxcywh_to_xyxy and box_ops.box_xyxy_to_cxcywh.
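For readers following along, here is a minimal sketch of what such a conversion looks like. It is a plain-Python stand-in written for illustration, not the repository's box_ops code (which works on torch tensors); the function names cxcywh_to_xyxy and xyxy_to_cxcywh are hypothetical.

```python
# Illustrative stand-ins (hypothetical names), not the repository's box_ops functions.

def cxcywh_to_xyxy(box):
    """Convert (center_x, center_y, width, height) to (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def xyxy_to_cxcywh(box):
    """Convert (x1, y1, x2, y2) to (center_x, center_y, width, height)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)


box = (30.0, 40.0, 20.0, 10.0)          # center (30, 40), size 20 x 10
corners = cxcywh_to_xyxy(box)           # (20.0, 35.0, 40.0, 45.0)
round_trip = xyxy_to_cxcywh(corners)    # back to the original box
```

Both formats carry the same information, so converting there and back returns the original box.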

PeizeSun avatar Apr 27 '21 09:04 PeizeSun


Hmm, thank you. I mean: should I use the (x1, y1, x2, y2) format to compute the L1 loss, as your code does, instead of (cx, cy, w, h)?

FL77N avatar Apr 27 '21 09:04 FL77N

yep~
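As a rough illustration of why the choice of format matters for the L1 term, the sketch below computes a summed L1 loss on the same pair of boxes in both formats. It is plain Python rather than the repository's torch code. The helper names are hypothetical, and dividing x by the image width and y by the image height is an assumption about what image_size_xyxy_tgt holds.

```python
# Hypothetical helpers for illustration; the repository uses torch tensors.

def normalize_xyxy(box, width, height):
    """Scale pixel (x1, y1, x2, y2) coordinates by the image size (assumed layout)."""
    x1, y1, x2, y2 = box
    return (x1 / width, y1 / height, x2 / width, y2 / height)


def xyxy_to_cxcywh(box):
    """Convert (x1, y1, x2, y2) to (center_x, center_y, width, height)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)


def l1_loss_sum(pred, target):
    """Sum of absolute coordinate differences, like reduction='none' then .sum()."""
    return sum(abs(p - t) for p, t in zip(pred, target))


# A predicted box and a target box of the same size, shifted by 10 px on a 100 x 100 image.
pred = normalize_xyxy((10.0, 10.0, 50.0, 50.0), 100.0, 100.0)
target = normalize_xyxy((20.0, 20.0, 60.0, 60.0), 100.0, 100.0)

loss_xyxy = l1_loss_sum(pred, target)                                    # about 0.4
loss_cxcywh = l1_loss_sum(xyxy_to_cxcywh(pred), xyxy_to_cxcywh(target))  # about 0.2
```

The two values differ because a pure shift changes all four corner coordinates but only the two center coordinates, so the predictions and targets need to be in the same format as the one the loss weights were chosen for.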

PeizeSun avatar Apr 27 '21 09:04 PeizeSun


thanks

FL77N avatar Apr 27 '21 09:04 FL77N