
The network predicts absolute xy values instead of offsets to the grid cell as specified in the paper. Why is that?

Open meet-minimalist opened this issue 6 years ago • 3 comments

meet-minimalist, Apr 28 '19 06:04

Looking at the code, I find that the network predicts the center pixel coordinates together with the width and height, which is not what you describe.

Jinming-Su, May 26 '19 03:05

@Jinming-Su I think you are correct. He has encoded the 7x7x30 target vector in dataset.py, lines 20 to 25, as offsets to the grid cell.
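To make "offset to the grid cell" concrete, here is a minimal sketch of that kind of encoding. It is not the code from dataset.py: the function names `encode_center` and `decode_center` are hypothetical, and it assumes box centers normalized to [0, 1) and a grid of S = 7.

```python
GRID = 7  # assumed S = 7, matching the 7x7x30 target mentioned above


def encode_center(cx, cy, grid=GRID):
    """Map a normalized center (cx, cy) to (col, row, dx, dy).

    col/row index the responsible grid cell; dx/dy are the offsets of the
    center from that cell's top-left corner, measured in cell units.
    """
    col = min(int(cx * grid), grid - 1)  # clamp so cx == 1.0 stays in range
    row = min(int(cy * grid), grid - 1)
    dx = cx * grid - col
    dy = cy * grid - row
    return col, row, dx, dy


def decode_center(col, row, dx, dy, grid=GRID):
    """Inverse of encode_center: recover the normalized image-level center."""
    return (col + dx) / grid, (row + dy) / grid


col, row, dx, dy = encode_center(0.5, 0.3)
cx_back, cy_back = decode_center(col, row, dx, dy)
```

With this convention the network output for a cell only has to cover the range [0, 1) inside that cell, and the decoder adds the cell index back to obtain image coordinates.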

meet-minimalist, May 28 '19 09:05

In predict.py, the decoder function shows that the network predicts offsets to the grid cell. In loss.py, box1_xyxy and box2_xyxy are not built from the true center_x and center_y in image coordinates; they are built from the offsets to the specific grid cell, and the location of that grid cell would have to be added to get image coordinates. Here, however, we only want to compute the IoU between the predicted box and the target box, which belong to the same cell. Leaving out the cell location shifts both boxes by the same amount, and a common shift of center_x and center_y does not change the final IoU.
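A small numeric check of the shift argument, as a sketch rather than the code from loss.py. The helpers `to_xyxy` and `iou` are hypothetical, and the example assumes the predicted and target boxes sit in the same grid cell and that offsets and sizes are already expressed in the same units.

```python
def to_xyxy(cx, cy, w, h):
    """Convert a center/size box to corner format (x1, y1, x2, y2)."""
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def iou(a, b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) format."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


GRID = 7          # assumed 7x7 grid
col, row = 3, 2   # assumed cell shared by prediction and target

# Cell-relative centers, already divided by GRID so they share units with w, h.
pred = (0.4 / GRID, 0.6 / GRID, 0.30, 0.20)
target = (0.5 / GRID, 0.5 / GRID, 0.25, 0.25)

# IoU computed directly from the cell-relative boxes.
iou_local = iou(to_xyxy(*pred), to_xyxy(*target))

# IoU after adding the cell location to both centers (image coordinates).
shift_x, shift_y = col / GRID, row / GRID
pred_img = (pred[0] + shift_x, pred[1] + shift_y, pred[2], pred[3])
target_img = (target[0] + shift_x, target[1] + shift_y, target[2], target[3])
iou_image = iou(to_xyxy(*pred_img), to_xyxy(*target_img))
```

Up to floating-point rounding, `iou_local` and `iou_image` agree, because the intersection and both areas depend only on coordinate differences. This holds only when both boxes are shifted by the same cell location.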

jasonwjw, Jun 23 '20 11:06