Machine-Learning-Collection YOLO ground truth width and length are not relative to image size but to S

Open oonisim opened this issue 2 years ago • 0 comments

Code

dataset.py calculates width_cell and height_cell, which are then written into the label_matrix tensor.

"""
...
Then to find the width relative to the cell is simply:
width_pixels/cell_pixels, simplification leads to the
formulas below.
"""
width_cell, height_cell = (
    width * self.S,
    height * self.S,
)

Question

Please help me understand why width_cell and height_cell are in units of cells, that is, relative to S instead of the image size.

As I understand it, width and height come from the YOLO Darknet annotation, where they are relative to the image size and take values between 0 and 1. Suppose width=0.7; then (with S=7) width_cell will be 4.9 cells.
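The arithmetic can be checked with a minimal sketch. It assumes S = 7, which is what the 0.7 to 4.9 example implies, and the helper name to_cell_units is hypothetical: it only mirrors the formula quoted from dataset.py and is not part of the repository.

```python
def to_cell_units(width, height, S=7):
    """Scale image-relative sizes (0..1) by the grid size S,
    mirroring the formula quoted from dataset.py.
    S=7 is an assumption taken from the example in this issue."""
    return width * S, height * S


# Darknet-style annotation: width and height relative to the image.
width, height = 0.7, 0.5

width_cell, height_cell = to_cell_units(width, height)

# The results are measured in grid cells, so they can exceed 1.
print(f"width_cell={width_cell:.2f}, height_cell={height_cell:.2f}")
```

With these inputs the box spans several cells, so the values are no longer confined to the 0 to 1 range of the original annotation.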

If width_cell and height_cell are used as the ground truth for YOLO v1 training, I would expect them to be relative to the image size, as in the YOLO v1 paper:

Each bounding box consists of 5 predictions: x, y, w, h, and confidence. The (x, y) coordinates represent the center of the box relative to the bounds of the grid cell. The width and height are predicted relative to the whole image.
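To compare the two conventions, here is a small sketch, again assuming S = 7. Dividing a cell-unit size by S recovers the image-relative size that the paper describes. The helper from_cell_units is hypothetical and only illustrates the unit conversion; it does not describe how the repository's loss or model handles these values.

```python
def from_cell_units(width_cell, height_cell, S=7):
    """Convert sizes measured in grid cells back to sizes
    relative to the whole image (the paper's convention).
    S=7 is an assumption for illustration."""
    return width_cell / S, height_cell / S


# Cell-unit values as produced by width * S and height * S.
w_img, h_img = from_cell_units(4.9, 3.5)

# Image-relative sizes stay within [0, 1] for boxes inside the image.
print(f"w_img={w_img:.2f}, h_img={h_img:.2f}")
```

The two representations carry the same information and differ only by the constant factor S, so the question is which scale the training target is expected to use.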

Feb 26 '23 07:02 oonisim