PyTorch-YOLOv3
What does the custom bbox annotation format look like? (Pixel coordinates or ratios?)
Hi, thanks for sharing this great repo. I am trying to create my own dataset with my own annotations. However, I do not know whether the bbox coordinates have to be ratios (normalized to 0~1) or the real pixel coordinates of the object. As I read the paper, the bbox info contains the following: {bx,by,bh,bw}, where bx and by are bounded within 0~1 and represent where the center lies relative to its assigned grid cell.
Could you please let me know whether I should annotate my bboxes as ratios or as real pixel coordinates?
Thanks!!
If you have the real pixel coordinates, just divide the x_center and width by the width of your image, and the y_center and height by the height of your image, to normalize.
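The normalization described above can be sketched as follows. This is only a minimal illustration, not code from the repo: the function names `pixel_box_to_normalized`, `corners_to_normalized`, and `format_label_line` are hypothetical, and the label layout (class id followed by x_center, y_center, width, height) is assumed from the example label quoted later in this thread.

```python
def pixel_box_to_normalized(x_center, y_center, box_w, box_h, img_w, img_h):
    """Convert a pixel-space (x_center, y_center, width, height) box to 0~1 ratios."""
    if img_w <= 0 or img_h <= 0:
        raise ValueError("image dimensions must be positive")
    # x values are divided by the image width, y values by the image height.
    return (x_center / img_w, y_center / img_h, box_w / img_w, box_h / img_h)


def corners_to_normalized(x_min, y_min, x_max, y_max, img_w, img_h):
    """Same conversion for boxes stored as pixel corners (hypothetical helper)."""
    x_center = (x_min + x_max) / 2.0
    y_center = (y_min + y_max) / 2.0
    return pixel_box_to_normalized(
        x_center, y_center, x_max - x_min, y_max - y_min, img_w, img_h
    )


def format_label_line(class_id, box):
    """Format one label line; the 'class x y w h' layout is an assumption."""
    x, y, w, h = box
    return f"{class_id} {x:.6f} {y:.6f} {w:.6f} {h:.6f}"


if __name__ == "__main__":
    # A 64x48 px box centered at (320, 240) in a 640x480 image.
    box = pixel_box_to_normalized(320, 240, 64, 48, 640, 480)
    print(format_label_line(45, box))
```

If your annotation tool exports corner coordinates instead of center and size, `corners_to_normalized` computes the center first and then applies the same division.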
Thanks for asking. I just went through the code, and now I have one other question. In the Dataset, why do we need to reshape our label into the shape [50, 5] (suppose my original label is just [45 0.479492 0.688771 0.955609 0.595500])?
Also, I am having a hard time understanding how this [50, 5] label matches up with the network's output. From the paper, the output could be 13 * 13 * 85, which I take to mean a 13 x 13 grid with 85 predicted values per cell.
Thanks in advance!
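One plausible reading of the [50, 5] shape, offered as an assumption rather than a statement about the repo's actual code: every image's labels are padded with zero rows up to a fixed cap of 50 boxes so that samples with different numbers of objects can be stacked into one batch, and a separate target-building step then maps each real box onto the grid cell that contains its center. The sketch below uses plain Python lists to stay self-contained; `pad_labels` and `responsible_cell` are hypothetical names, and in practice arrays or tensors would be used.

```python
MAX_OBJECTS = 50  # assumed cap on boxes per image, taken from the [50, 5] shape


def pad_labels(labels, max_objects=MAX_OBJECTS):
    """Pad (or truncate) a list of [class, x, y, w, h] rows to max_objects rows."""
    rows = [list(map(float, row)) for row in labels[:max_objects]]
    for row in rows:
        if len(row) != 5:
            raise ValueError("each label needs 5 values: class, x, y, w, h")
    # Zero rows act as placeholders so every sample has the same shape.
    rows += [[0.0] * 5 for _ in range(max_objects - len(rows))]
    return rows


def responsible_cell(x_center, y_center, grid_size=13):
    """Return (row, col) of the grid cell containing a normalized box center."""
    col = min(int(x_center * grid_size), grid_size - 1)
    row = min(int(y_center * grid_size), grid_size - 1)
    return row, col


if __name__ == "__main__":
    label = [45, 0.479492, 0.688771, 0.955609, 0.595500]
    padded = pad_labels([label])
    print(len(padded), len(padded[0]))       # number of rows and columns
    print(responsible_cell(label[1], label[2]))
```

Under this reading the [50, 5] label is never compared with the 13 x 13 output element by element; only the non-zero rows are used, each one assigned to a single cell.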
Is this issue still relevant?