yolov3-tf2

I couldn't understand this piece of code.

Open ramsane opened this issue 4 years ago • 1 comments

        # 3. inverting the pred box equations
        grid_size = tf.shape(y_true)[1]
        grid = tf.meshgrid(tf.range(grid_size), tf.range(grid_size))
        grid = tf.expand_dims(tf.stack(grid, axis=-1), axis=2)
        true_xy = true_xy * tf.cast(grid_size, tf.float32) - \
            tf.cast(grid, tf.float32)
        true_wh = tf.math.log(true_wh / anchors)
        true_wh = tf.where(tf.math.is_inf(true_wh),
                           tf.zeros_like(true_wh), true_wh)

It is in the YoloLoss function. https://github.com/zzh8829/yolov3-tf2/blob/master/yolov3_tf2/models.py#L259

Can someone explain what's happening here?

I can understand what meshgrid and expand_dims are doing. We are creating 2D coordinates like [(0,0),(0,1),(0,2),...,(3,3)]. But I couldn't understand how and why we are calculating true_wh and true_xy.

ramsane avatar Jun 21 '20 09:06 ramsane

The input xy contains the absolute coordinates of the bbox, scaled to [0, 1]. This operation translates them to the xy coordinates of the bbox relative to the grid cell it resides in.
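To make the xy step concrete, here is a minimal sketch in NumPy rather than TensorFlow (an assumption made only to keep the example small). The helper name `xy_relative_to_cell`, the 4x4 grid, and the box at (0.6, 0.3) are made up for illustration and are not part of the repo.

```python
import numpy as np

def xy_relative_to_cell(true_xy, grid_size):
    """Hypothetical helper mirroring the true_xy lines of the snippet.

    true_xy: array of shape (grid, grid, anchors, 2) with xy in [0, 1].
    Returns xy measured from the top-left corner of each grid cell.
    """
    # meshgrid (default 'xy' indexing) gives gx[row, col] = col, gy[row, col] = row
    gx, gy = np.meshgrid(np.arange(grid_size), np.arange(grid_size))
    # stack to (grid, grid, 2), then add an anchor axis -> (grid, grid, 1, 2)
    grid = np.expand_dims(np.stack([gx, gy], axis=-1), axis=2)
    # scale [0, 1] coordinates to grid units, then subtract the cell index
    return true_xy * float(grid_size) - grid.astype(np.float32)

grid_size = 4
true_xy = np.zeros((grid_size, grid_size, 1, 2))
# A box at (x, y) = (0.6, 0.3) lands in column 2, row 1 of a 4x4 grid,
# because 0.6 * 4 = 2.4 and 0.3 * 4 = 1.2.
true_xy[1, 2, 0] = [0.6, 0.3]

rel = xy_relative_to_cell(true_xy, grid_size)
print(rel[1, 2, 0])  # roughly [0.4, 0.2]: the offset inside cell (col 2, row 1)
```

Only the cell that actually holds the box gives a meaningful offset. The other cells are empty, and as far as I can tell the loss masks them out later.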

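true_wh calculates the log of the bbox size relative to the anchor size. Cells with no object have a width and height of 0, and since log(0) = -inf, we use tf.where to set those infinite values back to zero.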
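The wh step can be sketched the same way. This again uses NumPy as a stand-in for the TensorFlow calls, and the helper name `log_wh` and the anchor and box sizes are hypothetical values picked for the example.

```python
import numpy as np

def log_wh(true_wh, anchors):
    """Hypothetical helper mirroring the true_wh lines of the snippet."""
    # log(0) evaluates to -inf; silence the NumPy warning for that case
    with np.errstate(divide="ignore"):
        t_wh = np.log(true_wh / anchors)
    # replace infinite entries (empty cells) with zero, like the tf.where call
    return np.where(np.isinf(t_wh), np.zeros_like(t_wh), t_wh)

anchors = np.array([[0.2, 0.4]])   # one made-up anchor (w, h), normalized
true_wh = np.array([[0.2, 0.8],    # a box as wide as the anchor, twice as tall
                    [0.0, 0.0]])   # an empty cell: no box, so w = h = 0

t_wh = log_wh(true_wh, anchors)
print(t_wh)  # first row is [0, log(2)], second row is [0, 0]
```

A box the same size as its anchor maps to 0, so the network only has to learn how far the box deviates from the anchor.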

We use relative xy and log wh because the paper's authors determined that this strategy had the best performance.
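For reference, this is how I read the "inverting the pred box equations" comment in the snippet, written in the notation of the YOLOv3 paper. Here $b$ is the box in grid units, $c$ is the index of the cell's top-left corner, $p$ is the anchor size, and $t$ is the raw network output. The first row is the paper's prediction equation and the second row is its inverse, which is what the snippet computes from the ground truth (the y and h components follow the same pattern).

```latex
\begin{aligned}
b_x &= \sigma(t_x) + c_x, & b_w &= p_w \, e^{t_w}, \\
\sigma(t_x) &= b_x - c_x, & t_w &= \log\frac{b_w}{p_w}.
\end{aligned}
```

In the snippet, `true_xy * grid_size` plays the role of $b_x$, `grid` plays the role of $c_x$, and `anchors` plays the role of $p_w$.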

zzh8829 avatar Jul 28 '20 19:07 zzh8829