yolov3-tf2
I couldn't understand this piece of code.
# 3. inverting the pred box equations
grid_size = tf.shape(y_true)[1]
grid = tf.meshgrid(tf.range(grid_size), tf.range(grid_size))
grid = tf.expand_dims(tf.stack(grid, axis=-1), axis=2)
true_xy = true_xy * tf.cast(grid_size, tf.float32) - \
    tf.cast(grid, tf.float32)
true_wh = tf.math.log(true_wh / anchors)
true_wh = tf.where(tf.math.is_inf(true_wh),
                   tf.zeros_like(true_wh), true_wh)
It is in the YoloLoss function. https://github.com/zzh8829/yolov3-tf2/blob/master/yolov3_tf2/models.py#L259
Can someone explain to me what's happening?
I can understand what meshgrid and expand_dims are doing: we are creating 2D grid coordinates like [(0,0),(0,1),(0,2),...,(3,3)]. But I couldn't understand how and why we are calculating true_wh and true_xy.
The input true_xy contains the absolute coordinates of the bbox centre, scaled to [0, 1].
Multiplying by grid_size and subtracting the grid cell index translates it to the xy coordinate of the bbox relative to the grid cell it resides in.
true_wh calculates the log of the bbox size relative to the anchor size. Since log(0) = -inf for empty (all-zero) entries, we use tf.where to set those inf values back to zero.
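To make the two steps concrete, here is a minimal pure-Python sketch for a single box. The helper name encode_box and the example numbers are made up for illustration and are not part of yolov3-tf2. The real code does the same arithmetic for every grid cell and anchor at once with tensors, and only the cell that actually contains the object is used by the loss.

```python
import math


def encode_box(x, y, w, h, grid_size, anchor_w, anchor_h):
    """Hypothetical helper: encode one normalized box as grid-relative targets.

    x, y, w, h and the anchor sizes are all expressed in [0, 1] image units.
    """
    # Index of the grid cell that contains the box centre (column, row).
    col = min(int(x * grid_size), grid_size - 1)
    row = min(int(y * grid_size), grid_size - 1)

    # Same arithmetic as true_xy * grid_size - grid, for this one cell:
    # the offset of the centre from the cell's top-left corner.
    tx = x * grid_size - col
    ty = y * grid_size - row

    # Same idea as log(true_wh / anchors). An empty box (size 0) would give
    # log(0) = -inf, so it is mapped to 0, like the tf.where call does.
    tw = math.log(w / anchor_w) if w > 0 else 0.0
    th = math.log(h / anchor_h) if h > 0 else 0.0

    return (col, row), (tx, ty), (tw, th)


# Example: a box centred at (0.5, 0.3) on a 13x13 grid, with a 0.1 x 0.1 anchor.
cell, offsets, log_sizes = encode_box(0.5, 0.3, 0.2, 0.1, 13, 0.1, 0.1)
print(cell, offsets, log_sizes)
```

In this example the centre falls in cell (6, 3), the x offset is half a cell, and the width target is log(2) because the box is twice as wide as the anchor.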
We use relative xy and log wh because the paper's author found that this parameterization gave the best performance.
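For reference, this is what "inverting the pred box equations" in the code comment refers to, as far as I can tell. The sketch below assumes the box decoding follows the form in the YOLOv3 paper, with coordinates normalized by the grid size. The symbols are my own notation: S is grid_size, c is the grid cell index, p is the anchor size, b is the box in [0, 1] units, and t is the raw network output.

```latex
% Decoding (prediction): raw outputs t -> box b
b_{xy} = \frac{\sigma(t_{xy}) + c_{xy}}{S}, \qquad
b_{wh} = p_{wh} \, e^{t_{wh}}

% Inverting for the ground-truth box gives the loss targets
\sigma(t_{xy}) = b_{xy} \, S - c_{xy}, \qquad
t_{wh} = \ln\!\left(\frac{b_{wh}}{p_{wh}}\right)
```

The first inverted expression matches true_xy * grid_size - grid, and the second matches tf.math.log(true_wh / anchors).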