FaceBoxes-tensorflow
Understanding ops under reshaping scope
Under reshaping scope:
https://github.com/TropComplique/FaceBoxes-tensorflow/blob/545ec4f4f3c55c3592ee189ed56a11a3fd017194/src/detector.py#L286
Why are two reshapes needed here:
https://github.com/TropComplique/FaceBoxes-tensorflow/blob/545ec4f4f3c55c3592ee189ed56a11a3fd017194/src/detector.py#L297
And why is the shape list stacked via tf.stack?
y = tf.reshape(y, tf.stack([batch_size, height_i, width_i, num_predictions_per_location, 4]))
And here: https://github.com/TropComplique/FaceBoxes-tensorflow/blob/545ec4f4f3c55c3592ee189ed56a11a3fd017194/src/detector.py#L301
The order of the reshapes with and without tf.stack differs between the two places.
It looks like this can be done in a single step; I tested it in NumPy:
import numpy as np

# outputs: the six arrays returned by the network (three box-encoding maps, three class maps)
box_encoding_0, box_encoding_1, box_encoding_2, class_0, class_1, class_2 = outputs
batch_size = 1
num_predictions_per_location = [21, 1, 1]

# V1: two reshapes, going through the intermediate 5-D shape
print('box_encoding_0.shape', box_encoding_0.shape)
_, height_i, width_i, _ = box_encoding_0.shape
box_encoding_0_1 = box_encoding_0.reshape(batch_size, height_i, width_i, num_predictions_per_location[0], 4)
print('box_encoding_0_1.shape', box_encoding_0_1.shape)
num_anchors_on_feature_map = height_i * width_i * num_predictions_per_location[0]
box_encoding_0_1 = box_encoding_0_1.reshape(batch_size, num_anchors_on_feature_map, 4)
print('box_encoding_0_1.shape', box_encoding_0_1.shape)

# V2: a single reshape straight to the final shape
box_encoding_0_2 = box_encoding_0.reshape(batch_size, num_anchors_on_feature_map, 4)
print('box_encoding_0_2.shape', box_encoding_0_2.shape)

# Maximum absolute difference between the two variants
print('diff', np.max(np.abs(box_encoding_0_1 - box_encoding_0_2)))
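
For anyone who wants to reproduce the comparison without running the model, here is a minimal self-contained sketch with random data in place of the network outputs. The sizes (batch of 2, a 3x5 feature map) are made up for illustration, and it assumes the tensor is already in batch, height, width, channels layout. It also checks that a flat anchor index maps back to the expected location and anchor slot.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration (not taken from the model).
batch_size, height, width, num_predictions_per_location = 2, 3, 5, 21

# Random stand-in for one box-encoding feature map in NHWC layout.
rng = np.random.default_rng(0)
box_encoding = rng.standard_normal(
    (batch_size, height, width, num_predictions_per_location * 4)
).astype(np.float32)

num_anchors = height * width * num_predictions_per_location

# Two reshapes: split channels into (anchor, 4), then flatten the spatial dims.
two_step = box_encoding.reshape(
    batch_size, height, width, num_predictions_per_location, 4
)
two_step = two_step.reshape(batch_size, num_anchors, 4)

# One reshape: go straight to the final shape.
one_step = box_encoding.reshape(batch_size, num_anchors, 4)

# Reshape never reorders elements in row-major order, so both must agree exactly.
same = np.array_equal(two_step, one_step)
print('identical:', same)

# Flat anchor index -> (row, column, anchor slot) under row-major flattening.
h, w, k = 1, 2, 7
flat = (h * width + w) * num_predictions_per_location + k
assert np.array_equal(one_step[0, flat], box_encoding[0, h, w, 4 * k:4 * k + 4])
```

tf.reshape also works in row-major order, so the same equivalence should hold in the TensorFlow graph, although this sketch only checks NumPy.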