Questions about the shapes of depth_start and depth in get_homographies()
Hi, I really appreciate your great work on both MVSNet and R-MVSNet!
When I was reading your code, the shapes of depth_start, depth_interval, and depth really confused me. To be specific, in the function train() (located in train.py), you extract the DEPTH_MIN of the reference view in a batch-wise fashion, and it is quite clear that the shape of depth_start is [FLAGS.batch_size], since you explicitly reshape it to that. But in get_homographies() (located in homography_warping.py), there is the line depth = depth_start + tf.cast(tf.range(depth_num), tf.float32) * depth_interval, and this is where I got stuck.
# train.py/train()
# both depth_start and depth_interval are reshaped to [FLAGS.batch_size]
depth_start = tf.reshape(
    tf.slice(cams, [0, 0, 1, 3, 0], [FLAGS.batch_size, 1, 1, 1, 1]),
    [FLAGS.batch_size])
depth_interval = tf.reshape(
    tf.slice(cams, [0, 0, 1, 3, 1], [FLAGS.batch_size, 1, 1, 1, 1]),
    [FLAGS.batch_size])

# homography_warping.py/get_homographies()
depth_num = tf.reshape(tf.cast(depth_num, 'int32'), [])
# depth_start: [batch_size], tf.range(depth_num): [depth_num]
depth = depth_start + tf.cast(tf.range(depth_num), tf.float32) * depth_interval
num_depth = tf.shape(depth)[0]
For one thing, depth_interval's shape is [FLAGS.batch_size], while tf.cast(tf.range(depth_num), tf.float32)'s shape is [depth_num]. Can these two tensors be multiplied at all?
For another, depth's shape ought to be the same as depth_start's, so in the next line, num_depth = tf.shape(depth)[0] would be batch_size from my perspective, not the number of depth planes. I am really confused about how the shape of depth is formulated.
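To make my confusion concrete, here is a tiny standalone sketch of the shapes as I read them (made-up numbers, not the actual pipeline):
# sketch: batch_size = 2 and depth_num = 4 (made-up values)
import tensorflow as tf
depth_start = tf.constant([425.0, 425.0])        # shape [2], i.e. [batch_size]
depth_interval = tf.constant([2.5, 2.5])         # shape [2]
depth_range = tf.cast(tf.range(4), tf.float32)   # shape [4], i.e. [depth_num]
# shapes [4] and [2] are neither equal nor 1, so I expect this to raise a ValueError
depth = depth_start + depth_range * depth_interval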
Currently, I am trying to rewrite your code in tensorpack and haven't run it yet. I am wondering whether this is an issue related to the TF version? I'm using TF 1.13.
Thanks a lot!
Hi, I am not sure whether this is a bug. Actually, I did not test the code with batch size > 1...
What I expect is that we should generate a 'depth' parameter of size [batch_size, depth_num]. For example, if depth_start = [400, 500, 600], depth_interval = [10, 15, 20], and depth_num = 4, depth should be:
[[400, 410, 420, 430], [500, 515, 530, 545], [600, 620, 640, 660]]
Maybe TF will do broadcasting when multiplying/adding vectors of different sizes. But in any case, I will check and make the code clearer here...
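For example, after reshaping both operands (reusing depth_start, depth_interval, and depth_num from get_homographies()), something like the following should broadcast to [batch_size, depth_num] (just a sketch, not verified):
# sketch (not verified): build depth of shape [batch_size, depth_num] via broadcasting
depth_start_2d = tf.reshape(depth_start, [-1, 1])        # [batch_size, 1]
depth_interval_2d = tf.reshape(depth_interval, [-1, 1])  # [batch_size, 1]
depth_range = tf.reshape(tf.cast(tf.range(depth_num), tf.float32), [1, -1])  # [1, depth_num]
depth = depth_start_2d + depth_range * depth_interval_2d  # [batch_size, depth_num]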
Thanks
Thanks for your reply!
Just as you said, when batch_size = 1, TensorFlow permits the addition and multiplication in
depth = depth_start + tf.cast(tf.range(depth_num), tf.float32) * depth_interval
because a tensor of shape [1] broadcasts against one of shape [depth_num]. As a consequence, the shape of depth is actually [depth_num], which also explains the line num_depth = tf.shape(depth)[0]: it yields the value of depth_num.
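For example (a quick sketch), with batch_size = 1:
# sketch: shape [1] broadcasts against [depth_num]
depth_start = tf.constant([400.0])    # [1]
depth_interval = tf.constant([10.0])  # [1]
depth = depth_start + tf.cast(tf.range(4), tf.float32) * depth_interval
# depth has shape [4]: [400. 410. 420. 430.]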
But if batch_size > 1, for one thing the code cannot run (shapes [batch_size] and [depth_num] do not broadcast in general), and for another, num_depth would not be the value of depth_num. Based on your description, I recommend modifying the code as follows:
depth_num = tf.reshape(tf.cast(depth_num, 'int32'), [])
batch_size = tf.shape(depth_start)[0]
# tile depth_start and depth_interval from [batch_size] to [batch_size, depth_num]
depth_start_mat = tf.tile(tf.reshape(depth_start, (batch_size, 1)), (1, depth_num))
depth_interval_mat = tf.tile(tf.reshape(depth_interval, (batch_size, 1)), (1, depth_num))
# tile the per-plane indices from [depth_num] to [batch_size, depth_num]
depth_range_mat = tf.tile(tf.reshape(tf.cast(tf.range(depth_num), tf.float32), [1, depth_num]), (batch_size, 1))
depth = depth_start_mat + depth_range_mat * depth_interval_mat  # [batch_size, depth_num]
num_depth = tf.shape(depth)[1]
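As a quick sanity check, here is a self-contained sketch of the tiled version using the example values from your reply:
# sketch: depth_start = [400, 500, 600], depth_interval = [10, 15, 20], depth_num = 4
import tensorflow as tf
depth_start = tf.constant([400.0, 500.0, 600.0])
depth_interval = tf.constant([10.0, 15.0, 20.0])
depth_num = tf.constant(4)
batch_size = tf.shape(depth_start)[0]
depth_start_mat = tf.tile(tf.reshape(depth_start, (batch_size, 1)), (1, depth_num))
depth_interval_mat = tf.tile(tf.reshape(depth_interval, (batch_size, 1)), (1, depth_num))
depth_range_mat = tf.tile(tf.reshape(tf.cast(tf.range(depth_num), tf.float32), [1, depth_num]), (batch_size, 1))
depth = depth_start_mat + depth_range_mat * depth_interval_mat
with tf.Session() as sess:
    print(sess.run(depth))
# [[400. 410. 420. 430.]
#  [500. 515. 530. 545.]
#  [600. 620. 640. 660.]]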
Again, thanks a lot for your reply!