TF-deformable-conv
How to convert the mxnet code to your code
This is not a bug report, since the code worked fine in TF 1.2 and cudnn 5.1. In this question, I want to ask how I can convert the following mxnet code using your implementation. As shown at line 678, we have:
res5a_branch2b_offset_weight = mx.symbol.Variable('res5a_branch2b_offset_weight', lr_mult=1.0)
res5a_branch2b_offset_bias = mx.symbol.Variable('res5a_branch2b_offset_bias', lr_mult=2.0)
res5a_branch2b_offset = mx.symbol.Convolution(name='res5a_branch2b_offset', data = res5a_branch2a_relu, num_filter=18, pad=(1, 1), kernel=(3, 3), stride=(1, 1),weight=res5a_branch2b_offset_weight, bias=res5a_branch2b_offset_bias)
res5a_branch2b = mx.contrib.symbol.DeformableConvolution(name='res5a_branch2b', data=res5a_branch2a_relu, offset=res5a_branch2b_offset,num_filter=512, pad=(2, 2), kernel=(3, 3), num_deformable_group=1, stride=(1, 1), dilate=(2, 2), no_bias=True)
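As a sanity check on the snippet above, the offset conv's num_filter=18 follows from the deformable conv's kernel: 2 coordinates (dy, dx) per sampling location, per deformable group. A small illustration (my own, not code from either repo):

```python
# Number of offset channels a deformable conv expects:
# 2 coordinates (dy, dx) per kernel sampling location, per deformable group.
def offset_channels(kernel_h, kernel_w, num_deformable_group=1):
    return 2 * kernel_h * kernel_w * num_deformable_group

# The 3x3 kernel with num_deformable_group=1 above needs 18 offset channels.
print(offset_channels(3, 3, 1))  # -> 18
```

The same formula gives 72 for a 3x3 kernel with 4 deformable groups, which matches the Faster RCNN snippet discussed later in this thread.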
How can I convert the four lines above using deform_conv_op.deform_conv_op? I read demo.py and test_deform_conv.py, and this is my current conversion:
import tensorflow as tf
import tensorflow.contrib.layers as ly
from lib.deform_conv_op import deform_conv_op

# Offset branch: plain conv producing 18 offset channels, matching the mxnet stride=(1, 1)
res5a_branch2b_offset = ly.conv2d(res5a_branch2a_relu, num_outputs=18, kernel_size=3, stride=1, activation_fn=None, data_format='NHWC')
num_x = res5a_branch2a_relu.shape[self.channel_axis].value  # channel axis (3 for NHWC)
res5a_branch2b_kernel = tf.get_variable('weights', shape=[3, 3, num_x, 512])
res5a_branch2b = deform_conv_op(res5a_branch2a_relu, filter=res5a_branch2b_kernel, offset=res5a_branch2b_offset,
                                rates=[1, 2, 2, 1], padding="SAME", strides=[1, 1, 1, 1],
                                num_groups=1, deformable_group=1, name='%s/bottleneck_v1/conv2' % name)
Note that the conversion above uses NHWC order and is still missing the first two lines:
res5a_branch2b_offset_weight = mx.symbol.Variable('res5a_branch2b_offset_weight', lr_mult=1.0)
res5a_branch2b_offset_bias = mx.symbol.Variable('res5a_branch2b_offset_bias', lr_mult=2.0)
I also got this error:
ValueError: Deformconv requires the offset compatible with filter, but got: [4,64,64,18] for 'resnet_v1_101/block4/unit_1/bottleneck_v1/conv2' (op: 'DeformConvOp') with input shapes: [4,64,64,512], [3,3,512,512], [4,64,64,18].
I think the problem is the shape order. The op needs NxCxHxW order instead of tensorflow's NxHxWxC. After obtaining the result of the deformable convolution in NxCxHxW, I need to reorder it back to NxHxWxC for the rest of the tensorflow graph. Am I right?
It seems deform_conv_op.deform_conv_op assumes every input to be in NCHW order; you could refer to the faster-rcnn version of deformable convolution over here. By the way, the script test_deform_conv.py shows a sample call to deform_conv_op with the full parameter list and actual shapes; I hope it helps you understand how this op works.
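Since the op assumes NCHW, an NHWC graph has to transpose before and after the call. A NumPy sketch of the layout round trip, using the shapes from the error message (the deform_conv_op call itself is elided; the axis orders are the point; in TF this would be tf.transpose with the same permutations):

```python
import numpy as np

# NHWC tensor as produced by most TF layers: batch, height, width, channels.
x_nhwc = np.zeros((4, 64, 64, 512), dtype=np.float32)

# NHWC -> NCHW before calling the NCHW-only op
# (tf.transpose(x, [0, 3, 1, 2]) in TF).
x_nchw = np.transpose(x_nhwc, (0, 3, 1, 2))
print(x_nchw.shape)  # (4, 512, 64, 64)

# ... y_nchw = deform_conv_op(x_nchw, ...) would run here ...
y_nchw = x_nchw

# NCHW -> NHWC afterwards so the rest of the NHWC graph keeps working
# (tf.transpose(y, [0, 2, 3, 1]) in TF).
y_nhwc = np.transpose(y_nchw, (0, 2, 3, 1))
print(y_nhwc.shape)  # (4, 64, 64, 512)
```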
Thanks for your help. I checked the code and confirmed that it only supports NCHW format. Sorry, I did not read it carefully before.
For the second question: do you think it is necessary to create the offset weight and bias with lr_mult=1.0 and lr_mult=2.0? I did not find anything like that in your code:
res5b_branch2b_offset_weight = mx.symbol.Variable('res5b_branch2b_offset_weight', lr_mult=1.0)
res5b_branch2b_offset_bias = mx.symbol.Variable('res5b_branch2b_offset_bias', lr_mult=2.0)
@John1231983 I am not familiar with the mxnet context. What does lr_mult represent? A learning rate multiplier, or a linear decay coefficient?
I guess it is a learning rate multiplier for the bias and weight.
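If lr_mult is indeed a per-variable learning rate multiplier (as in mxnet), its effect on a plain SGD step can be sketched like this (NumPy, my own illustration, not code from either repo):

```python
import numpy as np

def sgd_step(params, grads, lr, lr_mults):
    """One SGD update where each parameter's effective rate is lr * lr_mult."""
    return {name: params[name] - lr * lr_mults.get(name, 1.0) * grads[name]
            for name in params}

params = {'offset_weight': np.array([1.0]), 'offset_bias': np.array([1.0])}
grads  = {'offset_weight': np.array([1.0]), 'offset_bias': np.array([1.0])}

# Mirrors the mxnet snippet: weight uses lr_mult=1.0, bias uses lr_mult=2.0,
# so the bias moves twice as far per step.
new = sgd_step(params, grads, lr=0.1,
               lr_mults={'offset_weight': 1.0, 'offset_bias': 2.0})
print(new['offset_weight'], new['offset_bias'])  # [0.9] [0.8]
```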
@John1231983 I don't think I did. If you are referring to the code of tf_deform_net, you could either dirty-hack it in here, or check whether tensorflow's variable creation offers a flag for this and mend it here.
Thanks for your great direction. I have completed it and it works well. However, I have one question about the offset. In the traditional implementation, the offset conv does not include the rate/dilation parameter even though the deformable convolution has it. For example, you can see this in the deeplab variant:
res5c_branch2b_offset = mx.symbol.Convolution(name='res5c_branch2b_offset', data = res5c_branch2a_relu, num_filter=18, pad=(1, 1), kernel=(3, 3), stride=(1, 1), weight=res5c_branch2b_offset_weight, bias=res5c_branch2b_offset_bias)
res5c_branch2b = mx.contrib.symbol.DeformableConvolution(name='res5c_branch2b', data=res5c_branch2a_relu, offset=res5c_branch2b_offset,num_filter=512, pad=(2, 2), kernel=(3, 3), num_deformable_group=1,stride=(1, 1), dilate=(2, 2), no_bias=True)
However, in your implementation for Faster RCNN I found:
(self.feed('res5a_branch2a_relu')
     .conv(3, 3, 72, 1, 1, biased=True, rate=2, relu=False, name='res5a_branch2b_offset', padding='SAME', initializer='zeros'))
(self.feed('res5a_branch2a_relu', 'res5a_branch2b_offset')
     .deform_conv(3, 3, 512, 1, 1, biased=False, rate=2, relu=False, num_deform_group=4, name='res5a_branch2b'))
It shows that the offset conv includes the rate to keep it consistent with its deformable convolution. Do we need to consider the rate in the offset?
@John1231983 I am not sure whether the rate in the offset stream is necessary. I remember I set all arguments according to the original implementation, so I did not have a particular reason for the rate stuff. You might very well set it to 1 and see how it performs.
I think it is necessary, and I guess the deeplab author simply missed it, because each convolution sampling location corresponds to one offset element. In other words, the offset element controls where the deformable convolution samples. When we use dilation, the convolution's sampling locations are already sparse, so the offset branch also needs the same dilation so that the two line up location by location.
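The argument above can be made concrete by listing the base sampling locations of a 3x3 kernel: with dilate=2 the deformable conv samples on a sparse grid, so an offset branch computed with rate=1 would look at a smaller receptive field than the locations it is predicting offsets for. A small sketch (plain Python, my own illustration):

```python
# Base sampling locations (relative to the kernel center) of a k x k
# convolution with a given dilation rate.
def sampling_grid(k, dilate):
    r = dilate * (k // 2)
    coords = range(-r, r + 1, dilate)
    return [(dy, dx) for dy in coords for dx in coords]

print(sampling_grid(3, 1))  # dense 3x3 grid, coordinates in {-1, 0, 1}
print(sampling_grid(3, 2))  # sparse grid, coordinates in {-2, 0, 2}
```

Both grids have 9 locations (hence 18 offset channels either way), but the dilate=2 grid covers a 5x5 window, which is why matching the rate in the offset branch keeps the two streams aligned.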