TF-deformable-conv
How to convert the mxnet code to your code
This is not a bug report, since the code worked fine in TF 1.2 and cudnn 5.1. In this question, I want to ask how I can convert the following mxnet code using your implementation. As shown at line 678, we have:
res5a_branch2b_offset_weight = mx.symbol.Variable('res5a_branch2b_offset_weight', lr_mult=1.0)
res5a_branch2b_offset_bias = mx.symbol.Variable('res5a_branch2b_offset_bias', lr_mult=2.0)
res5a_branch2b_offset = mx.symbol.Convolution(name='res5a_branch2b_offset', data = res5a_branch2a_relu, num_filter=18, pad=(1, 1), kernel=(3, 3), stride=(1, 1),weight=res5a_branch2b_offset_weight, bias=res5a_branch2b_offset_bias)
res5a_branch2b = mx.contrib.symbol.DeformableConvolution(name='res5a_branch2b', data=res5a_branch2a_relu, offset=res5a_branch2b_offset,num_filter=512, pad=(2, 2), kernel=(3, 3), num_deformable_group=1, stride=(1, 1), dilate=(2, 2), no_bias=True)
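As a sanity check on the snippet above, the offset conv's num_filter=18 follows from the deformable conv's kernel: 2 coordinates (dy, dx) per sampling location, per deformable group. A small illustration (my own, not code from either repo):

```python
# Number of offset channels a deformable conv expects:
# 2 coordinates (dy, dx) per kernel sampling location, per deformable group.
def offset_channels(kernel_h, kernel_w, num_deformable_group=1):
    return 2 * kernel_h * kernel_w * num_deformable_group

# The 3x3 kernel with num_deformable_group=1 above needs 18 offset channels.
print(offset_channels(3, 3, 1))  # -> 18
```

The same formula gives 72 for a 3x3 kernel with 4 deformable groups, which matches the Faster RCNN snippet discussed later in this thread.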
How can I convert the four lines above using deform_conv_op.deform_conv_op? I read demo.py and test_deform_conv.py, and this is my current conversion:
import tensorflow as tf
import tensorflow.contrib.layers as ly
from lib.deform_conv_op import deform_conv_op

# Offset branch: plain conv producing 18 offset channels, matching the mxnet stride=(1, 1)
res5a_branch2b_offset = ly.conv2d(res5a_branch2a_relu, num_outputs=18, kernel_size=3, stride=1, activation_fn=None, data_format='NHWC')
num_x = res5a_branch2a_relu.shape[self.channel_axis].value  # channel axis (3 for NHWC)
res5a_branch2b_kernel = tf.get_variable('weights', shape=[3, 3, num_x, 512])
res5a_branch2b = deform_conv_op(res5a_branch2a_relu, filter=res5a_branch2b_kernel, offset=res5a_branch2b_offset,
                                rates=[1, 2, 2, 1], padding="SAME", strides=[1, 1, 1, 1],
                                num_groups=1, deformable_group=1, name='%s/bottleneck_v1/conv2' % name)
Note that the conversion above uses NHWC order and is still missing the first two lines:
res5a_branch2b_offset_weight = mx.symbol.Variable('res5a_branch2b_offset_weight', lr_mult=1.0)
res5a_branch2b_offset_bias = mx.symbol.Variable('res5a_branch2b_offset_bias', lr_mult=2.0)
I also got this error:
ValueError: Deformconv requires the offset compatible with filter, but got: [4,64,64,18] for 'resnet_v1_101/block4/unit_1/bottleneck_v1/conv2' (op: 'DeformConvOp') with input shapes: [4,64,64,512], [3,3,512,512], [4,64,64,18].
I think the problem is the shape order. The op needs NxCxHxW order instead of tensorflow's NxHxWxC. After obtaining the result of the deformable convolution in NxCxHxW, I need to reorder it back to NxHxWxC for the rest of the tensorflow graph. Am I right?
It seems deform_conv_op.deform_conv_op assumes every input to be in NCHW order; you could refer to the faster-rcnn version of deformable convolution over here. By the way, the script test_deform_conv.py shows a sample call to deform_conv_op with the full parameter list and actual shapes; I hope it helps you understand how this op works.
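Since the op assumes NCHW, an NHWC graph has to transpose before and after the call. A NumPy sketch of the layout round trip, using the shapes from the error message (the deform_conv_op call itself is elided; the axis orders are the point; in TF this would be tf.transpose with the same permutations):

```python
import numpy as np

# NHWC tensor as produced by most TF layers: batch, height, width, channels.
x_nhwc = np.zeros((4, 64, 64, 512), dtype=np.float32)

# NHWC -> NCHW before calling the NCHW-only op
# (tf.transpose(x, [0, 3, 1, 2]) in TF).
x_nchw = np.transpose(x_nhwc, (0, 3, 1, 2))
print(x_nchw.shape)  # (4, 512, 64, 64)

# ... y_nchw = deform_conv_op(x_nchw, ...) would run here ...
y_nchw = x_nchw

# NCHW -> NHWC afterwards so the rest of the NHWC graph keeps working
# (tf.transpose(y, [0, 2, 3, 1]) in TF).
y_nhwc = np.transpose(y_nchw, (0, 2, 3, 1))
print(y_nhwc.shape)  # (4, 64, 64, 512)
```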
Thanks for your help. I checked the code and confirmed that it only supports NCHW format. Sorry, I did not read it carefully before.
For the second question: do you think it is necessary to create the offset weight and bias with lr_mult=1.0 and lr_mult=2.0? I did not find anything like that in your code:
res5b_branch2b_offset_weight = mx.symbol.Variable('res5b_branch2b_offset_weight', lr_mult=1.0)
res5b_branch2b_offset_bias = mx.symbol.Variable('res5b_branch2b_offset_bias', lr_mult=2.0)
@John1231983 I am not familiar with the mxnet context. What does lr_mult represent? A learning rate multiplier, or a linear decay coefficient?
I guess it is a learning rate multiplier for the bias and weight.
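If lr_mult is indeed a per-variable learning rate multiplier (as in mxnet), its effect on a plain SGD step can be sketched like this (NumPy, my own illustration, not code from either repo):

```python
import numpy as np

def sgd_step(params, grads, lr, lr_mults):
    """One SGD update where each parameter's effective rate is lr * lr_mult."""
    return {name: params[name] - lr * lr_mults.get(name, 1.0) * grads[name]
            for name in params}

params = {'offset_weight': np.array([1.0]), 'offset_bias': np.array([1.0])}
grads  = {'offset_weight': np.array([1.0]), 'offset_bias': np.array([1.0])}

# Mirrors the mxnet snippet: weight uses lr_mult=1.0, bias uses lr_mult=2.0,
# so the bias moves twice as far per step.
new = sgd_step(params, grads, lr=0.1,
               lr_mults={'offset_weight': 1.0, 'offset_bias': 2.0})
print(new['offset_weight'], new['offset_bias'])  # [0.9] [0.8]
```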
@John1231983 I don't think I did. If you are referring to the code of tf_deform_net, you could either dirty-hack it in here, or check whether tensorflow's variable creation offers a flag for this and mend it here.
Thanks for your great direction. I have completed it and it works well. However, I have one question about the offset. In the traditional implementation, the offset conv does not include the rate/dilation parameter even though the deformable convolution has it. For example, you can see this in the deeplab variant:
res5c_branch2b_offset = mx.symbol.Convolution(name='res5c_branch2b_offset', data = res5c_branch2a_relu, num_filter=18, pad=(1, 1), kernel=(3, 3), stride=(1, 1), weight=res5c_branch2b_offset_weight, bias=res5c_branch2b_offset_bias)
res5c_branch2b = mx.contrib.symbol.DeformableConvolution(name='res5c_branch2b', data=res5c_branch2a_relu, offset=res5c_branch2b_offset,num_filter=512, pad=(2, 2), kernel=(3, 3), num_deformable_group=1,stride=(1, 1), dilate=(2, 2), no_bias=True)
However, in your implementation for Faster RCNN I found:
(self.feed('res5a_branch2a_relu')
     .conv(3, 3, 72, 1, 1, biased=True, rate=2, relu=False, name='res5a_branch2b_offset', padding='SAME', initializer='zeros'))
(self.feed('res5a_branch2a_relu', 'res5a_branch2b_offset')
     .deform_conv(3, 3, 512, 1, 1, biased=False, rate=2, relu=False, num_deform_group=4, name='res5a_branch2b'))
It shows that the offset conv includes the rate to keep it consistent with its deformable convolution. Do we need to consider the rate in the offset?
@John1231983 I am not sure whether the rate in the offset stream is necessary. I remember I set all arguments according to the original implementation, so I did not have a particular reason for the rate stuff. You might very well set it to 1 and see how it performs.
I think it is necessary, and I guess the deeplab author simply missed it, because each convolution sampling location corresponds to one offset element. In other words, the offset element controls where the deformable convolution samples. When we use dilation, the convolution's sampling locations are already sparse, so the offset branch also needs the same dilation so that the two line up location by location.
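The argument above can be made concrete by listing the base sampling locations of a 3x3 kernel: with dilate=2 the deformable conv samples on a sparse grid, so an offset branch computed with rate=1 would look at a smaller receptive field than the locations it is predicting offsets for. A small sketch (plain Python, my own illustration):

```python
# Base sampling locations (relative to the kernel center) of a k x k
# convolution with a given dilation rate.
def sampling_grid(k, dilate):
    r = dilate * (k // 2)
    coords = range(-r, r + 1, dilate)
    return [(dy, dx) for dy in coords for dx in coords]

print(sampling_grid(3, 1))  # dense 3x3 grid, coordinates in {-1, 0, 1}
print(sampling_grid(3, 2))  # sparse grid, coordinates in {-2, 0, 2}
```

Both grids have 9 locations (hence 18 offset channels either way), but the dilate=2 grid covers a 5x5 window, which is why matching the rate in the offset branch keeps the two streams aligned.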