pytorch-deform-conv

Confusion about the shape of offset

Open ChiWeiHsiao opened this issue 7 years ago • 7 comments

Hello,

I am confused about the shape of offset. The paper mentions: "The grid R defines the receptive field size and dilation. For example, R = {(−1,−1),(−1,0),...,(0,1),(1,1)}. In deformable convolution, the regular grid R is augmented with offsets {∆pn | n = 1, ..., N}, where N = |R|. The output offset fields have the same spatial resolution with the input feature map. The channel dimension 2N corresponds to N 2D offsets." So, I think the shape of the offset field would be [2*9, H, W] if a 3x3 kernel is used. While in your implementation, the shape of offset seems to be [batch_size, 2*n_channels, H, W]?
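To make the two shapes concrete, here is a minimal sketch of the arithmetic in plain Python. The function names are hypothetical, a batch dimension is added to the paper's shape for comparison, and the second function only restates the shape reported in this issue; it is not taken from the repository code. The channel count of 64 is an arbitrary example value.

```python
def paper_offset_shape(batch_size, kernel_h, kernel_w, height, width):
    """Offset field shape as the paper describes it: 2N channels, N = |R|."""
    n = kernel_h * kernel_w  # N = |R|, the number of kernel sampling points
    return (batch_size, 2 * n, height, width)


def per_channel_offset_shape(batch_size, n_channels, height, width):
    """Shape reported in this issue: one 2D offset per input channel."""
    return (batch_size, 2 * n_channels, height, width)


# 3x3 kernel on a 32x32 feature map: 2 * 9 = 18 offset channels.
print(paper_offset_shape(1, 3, 3, 32, 32))      # (1, 18, 32, 32)
# 64 input channels (made-up value): 2 * 64 = 128 offset channels.
print(per_channel_offset_shape(1, 64, 32, 32))  # (1, 128, 32, 32)
```

The first shape depends only on the kernel size, the second only on the number of input channels, so the two agree only by coincidence (for example, 9 input channels with a 3x3 kernel).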

ChiWeiHsiao avatar Sep 25 '18 11:09 ChiWeiHsiao

I think you are right. In the official repository (https://github.com/msracver/Deformable-ConvNets), the number of channels in the offset is 2N = 2*9 = 18 for a 3x3 kernel.

zhangbin0917 avatar Oct 12 '18 12:10 zhangbin0917


Have you fixed this bug?

sxzy avatar Oct 30 '18 08:10 sxzy

I have the same confusion here. For every point of the output, there should be an offset grid of 2x9 values. So I think the number of channels should be 2N, instead of 2*n_channels.

eezywu avatar Jan 19 '19 02:01 eezywu

I believe it's a bug. The TensorFlow implementation the author follows (https://github.com/kastnerkyle/deform-conv) also has the bug.

cdowen avatar Apr 01 '19 12:04 cdowen

Hey @cdowen and @Seashell_9, thanks for looking into this. I am not working on this at the moment. If you find a potential bug, could you please try to submit a PR to fix it?

oeway avatar Apr 01 '19 15:04 oeway

I don't know if my understanding is correct, but the kernel offsets should be different for each output pixel, so it is not possible to generate a single deformed feature map and apply a regular convolution to it afterwards. Instead, each output pixel needs its own set of kernel offsets. You can check https://github.com/chengdazhi/Deformable-Convolution-V2-PyTorch for a PyTorch implementation using CUDA. Perhaps it would be too slow to implement it in pure Python. Any ideas?
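A rough sketch of what "a different set of kernel offsets per output pixel" means, for one channel and one output location, in plain Python. The helper names `bilinear` and `deform_conv_at` are hypothetical, and zero padding outside the map is an assumption. This is not the code of either repository, only an illustration of the sampling rule from the paper.

```python
import math


def bilinear(feature, y, x):
    """Sample a 2D list at fractional (y, x); positions outside the map count as 0."""
    h, w = len(feature), len(feature[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                value += weight * feature[yy][xx]
    return value


def deform_conv_at(feature, weights, offsets, py, px):
    """Deformable 3x3 convolution output at a single location (py, px).

    `offsets` holds N = 9 (dy, dx) pairs that belong to this output pixel only;
    another output pixel would bring its own 9 pairs.
    """
    grid = [(gy, gx) for gy in (-1, 0, 1) for gx in (-1, 0, 1)]  # regular grid R
    out = 0.0
    for n, (gy, gx) in enumerate(grid):
        dy, dx = offsets[n]
        out += weights[n] * bilinear(feature, py + gy + dy, px + gx + dx)
    return out


feature = [[4 * r + c for c in range(4)] for r in range(4)]  # 4x4 ramp
weights = [1.0] * 9
print(deform_conv_at(feature, weights, [(0, 0)] * 9, 1, 1))    # 45.0
print(deform_conv_at(feature, weights, [(0, 0.5)] * 9, 1, 1))  # 49.5
```

With zero offsets this reduces to an ordinary 3x3 convolution. Storing 9 such (dy, dx) pairs for every output location is what gives the 2N = 18 offset channels.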

cdowen avatar Apr 04 '19 12:04 cdowen

Instead of learning the offsets on the kernels, this code implements deformable convolution by learning the offsets on the feature map. For example, x = self.offset12(x) gives a feature map augmented with offsets. Then self.conv12(x) (here self.conv12 has a 3x3 kernel) applies the convolution over the 3x3 neighbors, each of which has already been augmented with its offset.
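As I read it, this amounts to resampling the feature map first and convolving afterwards. Below is a minimal single-channel sketch in plain Python under that assumption. The names `warp` and `conv3x3_at` are hypothetical stand-ins for what self.offset12 and self.conv12 do, not the repository's actual code.

```python
import math


def bilinear(feature, y, x):
    """Sample a 2D list at fractional (y, x); positions outside the map count as 0."""
    h, w = len(feature), len(feature[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                value += (1 - abs(y - yy)) * (1 - abs(x - xx)) * feature[yy][xx]
    return value


def warp(feature, offsets):
    """Resample one channel; offsets[y][x] is the single (dy, dx) of pixel (y, x)."""
    h, w = len(feature), len(feature[0])
    return [[bilinear(feature, y + offsets[y][x][0], x + offsets[y][x][1])
             for x in range(w)] for y in range(h)]


def conv3x3_at(feature, weights, py, px):
    """Ordinary 3x3 convolution output at (py, px) with zero padding."""
    h, w = len(feature), len(feature[0])
    out, n = 0.0, 0
    for gy in (-1, 0, 1):
        for gx in (-1, 0, 1):
            y, x = py + gy, px + gx
            if 0 <= y < h and 0 <= x < w:
                out += weights[n] * feature[y][x]
            n += 1
    return out


feature = [[4 * r + c for c in range(4)] for r in range(4)]  # 4x4 ramp
offsets = [[(0, 0.5)] * 4 for _ in range(4)]  # one 2D offset per pixel
warped = warp(feature, offsets)
print(conv3x3_at(warped, [1.0] * 9, 1, 1))  # 49.5
```

Here each input pixel carries one offset that is shared by every output pixel whose kernel window covers it, which is why the offset tensor has 2 channels per input channel rather than 2N channels.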


TiantianWang avatar Apr 29 '21 01:04 TiantianWang