
deform_fusion

xiaoyudanaa opened this issue 2 years ago • 1 comment

Hello, I also use CNN and Transformer feature processing in my network. Is it possible to fuse them with your deform_fusion module? If so: in deform_fusion you set `self.conv_offset = nn.Conv2d(in_channels, 2*3*3, 3, 1, 1)`, and I don't quite understand the meaning of `2*3*3`. There are also the input and output channels, `in_channels=768*5, cnn_channels=256*3, out_channels=256*3`; why do you multiply by 5 and 3, respectively?

xiaoyudanaa • May 21 '22 09:05

In deform_fusion, `self.conv_offset` computes the offsets for the deformable convolution. For each pixel of the input feature map we predict `2*3*3` offsets: the 2 corresponds to the x and y offset components, and the 3*3 to the size of the convolution kernel. In short, the output channels hold an (x, y) offset for every sampling position of a 3*3 kernel. For more details, please refer to https://pytorch.org/vision/main/generated/torchvision.ops.deform_conv2d.html. As for the multipliers, we extract features from three layers of the CNN and five layers of the Transformer, so the channel counts are multiplied by 3 and 5, respectively.
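To make the shapes concrete, here is a minimal sketch (not the authors' exact module) of an offset branch feeding `torchvision.ops.deform_conv2d`. The class name `DeformFusionSketch`, the weight initialization, and the assumption that offsets predicted from the Transformer features steer the deformable convolution over the CNN features are illustrative; only the channel arithmetic (`2*3*3` offset channels, `768*5` Transformer channels, `256*3` CNN channels) follows the discussion above.

```python
import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d

class DeformFusionSketch(nn.Module):
    """Sketch of an offset branch driving a deformable 3x3 convolution."""

    def __init__(self, in_channels=768 * 5, cnn_channels=256 * 3,
                 out_channels=256 * 3):
        super().__init__()
        # 2*3*3 = 18 output channels: an (x, y) offset pair for each of
        # the 9 sampling locations of a 3x3 kernel.
        self.conv_offset = nn.Conv2d(in_channels, 2 * 3 * 3, 3, 1, 1)
        # Weight and bias of the deformable convolution applied to the
        # CNN features (hypothetical initialization for the sketch).
        self.weight = nn.Parameter(
            torch.randn(out_channels, cnn_channels, 3, 3) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_channels))

    def forward(self, transformer_feat, cnn_feat):
        # Offsets are predicted from the Transformer features ...
        offset = self.conv_offset(transformer_feat)
        # ... and deform the 3x3 sampling grid over the CNN features.
        return deform_conv2d(cnn_feat, offset, self.weight, self.bias,
                             stride=1, padding=1)

# Example: 5 concatenated Transformer layers (768 channels each) and
# 3 CNN layers (256 channels each) at the same 28x28 resolution.
vit = torch.randn(1, 768 * 5, 28, 28)
cnn = torch.randn(1, 256 * 3, 28, 28)
out = DeformFusionSketch()(vit, cnn)
print(out.shape)  # torch.Size([1, 768, 28, 28])
```

With padding 1 and stride 1, the offset map and the output share the input's spatial size, which is why `conv_offset` can use the same 3x3, stride-1, padding-1 configuration as the deformable convolution it feeds.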

yuanygong • May 26 '22 03:05