
What is performance in comparison with original implementation?

Open John1231983 opened this issue 7 years ago • 4 comments

Great implementation. Could you provide reproduced results that can be compared with the original Caffe2 implementation? Thanks

John1231983 avatar Jun 19 '18 17:06 John1231983

I think the first conv should be Conv2d. Am I right? The corrected version would look like:

    self.spatial_conv = nn.Conv2d(in_channels, intermed_channels, kernel_size=3,
                                  stride=1, padding=1, bias=bias)
    self.bn = nn.BatchNorm2d(intermed_channels)
    self.relu = nn.ReLU()
    self.temporal_conv = nn.Conv3d(intermed_channels, out_channels, temporal_kernel_size,
                                   stride=temporal_stride, padding=temporal_padding, bias=bias)

John1231983 avatar Jun 19 '18 21:06 John1231983

I think it is okay. It should be kept as Conv3d, but it effectively behaves like Conv2d because one of the kernel dimensions is 1.

yechanp avatar Oct 13 '19 08:10 yechanp
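A minimal sketch of the point above, assuming PyTorch (the channel counts and input shape are arbitrary, chosen for illustration): a Conv3d with kernel size (1, 3, 3) convolves only over height and width, so it produces the same result as a Conv2d with the same weights applied to each frame independently.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# 3D conv whose temporal kernel dimension is 1 -- it never mixes frames.
conv3d = nn.Conv3d(3, 8, kernel_size=(1, 3, 3), padding=(0, 1, 1), bias=False)
conv2d = nn.Conv2d(3, 8, kernel_size=3, padding=1, bias=False)

# Copy the 3D weights (shape [8, 3, 1, 3, 3]) into the 2D conv ([8, 3, 3, 3]).
conv2d.weight.data.copy_(conv3d.weight.data.squeeze(2))

x = torch.randn(2, 3, 4, 16, 16)  # (N, C, T, H, W)

out3d = conv3d(x)
# Apply the 2D conv frame by frame and restack along the time axis.
out2d = torch.stack([conv2d(x[:, :, t]) for t in range(x.shape[2])], dim=2)

print(torch.allclose(out3d, out2d, atol=1e-5))  # True
```

So keeping the spatial conv as Conv3d is numerically equivalent to the Conv2d version; it only changes how the weights are stored.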

`self.conv3 = SpatioTemporalResLayer(64, 128, 3, layer_sizes[1], block_type=block_type, downsample=True)` — why downsample=True? The input has 64 channels and the output has 128; I can't understand it. Can you help me? Thanks! @irhum

JinXiaozhao avatar Dec 03 '19 15:12 JinXiaozhao
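The reason, sketched below under the usual ResNet-style convention (this is a hypothetical `ResBlockSketch`, not the repo's exact code): when the channel count changes from 64 to 128, the identity shortcut cannot be added to the block's output, so the shortcut itself must project channels with a 1×1×1 conv, and the block typically also strides by 2 to halve the spatio-temporal resolution — both are triggered by downsample=True.

```python
import torch
import torch.nn as nn

class ResBlockSketch(nn.Module):
    """Hypothetical ResNet-style 3D block illustrating downsample=True."""
    def __init__(self, in_channels, out_channels, downsample=False):
        super().__init__()
        stride = 2 if downsample else 1
        self.conv1 = nn.Conv3d(in_channels, out_channels, 3, stride=stride, padding=1)
        self.conv2 = nn.Conv3d(out_channels, out_channels, 3, padding=1)
        self.relu = nn.ReLU()
        if downsample:
            # 1x1x1 conv matches both the channel count and the strided size,
            # so the residual addition is well-defined.
            self.shortcut = nn.Conv3d(in_channels, out_channels, 1, stride=stride)
        else:
            self.shortcut = nn.Identity()

    def forward(self, x):
        out = self.conv2(self.relu(self.conv1(x)))
        return self.relu(out + self.shortcut(x))

x = torch.randn(1, 64, 8, 56, 56)          # (N, C, T, H, W)
y = ResBlockSketch(64, 128, downsample=True)(x)
print(y.shape)  # torch.Size([1, 128, 4, 28, 28])
```

Without the projecting shortcut, `out + x` would fail on the channel mismatch (128 vs 64), which is why downsample=True accompanies every channel increase.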

My finding is that it's actually slower than C3D with fp16 inputs; with fp32, R(2+1)D is faster.

pytorch 1.3 cuda 10.2 cudnn 7.6.5

I think the newer cudnn is quite efficient in performing 3D convolution for fp16 inputs.

Litou1 avatar Dec 06 '19 20:12 Litou1
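For anyone who wants to check this on their own setup, here is a minimal micro-benchmark sketch (hypothetical layer sizes; results depend heavily on the GPU, cudnn version, and tensor layout — fp16 is only exercised when CUDA is available):

```python
import time
import torch
import torch.nn as nn

def bench(conv, x, iters=10):
    """Average forward time per iteration, in seconds."""
    if x.is_cuda:
        torch.cuda.synchronize()
    start = time.perf_counter()
    with torch.no_grad():
        for _ in range(iters):
            conv(x)
    if x.is_cuda:
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

conv = nn.Conv3d(64, 64, kernel_size=3, padding=1).to(device, dtype)
x = torch.randn(1, 64, 8, 56, 56, device=device, dtype=dtype)

print(f"{bench(conv, x) * 1e3:.2f} ms/iter on {device} ({dtype})")
```

Comparing the same harness with a full 3×3×3 Conv3d versus the factored (1, 3, 3) + (3, 1, 1) pair, across fp16 and fp32, would reproduce the comparison described above.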