pointnet.pytorch
Conv1d kernel size in transformer nets
Comparing your code to the official TensorFlow implementation, I believe the kernel size ought to be 3 for conv1 in the transformer network code (starting here).
The official implementation convolves 64 1x3 filters over each of the N 1x3 points, yielding 64 scalar values per point (i.e. an N x 64 matrix). Your code uses a filter size of 1.
Perhaps you could clarify?
Also, as a micro-optimization: instead of performing two transpose operations, you could just left-multiply by the transform, which requires no transposes at all. The network simply ends up learning the transpose of the transform we envision.
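A quick sketch of why the two forms are equivalent (tensor names here are illustrative, not taken from the repo): right-multiplying the transposed points by `trans` is the same as left-multiplying the untransposed points by `trans.T`, so a network free to learn either matrix can skip the transposes.

```python
import torch

B, N = 2, 16
x = torch.randn(B, 3, N)        # points as (batch, channels, num_points)
trans = torch.randn(B, 3, 3)    # a learned 3x3 transform per batch element

# Original pattern: two transposes around a right-multiplication.
y1 = torch.bmm(x.transpose(2, 1), trans).transpose(2, 1)

# Suggested pattern: one left-multiplication, no transposes.
# In training, the network would simply learn trans.T in place of trans;
# here we transpose explicitly just to demonstrate the equivalence.
y2 = torch.bmm(trans.transpose(2, 1), x)

print(torch.allclose(y1, y2, atol=1e-6))  # True
```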
@meder411 I think the difference between the PyTorch code and the official TF code is due to the dimension ordering: PyTorch uses N, C, H, W, whereas TF uses N, H, W, C. However, I have another question about the conv layers. In PyTorch, a Linear (FC) layer can have arbitrary intermediate dimensions, so why don't we just use FC layers to implement PointNet? The author still uses conv layers, although they lead to the same result.
@zeal-github, in my opinion, using FC layers means the input is 2-dimensional. But apart from the dimensions, there is indeed no difference between FC layers and conv layers with a 1x1 kernel.
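This equivalence is easy to verify: a `Conv1d` with kernel size 1 is just a per-point linear map, so if you copy its weights into a `Linear` layer (squeezing out the kernel dimension) the two produce identical outputs. A minimal check, with dimensions chosen to match PointNet's first layer:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
B, C_in, C_out, N = 2, 3, 64, 128

conv = nn.Conv1d(C_in, C_out, kernel_size=1)
fc = nn.Linear(C_in, C_out)

# Copy weights so both layers compute the same map.
# conv.weight has shape (C_out, C_in, 1); fc.weight has shape (C_out, C_in).
with torch.no_grad():
    fc.weight.copy_(conv.weight.squeeze(-1))
    fc.bias.copy_(conv.bias)

x = torch.randn(B, C_in, N)
out_conv = conv(x)                              # (B, C_out, N)
out_fc = fc(x.transpose(2, 1)).transpose(2, 1)  # same values
print(torch.allclose(out_conv, out_fc, atol=1e-5))  # True
```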
I think you are forgetting one important setting from the figure of the network architecture: the weights are shared among the perceptrons. In this case, nn.Conv1d is a better option than nn.Linear().
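One way to see the weight sharing concretely (a small sketch, not code from the repo): because a `Conv1d` with kernel size 1 applies the same weights to every point independently, permuting the input points simply permutes the per-point output features.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
conv = nn.Conv1d(3, 64, kernel_size=1)  # one shared 3->64 perceptron

x = torch.randn(1, 3, 10)               # 10 points with 3 coordinates each
out = conv(x)

# The same weights are applied to each point independently, so
# permuting the points just permutes the per-point features.
perm = torch.randperm(10)
out_perm = conv(x[:, :, perm])
print(torch.allclose(out[:, :, perm], out_perm, atol=1e-6))  # True
```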
Why is it Conv1D? As far as I can see, in the official TensorFlow implementation the author has used only Conv2D layers; am I missing something? Could someone point out where the author has used 1D convolutions? Thanks!
@meder411, have you figured out why? I am also confused about this. The TF implementation uses a 2D Conv with a 1x3 kernel and 64 output channels, while the PyTorch implementation uses a 1D Conv with kernel size 1 and 64 output channels.
Hi @fxia22,
would you like to help us out? Like @ShrutheeshIR and @timothylimyl, I am also wondering why Conv1d is used when the original code uses Conv2d to perform the convolutions.
@ShrutheeshIR @dhirajsuvarna ,
I think it ends up being the same. I checked the parameter summary for the 1D Conv (PyTorch), and you can see that the first 1D conv has 64*3 + 64 parameters, which is the same as the 2D Conv from TF.
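The parameter counts can be checked directly. A `Conv1d(3, 64, 1)` and a TF-style `Conv2d` with a 1x3 kernel over a single input channel both hold 64*3 = 192 weights plus 64 biases:

```python
import torch.nn as nn

conv1d = nn.Conv1d(3, 64, kernel_size=1)       # PyTorch implementation style
conv2d = nn.Conv2d(1, 64, kernel_size=(1, 3))  # TF-style 1x3 kernel on an added dim

n1 = sum(p.numel() for p in conv1d.parameters())
n2 = sum(p.numel() for p in conv2d.parameters())
print(n1, n2)  # 256 256  (64*3 weights + 64 biases in both cases)
```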
Hello @timothylimyl, thanks. Yes, I noticed that in the TensorFlow implementation they add a dimension to the input in order to apply the 2D convolution, which is equivalent to the 1D convolution here without the added dimension.
As far as I know, the TensorFlow implementation uses conv2d in order to leverage the cuDNN optimizations, which were not available for conv1d.