pytorch-LiLaNet
[Question] Usage of padding
Thanks for providing an implementation of this architecture. I'm currently implementing a variant of it and wondered why padding is used in the conv layers, even though it isn't mentioned in the LiLaNet paper (at least I couldn't find it):
https://github.com/TheCodez/pytorch-LiLaNet/blob/f68aae9d23a18c1d5b255fc0d4e7b421f2da44fd/lilanet/model/lilanet.py#L78-L81
I think it might make sense to apply padding along the axis where the kernel size is 7, so that this dimension shrinks by the same amount as the one with kernel size 3. But that isn't mentioned in the paper either, or am I missing something?
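To make the size arithmetic concrete, here is a small standalone check (my own illustration, not code from this repo; I'm assuming a padding of 2 along the kernel-size-7 axis, since that is the value that makes the shrinkage match an unpadded 3x3 branch):

```python
import torch
import torch.nn as nn

# Arbitrary example input: (batch, channels, height, width)
x = torch.randn(1, 1, 64, 512)

# Output size per axis: out = in + 2*padding - kernel + 1
# (7, 3) kernel with padding (2, 0): height shrinks by 7 - 1 - 2*2 = 2, width by 3 - 1 = 2
conv_7x3 = nn.Conv2d(1, 1, kernel_size=(7, 3), padding=(2, 0))
# (3, 3) kernel without padding: both axes shrink by 3 - 1 = 2
conv_3x3 = nn.Conv2d(1, 1, kernel_size=3)

print(conv_7x3(x).shape)  # torch.Size([1, 1, 62, 510])
print(conv_3x3(x).shape)  # torch.Size([1, 1, 62, 510])
```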
Also, why is a padding of 1 applied in the 1x1 convolution?
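Regarding the 1x1 case, here is a quick shape check (plain PyTorch mechanics, not taken from the repo) of what padding=1 does to a 1x1 convolution: it enlarges the output by 2 in each spatial dimension, and the border values are computed from the zero padding alone:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 1, 62, 510)
conv_1x1 = nn.Conv2d(1, 1, kernel_size=1, padding=1)

y = conv_1x1(x)
print(y.shape)  # torch.Size([1, 1, 64, 512]) -- grows by 2 in each spatial dimension

# The outermost ring of the output only "sees" the zero padding,
# so each of its values is just the bias (weight * 0 + bias).
print(y[0, 0, 0, 0].item(), conv_1x1.bias.item())  # the two numbers are identical
```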
Bonus question: So both spatial dimensions are decreased by one after each LiLaBlock. Why not use an (additional) padding of 1 so that the size is preserved through the network?
I'm just a beginner in deep learning so any help is appreciated.
Okay, so I realized that the equal shrinkage along both spatial dimensions is actually required: all four parallel convolutions have to produce tensors with the same spatial dimensions, which LiLaNet needs for the subsequent stacking (concatenation).
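Just to spell that constraint out with a tiny (illustrative) check: concatenating the branch outputs along the channel dimension only works when they agree in height and width:

```python
import torch

a = torch.randn(1, 32, 62, 510)  # one branch output
b = torch.randn(1, 32, 62, 510)  # another branch with the same spatial size
print(torch.cat([a, b], dim=1).shape)  # torch.Size([1, 64, 62, 510])

c = torch.randn(1, 32, 63, 510)  # height off by one
try:
    torch.cat([a, c], dim=1)
except RuntimeError as err:
    print("cat fails when the spatial sizes differ:", err)
```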
I also realized that with the current choice of padding the spatial dimensions are indeed preserved through the network. However, this is only due to the padding of 1 in the 1x1 convolution, which seems a bit strange to me: at the edges a 1x1 convolution sees nothing but the zero padding. What about removing this padding and instead increasing the padding of the preceding conv layers, whose kernels are larger?
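To make that suggestion concrete, here is a rough sketch of what I mean (my own simplified variant, not the repo's code; I'm assuming three parallel branches with (7, 3), (3, 3) and (3, 7) kernels and omitting any normalization/activation): pad each branch so it preserves the spatial size and drop the padding from the 1x1 convolution:

```python
import torch
import torch.nn as nn

class LiLaBlockSamePad(nn.Module):
    """Sketch of the alternative described above, not the repo's implementation:
    'same' padding on the large-kernel branches, no padding on the 1x1 conv."""

    def __init__(self, in_channels, n):
        super().__init__()
        # padding = (kernel - 1) / 2 per axis keeps height and width unchanged
        self.branch1 = nn.Conv2d(in_channels, n, kernel_size=(7, 3), padding=(3, 1))
        self.branch2 = nn.Conv2d(in_channels, n, kernel_size=(3, 3), padding=(1, 1))
        self.branch3 = nn.Conv2d(in_channels, n, kernel_size=(3, 7), padding=(1, 3))
        self.conv = nn.Conv2d(3 * n, n, kernel_size=1)  # no padding needed anymore

    def forward(self, x):
        out = torch.cat([self.branch1(x), self.branch2(x), self.branch3(x)], dim=1)
        return self.conv(out)

x = torch.randn(1, 2, 64, 512)
print(LiLaBlockSamePad(2, 96)(x).shape)  # torch.Size([1, 96, 64, 512]) -- size preserved
```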