
Question on experiences with Efficiency

Open MarcCoru opened this issue 4 years ago • 2 comments

Hi @lucidrains, thanks a lot for providing this implementation so quickly. I have a question regarding your (or others') experience with the efficiency of lambda layers. I tried to implement a LambdaUNet in which I replaced the 3×3 conv layers with lambda layers and average pooling.

The Conv-UNet has 17M parameters while the LambdaUNet has only 3M. Still, inference and training take much longer with the LambdaUNet than with the Conv-UNet (approx. 1 s for the Conv-UNet vs. 10 s for the LambdaUNet). I also used a receptive field of r=23. I am not sure where this parameter value originates or what receptive field should be set. In the paper, the authors mention "controlled experiments"; I assume they chose the lambda layer hyperparameters to be (in some way) comparable to the conv parameters, but that is not very clear from the paper (at least on my initial reading).
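As a side note on the parameter gap: a back-of-the-envelope count shows why a lambda layer can have far fewer weights than a 3×3 conv of the same width, even though it can still be slower at runtime (the r×r positional interaction is computed at every spatial location, which costs compute without adding weights). The sketch below is illustrative only, not the library's exact count: the formulas follow the projections described in the paper (query/key/value maps plus an r×r relative-position embedding), using the constructor argument names from the lambda-networks README (`dim`, `dim_out`, `r`, `dim_k`, `heads`, `dim_u`), and it omits normalization parameters.

```python
# Rough weight counts: 3x3 conv vs. lambda layer with local positional lambdas.
# Pure Python; channel sizes below are hypothetical examples.

def conv3x3_params(c_in: int, c_out: int, bias: bool = False) -> int:
    """Weights of a single 3x3 convolution."""
    return c_in * c_out * 9 + (c_out if bias else 0)

def lambda_layer_params(dim: int, dim_out: int, r: int = 23,
                        dim_k: int = 16, heads: int = 4, dim_u: int = 1) -> int:
    """Approximate weights of a lambda layer (normalization omitted)."""
    v = dim_out // heads               # value dimension per head
    to_q = dim * dim_k * heads         # query projection
    to_k = dim * dim_k * dim_u         # key projection
    to_v = dim * v * dim_u             # value projection
    pos = r * r * dim_k * dim_u        # r x r relative position embedding
    return to_q + to_k + to_v + pos

if __name__ == "__main__":
    c = 256  # example channel width
    print("conv 3x3:   ", conv3x3_params(c, c))        # 589824
    print("lambda r=23:", lambda_layer_params(c, c))   # 45328
```

Note that the positional embedding is the only term that grows with r, and it grows quadratically, so a large receptive field adds little to the parameter count but a lot to the per-pixel compute, which is consistent with a smaller-but-slower LambdaUNet.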

I was wondering if others share my experience of slower training and inference when blindly replacing conv layers with lambda layers. Maybe someone can share their expertise on how to configure my LambdaUNet so it is comparable to a regular UNet, in order to reproduce the performance and efficiency results from the paper. Thanks again.

MarcCoru avatar Oct 20 '20 00:10 MarcCoru

@MarcCoru the paper hasn't even been reviewed yet, so I think we are all in uncharted territory. Let's just keep this open so people can add to the discussion.

lucidrains avatar Oct 20 '20 02:10 lucidrains

Table 12 in the appendix of the OpenReview version provides reference numbers for inference speed. It seems that as more convolution layers are replaced, inference slows down drastically.

yjxiong avatar Oct 20 '20 22:10 yjxiong