EfficientNet-PyTorch
Issues with applying GradCam on this implementation
Hello there. I've been working with this repo for quite some time now, pretty much since it was released. I'm trying to extract GradCAM images as described in https://arxiv.org/abs/1610.02391, and I'm running into problems. Some CAM images look quite nice, but a few problems occur consistently. The major one is checkerboard patterns, as can be seen below:

I'm 99% sure my GradCAM implementation is correct, but I suspect a few things could be causing this: the padded conv2d functions, or maybe swish. I generate the images from the features just before the global average pooling; with EffNet-B7 the shape of that tensor is 1, 2560, 18, 18. From what I found, B0 does not do this, and if I pick a lower-level feature tensor the checkerboard pattern also vanishes, but then the focus is on the low-level features of the image. Does anyone have any idea? Does the original implementation have the same problem?
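For anyone who wants to compare against their own code, here is a minimal sketch of the GradCAM computation described above (gradients averaged per channel, weighted sum of the feature maps, then ReLU). It is not the code used in this issue. The `backbone` and `head` modules are hypothetical stand-ins for "everything up to the features before global average pooling" and the classifier; `grad_cam` is an illustrative helper name, not part of this repo.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def grad_cam(features, logits, class_idx):
    """Grad-CAM map from a feature tensor and the logits computed from it.

    features: (1, C, H, W) activations that are part of the autograd graph
    logits:   (1, num_classes) scores computed from `features`
    """
    # d(score) / d(features), same shape as `features`
    grads = torch.autograd.grad(logits[0, class_idx], features)[0]
    # Channel weights: gradients averaged over the spatial dimensions
    weights = grads.mean(dim=(2, 3), keepdim=True)   # (1, C, 1, 1)
    # Weighted sum over channels, then ReLU to keep positive evidence only
    cam = F.relu((weights * features).sum(dim=1))     # (1, H, W)
    # Scale to [0, 1] for display; the epsilon avoids division by zero
    return (cam / (cam.max() + 1e-8)).detach()


# Hypothetical stand-ins (assumption): a tiny conv "backbone" and linear "head".
torch.manual_seed(0)
backbone = nn.Conv2d(3, 8, kernel_size=3, padding=1)
head = nn.Linear(8, 5)

x = torch.randn(1, 3, 18, 18)
features = backbone(x)                     # (1, 8, 18, 18)
logits = head(features.mean(dim=(2, 3)))   # global average pooling, then FC
cam = grad_cam(features, logits, class_idx=int(logits.argmax()))
print(cam.shape)  # torch.Size([1, 18, 18])
```

With the real model, `features` would be the 1, 2560, 18, 18 tensor mentioned above, and the resulting 18x18 map is what gets upsampled onto the input image.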
Huh, thanks for the interesting issue. I don't know whether applying GradCAM to the original implementation gives the same results, and I would be interested to find out. If anyone else has tried this, drop a comment!
Hello, since I opened the issue I dug a bit deeper. I collected all the outputs from the MBConvBlock modules for one image, plus the output of the "head" convolution, calculated the channel-wise mean of each, and saved the results as images. So for EfficientNet there are 55 self._blocks outputs and 1 head output.
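In case someone wants to reproduce this kind of dump, one way to do it is with forward hooks. This is only a sketch under assumptions, not the script used here: `TinyNet` is a hypothetical stand-in that exposes a `_blocks` list like the attribute named above, and `collect_mean_maps` is an illustrative helper name.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in model with a `_blocks` list (assumption)."""

    def __init__(self):
        super().__init__()
        self._blocks = nn.ModuleList(
            [nn.Conv2d(3, 4, 3, padding=1), nn.Conv2d(4, 4, 3, stride=2, padding=1)]
        )

    def forward(self, x):
        for block in self._blocks:
            x = block(x)
        return x


def collect_mean_maps(model, blocks, x):
    """Run one forward pass and return the channel-wise mean of each block output."""
    maps = []

    def hook(module, inputs, output):
        # (N, C, H, W) -> (N, H, W): average over the channel dimension
        maps.append(output.detach().mean(dim=1))

    handles = [b.register_forward_hook(hook) for b in blocks]
    try:
        with torch.no_grad():
            model(x)
    finally:
        for h in handles:
            h.remove()  # always remove hooks so later passes are unaffected
    return maps


model = TinyNet().eval()
maps = collect_mean_maps(model, model._blocks, torch.randn(1, 3, 16, 16))
print([tuple(m.shape) for m in maps])  # [(1, 16, 16), (1, 8, 8)]
```

Each entry in `maps` is a single-channel image that can be saved and inspected for checkerboard patterns; the head convolution can be hooked the same way by adding it to `blocks`.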
There is almost no problem in the first 5 outputs, but the 6th starts to show checkerboard patterns, as can be seen below:
It goes on like this until the downsampling, and the checkerboard patterns get sharper:
After the downsampling, speaking for myself, I can no longer detect checkerboard patterns by eye. Here is the output:

That is as far as this experiment goes, but since you are interested I would like to share more observations. In this repo, https://github.com/sidml/EfficientNet-GradCam-Visualization, someone else tried GradCAM and got similar results. He probably did not use the advised input resolution of 600x600, so his checkerboard patterns do not look as severe. Another thing to notice: in the EfficientNet-B3 GradCAM results there is almost always a highlighted point in the bottom-right corner. This is probably due to the input resolution (224x224): I used the same input size when creating GradCAM images for another project and got the same corner highlight, which I fixed by changing the resolution.
I wrote this in the hope it will help somebody figure the problem out :D Have a good day.
This weird bottom-right highlight happens in EfficientNet's paper itself:

You are right, I had never noticed that. The authors probably did not notice "that particular" example.