pytorch-grad-cam

CCT-MODEL: axis 3 is out of bounds for array of dimension 3

Open rzamarefat opened this issue 2 years ago • 5 comments

Hi, thank you for this awesome repo. I am trying to get the GradCAM for the CCT (Compact Convolutional Transformer) taken from here. I give the model a tensor of shape [1, 3, 224, 224] and the following error comes up:

File "/home/marefat/projects/NSFW/venv/lib/python3.8/site-packages/numpy/core/_methods.py", line 78, in _count_reduce_items
    items *= arr.shape[mu.normalize_axis_index(ax, arr.ndim)]
numpy.AxisError: axis 3 is out of bounds for array of dimension 3

I am not sure about the following target_layers, but I have tried many different layers and I still get the error.

target_layers = [model.cct_model.classifier.blocks[-1].norm1]
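The traceback is consistent with the CAM weights being computed as a mean over axes 2 and 3. That assumes CNN-style activations of shape (batch, channels, height, width), whereas a transformer block layer such as norm1 typically outputs (batch, tokens, channels). The sketch below reproduces only that mismatch in NumPy. The shapes are illustrative assumptions, not values read from the model in this issue.

```python
import numpy as np

# Assumed, illustrative shapes:
#   CNN layer output:         (batch, channels, height, width)
#   transformer block output: (batch, tokens, channels)
cnn_grads = np.zeros((1, 384, 14, 14))
token_grads = np.zeros((1, 196, 384))

# Averaging over the two spatial axes works for a 4-D array.
weights = np.mean(cnn_grads, axis=(2, 3))
print(weights.shape)

# The same reduction on a 3-D array has no axis 3 to average over.
message = None
try:
    np.mean(token_grads, axis=(2, 3))
except (ValueError, IndexError) as err:  # AxisError subclasses both
    message = str(err)
    print(message)
```

This is why a reshape_transform that maps the token sequence back to a 4-D (batch, channels, height, width) tensor makes the error go away.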

Any help would be appreciated.

rzamarefat avatar Oct 11 '22 13:10 rzamarefat

I have included a reshape_transform in GradCAM and that solved the error. But how can I set the width and height arguments of the reshape_transform function correctly?

rzamarefat avatar Oct 11 '22 14:10 rzamarefat

Hi,

Can you please clarify the question? :)

You can define your own reshape_transform and pass it. Do you know the width/height in advance?

jacobgil avatar Oct 15 '22 16:10 jacobgil

The main problem is that I have provided the GradCAM instance with a reshape_transform function, but I don't know the suitable width and height. Imagine that I have set the width and height to 10. The following error occurs:

RuntimeError: shape '[1, 10, 10, 384]' is invalid for input of size 74880

If I set them to, for instance, width=20 and height=30, I get the following error:

RuntimeError: shape '[1, 30, 20, 384]' is invalid for input of size 74880

So my workaround is to divide 74880 by 384, which gives 195. The number 195 can be factored as 13*15, so I set the width and height arguments of the reshape_transform function to 13 and 15 respectively, and it runs. BUT AS YOU CAN SEE, THIS DOES NOT HAVE ANY LOGIC, and I think the resulting Grad-CAM that I get with this approach is completely misleading and incorrect.

Please note that the input of my model is (batch, 3, 224, 224)
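One possible explanation for the odd count of 195 tokens, offered as a guess rather than a confirmed diagnosis: 195 is one less than 196 = 14*14. The ViT example in this repo drops the first (class) token with tensor[:, 1:, :] before reshaping. CCT pools over the sequence and has no class token, so a transform copied from the ViT example would discard a real image patch and leave 195 tokens. The sketch below estimates the token grid from the tokenizer's conv and pooling settings, and reshapes without dropping any token. The tokenizer hyperparameters (two conv layers, kernel 7, stride 2, padding 3, followed by 3x3 max-pooling with stride 2 and padding 1) are assumptions for a "7x2" CCT variant and should be checked against the actual model. tokenizer_grid and conv_out are hypothetical helper names.

```python
import math


def conv_out(size, kernel, stride, padding):
    """Output length of a conv or pooling layer along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def tokenizer_grid(size, n_layers=2, kernel=7, stride=2, padding=3,
                   pool_kernel=3, pool_stride=2, pool_padding=1):
    """Side length of the token grid after n_layers of conv + max-pool.

    The defaults are assumptions (a 7x2 tokenizer); read the real
    values from your model's tokenizer before trusting the result.
    """
    for _ in range(n_layers):
        size = conv_out(size, kernel, stride, padding)
        size = conv_out(size, pool_kernel, pool_stride, pool_padding)
    return size


def reshape_transform(tensor, height=None, width=None):
    """Map (batch, tokens, channels) to (batch, channels, height, width).

    Keeps every token: nothing is sliced off for a class token.
    Expects a torch tensor (uses .reshape and .permute).
    """
    batch, tokens, channels = tensor.shape
    if height is None or width is None:
        side = math.isqrt(tokens)
        if side * side != tokens:
            raise ValueError(f"{tokens} tokens do not form a square grid")
        height = width = side
    result = tensor.reshape(batch, height, width, channels)
    # Move channels to dimension 1, as in CNN activations.
    return result.permute(0, 3, 1, 2)


side = tokenizer_grid(224)
print(side, side * side)  # 14 196
print(74880 // 384)       # 195, one token short of 14 * 14
```

If the assumptions hold, passing reshape_transform=reshape_transform to GradCAM with height=14 and width=14 (and no token slicing) should give a spatially meaningful map, whereas a 13x15 grid shifts every token to the wrong position.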

rzamarefat avatar Oct 20 '22 14:10 rzamarefat

Any ideas for solving the issue would be appreciated.

rzamarefat avatar Dec 13 '22 11:12 rzamarefat

Have you solved this problem?

Rainydu184 avatar Aug 10 '23 15:08 Rainydu184