pytorch-grad-cam
Support for 3D Conv-Net
Hi all,
Thank you for developing such a nice repo. I've been using it in many of my projects for network explainability, and it has been incredibly convenient!
Recently, I've been working with medical datasets using 3D-UNet. However, I noticed that 3D convolution is not yet supported in this library, and there are issues like #351 requesting this feature. Therefore, I made several changes to GradCAM and BaseCAM to extend GradCAM to support 3D images.
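In essence, the change is small: Grad-CAM's channel weights come from averaging the gradients over all spatial dimensions, so for a 3D network the average runs over (depth, height, width) instead of (height, width). A rough sketch of the idea (not the exact diff; parameter names differ between library versions):

```python
import numpy as np
from pytorch_grad_cam.base_cam import BaseCAM

class GradCAM(BaseCAM):
    def get_cam_weights(self, input_tensor, target_layer,
                        targets, activations, grads):
        # 2D conv net: gradients are (batch, channels, H, W)
        if len(grads.shape) == 4:
            return np.mean(grads, axis=(2, 3))
        # 3D conv net: gradients are (batch, channels, D, H, W)
        elif len(grads.shape) == 5:
            return np.mean(grads, axis=(2, 3, 4))
        else:
            raise ValueError("Invalid grads shape: expected a 4D or 5D tensor.")
```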
Please let me know if you have any questions or suggestions regarding the changes I've implemented. I'm excited to contribute to this project and look forward to your feedback!
Hey, sorry for the late reply. Thanks a lot for this functionality, this will be great to merge.
Is there a way to share an example use case for this: maybe some model and input image example, or an image example for the readme?
@jacobgil Thanks for your reply!
> Is there a way to share an example use case for this: maybe some model and input image example, or an image example for the readme?
I added an animation of Grad-CAM-visualized CT scans to the readme. Hope this makes it clearer.
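For a runnable starting point, a minimal usage sketch looks like this. The toy Conv3d model and random volume below are placeholders for illustration, not part of the PR; with the 3D support in place, the CAM comes back with one value per voxel:

```python
import torch
from pytorch_grad_cam import GradCAM
from pytorch_grad_cam.utils.model_targets import ClassifierOutputTarget

# Placeholder 3D model: any network containing a Conv3d layer works.
model = torch.nn.Sequential(
    torch.nn.Conv3d(1, 8, kernel_size=3, padding=1),
    torch.nn.ReLU(),
    torch.nn.AdaptiveAvgPool3d(1),
    torch.nn.Flatten(),
    torch.nn.Linear(8, 2),
)
model.eval()

# A (batch, channel, depth, height, width) volume, e.g. one CT scan.
input_tensor = torch.randn(1, 1, 24, 224, 224)

cam = GradCAM(model=model, target_layers=[model[0]])  # the Conv3d layer
grayscale_cam = cam(input_tensor=input_tensor,
                    targets=[ClassifierOutputTarget(1)])
print(grayscale_cam.shape)  # (1, 24, 224, 224): (batch, depth, height, width)
```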
@kevinkevin556 Thanks for providing the code for applying Grad-CAM to 3D CNNs!
I have used your code to get the Grad-CAM outputs. My input 3D image tensor size is (1, 1, 24, 224, 224), representing (batch, channel, depth, height, width), and the grayscale_cam output size is (1, 24, 224, 224). I'm curious: if I take one of the outputs at, for example, depth=11, i.e. outputs[0, :][11, :, :] (depth, height, width), will it correspond to input image[:, 11, :, :] (channel, depth, height, width)? I ask because every depth slice of the output heatmap looks the same.
Looking forward to your reply, thanks!
> I have used your code to get the Grad-CAM outputs. My input 3D image tensor size is (1, 1, 24, 224, 224), representing (batch, channel, depth, height, width), and the grayscale_cam output size is (1, 24, 224, 224). I'm curious: if I take one of the outputs at, for example, depth=11, i.e. outputs[0, :][11, :, :] (depth, height, width), will it correspond to input image[:, 11, :, :] (channel, depth, height, width)? I ask because every depth slice of the output heatmap looks the same.
@Syax19 Sorry for the late reply. I'm glad to hear that someone is using it 😄
Although I followed MONAI's convention of ordering dimensions as (height, width, depth), the output dimensions still correspond to those of your input tensor, since no dimensions are swapped when computing Grad-CAM. Therefore, the grayscale_cam of size (1, 24, 224, 224) represents (batch, depth, height, width) in your case.
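Concretely, pairing a CAM slice with the matching input slice could look like this, using the library's show_cam_on_image overlay helper (the random arrays below stand in for your tensors, and the depth index is arbitrary):

```python
import numpy as np
from pytorch_grad_cam.utils.image import show_cam_on_image

# Stand-ins for the tensors discussed above (shapes only, random content):
grayscale_cam = np.random.rand(1, 24, 224, 224)  # (batch, depth, H, W)
volume = np.random.rand(1, 1, 24, 224, 224)      # (batch, channel, D, H, W)

d = 11  # arbitrary depth index

cam_slice = grayscale_cam[0, d]  # (224, 224) heatmap at depth d
img_slice = volume[0, 0, d]      # matching input slice at the same depth

# show_cam_on_image expects a float RGB image in [0, 1], so normalize
# the slice and stack it into three channels.
img_slice = (img_slice - img_slice.min()) / (img_slice.max() - img_slice.min() + 1e-8)
rgb = np.stack([img_slice] * 3, axis=-1).astype(np.float32)

overlay = show_cam_on_image(rgb, cam_slice, use_rgb=True)  # (224, 224, 3) uint8
```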
@jacobgil Any update on this feature?
This is incredible functionality, thank you so much for contributing this, and sorry for being so late with my reply. I really want to merge this. The .gif file weighs 24 MB, which is a bit much; I will look into resizing it.
@kevinkevin556 merged!! better late than never. Thank you so much for this contribution!