pytorch-grad-cam Can GradCAM be applied to CNN regression models with multiple input images?

Hello, I am working on a CNN model designed for regression tasks with multiple input images (channels). The goal is to predict a target output image based on multiple predictor images. I would like to know if GradCAM can be used to quantify the contribution of each input image to the predictions. If so, could you please provide guidance or examples on how to implement this? Thank you very much for your time and assistance!

Dec 09 '24 11:12 njujinchun

Hi, sorry for the late reply. I hope this is still relevant.

Do you want to quantify the contribution of each input image as a whole, or do you want visual contributions inside each image (e.g., highlighting which pixels contributed more in image #3)?

For the first option you might need something custom, depending on how the images look. One way would be to use ablation: zero out each channel in turn (or apply a similar ablation, such as replacing it with a constant value or smoothing the image), run the model, and check the confidence drop (for a regression model, how much the predicted output changes).
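
A minimal sketch of this ablation check in plain PyTorch, assuming the predictor images are stacked along the channel dimension of a tensor shaped (batch, channels, height, width). The function name `channel_ablation_scores` and the toy convolution standing in for the real network are hypothetical and used only for illustration:

```python
import torch
import torch.nn as nn


def channel_ablation_scores(model, x, fill_value=0.0):
    """Return, per input channel, the mean absolute change in the model
    output when that channel is replaced by a constant fill value."""
    model.eval()
    scores = []
    with torch.no_grad():
        baseline = model(x)
        for c in range(x.shape[1]):
            ablated = x.clone()
            ablated[:, c] = fill_value  # overwrite one predictor image
            change = (model(ablated) - baseline).abs().mean().item()
            scores.append(change)
    return scores


# Toy stand-in for the real network (assumption): 2 input channels,
# 1 output image of the same spatial size.
toy_model = nn.Conv2d(2, 1, kernel_size=3, padding=1)
x = torch.randn(4, 2, 16, 16)
scores = channel_ablation_scores(toy_model, x)
print(scores)  # one non-negative score per input channel
```

A larger score suggests the output depends more strongly on that channel. The result depends on the chosen fill value, so it may be worth comparing a few ablations (zero, channel mean, blurred image) before drawing conclusions.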

Dec 20 '24 17:12 jacobgil

Hi,

Thank you very much for your reply and clarification. I greatly appreciate your time and assistance.

The problem I am addressing involves using a CNN to model the regression relationship between remotely sensed evapotranspiration (target) and temperature & precipitation (input) images for land regions (with ocean pixels assigned a constant value of zero). Specifically, I am interested in identifying which input pixels contribute the most to predicting evapotranspiration in a region like the Amazon River Basin.

Based on your explanation, it seems my problem aligns with the second case you mentioned—analyzing visual contributions within each image.

Thank you again for your insights, and I look forward to any further guidance or suggestions you may have.

Best regards,

Shaoxing

Dec 20 '24 22:12 njujinchun

I would start with the low-hanging fruit, which is getting a CAM image that identifies any relevant pixels. Does the model have a single output (the regression output)? If yes, you can use RawScoresOutputTarget to get a CAM for the pixels that push the regression output higher.

Following the example in the Readme:

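from pytorch_grad_cam import GradCAM
from pytorch_grad_cam.utils.model_targets import RawScoresOutputTarget

# model, target_layers and input_tensor are defined as in the README example
targets = [RawScoresOutputTarget()]
with GradCAM(model=model, target_layers=target_layers) as cam:
  grayscale_cam = cam(input_tensor=input_tensor, targets=targets)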

The problem is that this does not tell us whether a pixel was important in the temperature image, the precipitation image, or both. It only tells us that the pixel was relevant for at least one of them. If this distinction matters to you, one thing you could do is zero out the temperature image (or use an alternative ablation that makes sense to you, like replacing it with a constant value or blurring it), and then compute the CAM, which should then correspond to the CAM for precipitation.
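
A rough sketch of that ablate-then-CAM idea in plain PyTorch. Everything named here is hypothetical: `ChannelAblatedModel` is an illustrative wrapper, temperature is assumed to be channel 0, and the toy convolution stands in for the real network. Because a backward pass needs a scalar, the sketch also includes `RegionMeanTarget`, a possible custom target for the case where the model returns a single-channel output image and the quantity of interest is the mean prediction inside a region mask:

```python
import torch
import torch.nn as nn


class ChannelAblatedModel(nn.Module):
    """Hypothetical wrapper: overwrites chosen input channels with a constant
    before calling the wrapped model, so a CAM reflects the other channels."""

    def __init__(self, model, ablate_channels, fill_value=0.0):
        super().__init__()
        self.model = model
        self.ablate_channels = list(ablate_channels)
        self.fill_value = fill_value

    def forward(self, x):
        x = x.clone()  # leave the caller's tensor untouched
        x[:, self.ablate_channels] = self.fill_value
        return self.model(x)


class RegionMeanTarget:
    """Hypothetical CAM target: reduces one sample's single-channel output
    image to a scalar, the mean prediction inside a binary (H, W) mask."""

    def __init__(self, mask):
        self.mask = mask.float()  # must live on the same device as the output

    def __call__(self, model_output):
        # model_output: one sample's output image, assumed shape (1, H, W)
        return (model_output * self.mask).sum() / self.mask.sum()


# Toy check (assumptions: channel 0 = temperature, channel 1 = precipitation).
toy_model = nn.Conv2d(2, 1, kernel_size=3, padding=1)
precip_only = ChannelAblatedModel(toy_model, ablate_channels=[0])
x = torch.randn(1, 2, 8, 8)
mask = torch.zeros(8, 8)
mask[2:6, 2:6] = 1.0  # stand-in for a region such as a river basin
score = RegionMeanTarget(mask)(precip_only(x)[0])
score.backward()  # works because the target is a scalar
```

With pytorch-grad-cam, the wrapper would presumably be passed as `model=ChannelAblatedModel(model, [0])` together with the same `target_layers`, and `targets=[RegionMeanTarget(mask)]` would take the place of the raw-scores target. Swapping the ablated channel index gives the corresponding CAM for temperature.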

Dec 21 '24 12:12 jacobgil

Thank you very much for your detailed reply and explanations. I apologize for the delay in responding, as I just returned from holiday. My model produces only a single output image. I will implement your suggestions and let you know if I need further assistance. Thank you again for your support—I truly appreciate it!

Dec 30 '24 19:12 njujinchun
