
the multi-input issue

Open zhouqunbing opened this issue 2 years ago • 9 comments

Hello, if I want to use two inputs to segment a picture, how should I modify the code? E.g. RGB: (B, 3, H, W), Depth: (B, 1, H, W). I use the RGB and depth as the input and finally get the segmentation map, so I want to see the CAM map of some layers in the net.

zhouqunbing avatar Oct 08 '22 09:10 zhouqunbing

Hi, Some clarifying questions:

  • Do you want to get the CAM with respect to the RGB pixels, the Depth pixels, or both?
  • Is the input tensor actually (B, 4, H, W) (4 = 3 + 1, i.e. they are concatenated), or is it something more complex?

jacobgil avatar Oct 08 '22 14:10 jacobgil

Thank you for your reply. 1: I want to get the CAM of both, or of the RGB pixels. 2: The input RGB and depth are fed into ResNets separately; for example, the RGB is fed into one ResNet and the depth is fed into another ResNet (the two ResNets are the same). Then I operate inside the ResNets, and finally I get one output whose dimension is (B, 512, H//32, W//32).

zhouqunbing avatar Oct 09 '22 01:10 zhouqunbing

I'm not sure what "separately" means here. If the depth is grayscale (one channel), how can it be fed into the same ResNet? Will it be duplicated to have 3 channels?

The details of this matter for the solution. Is there a chance you can provide some code or pseudocode showing what the forward pass looks like?

jacobgil avatar Oct 09 '22 05:10 jacobgil

1: The depth is a grayscale picture; its pixels range from 0 to 255, but the value of each pixel stands for the distance between the camera and the object. 2: The structure may be like this: (image attached in the original comment) 3: The forward function is like this:

def forward(self, rgb, depth):  # rgb: (B, 3, H, W), depth: (B, 1, H, W); F is torch.nn.functional
    rgb = self.encoder_rgb.forward_first_conv(rgb)
    depth = self.encoder_depth.forward_first_conv(depth)

    if self.fuse_depth_in_rgb_encoder == 'add':
        fuse = rgb + depth
    else:
        fuse = self.se_layer0(rgb, depth)

    rgb = F.max_pool2d(fuse, kernel_size=3, stride=2, padding=1)
    depth = F.max_pool2d(depth, kernel_size=3, stride=2, padding=1)

    # block 1
    rgb = self.encoder_rgb.forward_layer1(rgb)
    depth = self.encoder_depth.forward_layer1(depth)
    if self.fuse_depth_in_rgb_encoder == 'add':
        fuse = rgb + depth
    else:
        fuse = self.se_layer1(rgb, depth)
    skip1 = self.skip_layer1(fuse)

    # block 2
    rgb = self.encoder_rgb.forward_layer2(fuse)
    depth = self.encoder_depth.forward_layer2(depth)
    if self.fuse_depth_in_rgb_encoder == 'add':
        fuse = rgb + depth
    else:
        fuse = self.se_layer2(rgb, depth)
    skip2 = self.skip_layer2(fuse)

    # block 3
    rgb = self.encoder_rgb.forward_layer3(fuse)
    depth = self.encoder_depth.forward_layer3(depth)
    if self.fuse_depth_in_rgb_encoder == 'add':
        fuse = rgb + depth
    else:
        fuse = self.se_layer3(rgb, depth)
    skip3 = self.skip_layer3(fuse)

    # block 4
    rgb = self.encoder_rgb.forward_layer4(fuse)
    depth = self.encoder_depth.forward_layer4(depth)
    if self.fuse_depth_in_rgb_encoder == 'add':
        fuse = rgb + depth
    else:
        fuse = self.se_layer4(rgb, depth)
    # (the rest of the forward pass and the return statement were cut off in the original snippet)

zhouqunbing avatar Oct 09 '22 06:10 zhouqunbing
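
Since this forward takes two tensors while pytorch-grad-cam passes a single input_tensor to the model, one common workaround is a thin wrapper module that splits a concatenated (B, 4, H, W) tensor back into RGB and depth. A minimal sketch (the wrapper name is hypothetical, and model stands for the two-input network above):

import torch
import torch.nn as nn

class RGBDInputWrapper(nn.Module):
    """Adapt the two-input (rgb, depth) model to the single-tensor
    interface that pytorch-grad-cam expects."""
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, x):
        # x: (B, 4, H, W) -- the 3 RGB channels concatenated with the depth channel
        rgb, depth = x[:, :3], x[:, 3:]
        return self.model(rgb, depth)

The CAM object would then be fed input_tensor = torch.cat([rgb, depth], dim=1).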

Got it! If self.fuse_depth_in_rgb_encoder is not 'add', I think the easiest way to start would be to set target_layer = se_layer4. That would visualize the combined heatmap for both inputs.

If you want to visualize only the RGB branch, for example, you can set target_layer = encoder_rgb.forward_layer4.

Does this work?
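
For reference, a minimal sketch of how that could be wired up, assuming a recent version of the library (the targets API), that se_layer4 is an nn.Module, that the full model returns segmentation logits of shape (B, num_classes, H, W), and reusing the RGBDInputWrapper sketched above (model, rgb and depth are the user's network and input tensors). The SemanticSegmentationTarget here follows the pattern from the repository's segmentation tutorial:

import numpy as np
import torch
from pytorch_grad_cam import GradCAM
from pytorch_grad_cam.utils.image import show_cam_on_image

class SemanticSegmentationTarget:
    """Sum one class's logits over a binary mask, so the CAM explains
    that class's predicted region."""
    def __init__(self, category, mask):
        self.category = category
        self.mask = torch.from_numpy(mask).float()

    def __call__(self, model_output):
        # model_output: (num_classes, H, W) for a single sample
        return (model_output[self.category, :, :] * self.mask.to(model_output.device)).sum()

model.eval()
wrapped = RGBDInputWrapper(model)                 # wrapper from the earlier sketch
input_tensor = torch.cat([rgb, depth], dim=1)     # (1, 4, H, W), assuming a batch of one

# Combined heatmap: hook the last fusion module.
target_layers = [model.se_layer4]

# Build a target from the predicted mask for one class of interest (the class id is hypothetical).
with torch.no_grad():
    pred_mask = wrapped(input_tensor)[0].argmax(dim=0).cpu().numpy()
class_id = 1
targets = [SemanticSegmentationTarget(class_id, (pred_mask == class_id).astype(np.float32))]

cam = GradCAM(model=wrapped, target_layers=target_layers)
grayscale_cam = cam(input_tensor=input_tensor, targets=targets)[0]   # (H, W), values in [0, 1]

rgb_float = rgb[0].permute(1, 2, 0).cpu().numpy()  # assumes rgb is already scaled to [0, 1]
visualization = show_cam_on_image(rgb_float, grayscale_cam, use_rgb=True)

For an RGB-only heatmap, the same call works with target_layers pointing at the last module inside encoder_rgb (forward_layer4 itself is a method, so the hooked layer has to be the underlying nn.Module).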

jacobgil avatar Oct 12 '22 07:10 jacobgil

@jacobgil Thank you for your reply. Practice is the sole criterion for testing truth. I will revise the code.

zhouqunbing avatar Oct 12 '22 07:10 zhouqunbing

Hello, did you solve this problem? I also need to visualize the heatmap of a siamese network.

IceHowe avatar Apr 03 '23 03:04 IceHowe

Has this problem been solved? How do you compute the heatmap with two inputs?

chen-yuu avatar Dec 20 '23 09:12 chen-yuu

Sorry, it was not solved; I didn't end up using it.

zhouqunbing avatar Dec 20 '23 10:12 zhouqunbing