Questions about PSPNet.
Thank you for uploading your code. It is very helpful for understanding PSPNet. I have two questions about your paper.
- You wrote
we use a pretrained ResNet model with the dilated network strategy to extract the feature map. The final feature map size is 1/8 of the input image.
in the paper. But I think the feature map size is 1/16 when you use ResNet50. Do you use only the first 3 blocks of ResNet50?
- You wrote
Then we directly upsample the low-dimension feature maps to get the same size feature as the original feature map via bilinear interpolation. Finally, different levels of features are concatenated as the final pyramid pooling global feature.
in Section 3.2 of the paper. I understand that we concatenate the resized multi-level features with the feature map extracted by ResNet-50. But after that, the feature map is still 1/8 of the input image size. How did you resize it back to the same size as the input image?
I have the same question.
The output segmentation map is 1/8 of the input size, and bilinear upsampling is used to recover the original size.
Hi there,
But I think the feature map size is 1/16 when you use ResNet50. Do you use only the first 3 blocks of ResNet50?
To get 1/8 of the input size, don't use a plain ResNet; you should use a dilated ResNet (https://arxiv.org/abs/1705.09914).
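For reference, here is a minimal sketch of the dilation trick, assuming PyTorch/torchvision rather than the original Caffe code: torchvision's ResNet constructors accept replace_stride_with_dilation, which swaps the strides of the chosen stages for dilations so the output stride drops from 32 to 8.

import torch
from torchvision.models import resnet50

# Replace the strides in the last two residual stages with dilations
# (the three flags control layer2/layer3/layer4), so the output stride
# drops from 32 to 8, as in the dilated-network strategy.
backbone = resnet50(replace_stride_with_dilation=[False, True, True])

x = torch.randn(1, 3, 224, 224)
# Run only the stem and the four residual stages (drop avgpool/fc).
x = backbone.maxpool(backbone.relu(backbone.bn1(backbone.conv1(x))))
x = backbone.layer4(backbone.layer3(backbone.layer2(backbone.layer1(x))))
print(x.shape)  # torch.Size([1, 2048, 28, 28]), i.e. 224 / 8 = 28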
But after that, the feature map is still 1/8 of the input image size. How did you resize it back to the same size as the input image?
Yeah, you're right there. That's definitely not described well in the paper. For my implementation I did the following: upscale all the pooled maps so that they have the same width/height as the output of the dilated ResNet, then concat them all, add two convs, and finally upsample this 8x to get the original image size. A sketch is below.
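In code, those steps look roughly like this. It is a PyTorch sketch rather than the official implementation; the class name and channel counts are my own reconstruction, with the bin sizes (1, 2, 3, 6) taken from the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PSPModule(nn.Module):
    """Pyramid pooling: pool to several bin sizes, reduce channels with
    1x1 convs, upsample back to the backbone's spatial size, concat."""
    def __init__(self, in_channels=2048, bins=(1, 2, 3, 6)):
        super().__init__()
        reduced = in_channels // len(bins)  # 2048 // 4 = 512 per branch
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.AdaptiveAvgPool2d(b),  # pool the map down to b x b
                nn.Conv2d(in_channels, reduced, kernel_size=1, bias=False),
                nn.BatchNorm2d(reduced),
                nn.ReLU(inplace=True),
            )
            for b in bins
        )

    def forward(self, feats):
        h, w = feats.shape[2:]
        pyramid = [feats]
        for branch in self.branches:
            pooled = branch(feats)
            # Upscale each pooled map back to the backbone's h x w.
            pyramid.append(F.interpolate(pooled, size=(h, w),
                                         mode="bilinear", align_corners=True))
        return torch.cat(pyramid, dim=1)  # 2048 + 4 * 512 = 4096 channels

# After this module, a small conv head predicts class scores at 1/8
# resolution, and a final 8x bilinear upsample recovers the input size.
out = PSPModule()(torch.randn(1, 2048, 28, 28))
print(out.shape)  # torch.Size([1, 4096, 28, 28])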
@kazucmpt @shentanyue These are probably best clarified by referring to the official code. (And a good thing about Caffe is that the network architecture is fully and clearly laid out in a human-friendly text file ;P)
At the very end of their provided network definition files (see the evaluation/prototxt directory), you will see that the networks terminate with an Interp layer that spatially upsamples the bottom blob by a factor of 8:
layer {
  name: "conv6_interp"
  type: "Interp"
  bottom: "conv6"
  top: "conv6_interp"
  interp_param {
    zoom_factor: 8
  }
}
And if you would like more details on the Interp layer, you can check out its source code.
Could an upsample layer replace the Interp layer? My device does not support Interp.
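For what it's worth, Interp with zoom_factor: 8 is just fixed bilinear upsampling, so a standard bilinear upsample should be interchangeable. One caveat: if I remember the Caffe Interp code correctly, zoom_factor maps H to (H - 1) * 8 + 1 (e.g. 90 -> 713, matching the 713x713 crop), i.e. corner-aligned interpolation, so a plain 8x scale factor would be off by a few pixels. A sketch in PyTorch, not the original Caffe code, with hypothetical sizes:

import torch
import torch.nn.functional as F

h, w = 90, 90                       # 1/8 feature size for a 713x713 crop
scores = torch.randn(1, 21, h, w)   # hypothetical class-score map

# Reproduce Interp's zoom_factor: 8 with an explicit target size and
# corner-aligned bilinear interpolation: (90 - 1) * 8 + 1 = 713.
full = F.interpolate(scores, size=((h - 1) * 8 + 1, (w - 1) * 8 + 1),
                     mode="bilinear", align_corners=True)
print(full.shape)  # torch.Size([1, 21, 713, 713])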
Hi, in this architecture we create bins of sizes 1x1x512 -> 1x1x1, 2x2x1, 3x3x1, 6x6x1. 1/8 of the original feature map is 28x28. How do you upsample features from 3x3 to 28x28? I tried many integer values. How should this upsampling be done?
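Bilinear interpolation is not limited to integer zoom factors; you can pass the target size directly. A minimal sketch, assuming PyTorch and the 28x28 target above:

import torch
import torch.nn.functional as F

pooled = torch.randn(1, 512, 3, 3)  # hypothetical 3x3 pyramid branch

# Pass the target size directly; no integer zoom factor is needed
# to go from 3x3 to 28x28.
up = F.interpolate(pooled, size=(28, 28), mode="bilinear", align_corners=True)
print(up.shape)  # torch.Size([1, 512, 28, 28])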