ICNet-tensorflow
image size problems
- Image size in training: The images in my own dataset are small (e.g. h=200, w=80). When I train the model, how should I set INPUT_SIZE? Should it be INPUT_SIZE='720,720', INPUT_SIZE='480,480', or INPUT_SIZE='200, 80'?
- Image size in inference: When I run inference.py with a small input image (e.g. h=212, w=87), the image is padded to 224,96 and then an error comes up, something like "stride must be >0 got 0 for conv5_3_pool6 ...". However, inference on a bigger image (e.g. 360,480) works well. So is there a limit on the input image size? Are arbitrary image sizes not supported?
Hey @ifangcheng,
I think the problem occurs in model.py, lines 468-482:
    (self.feed('conv5_3/relu')
         .avg_pool(h, w, h, w, name='conv5_3_pool1')
         .resize_bilinear(shape, name='conv5_3_pool1_interp'))
    (self.feed('conv5_3/relu')
         .avg_pool(h/2, w/2, h/2, w/2, name='conv5_3_pool2')
         .resize_bilinear(shape, name='conv5_3_pool2_interp'))
    (self.feed('conv5_3/relu')
         .avg_pool(h/3, w/3, h/3, w/3, name='conv5_3_pool3')
         .resize_bilinear(shape, name='conv5_3_pool3_interp'))
    (self.feed('conv5_3/relu')
         .avg_pool(h/4, w/4, h/4, w/4, name='conv5_3_pool6')
         .resize_bilinear(shape, name='conv5_3_pool6_interp'))
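So the default minimum size for input images is:
output stride 32 * pooling divisor 4 = 128 per side
Here h and w appear to be the conv5_3 feature-map size (input size / 32). For your padded 224,96 input that gives w = 96/32 = 3, and w/4 rounds down to 0 under integer division, which is the stride of 0 reported in the error.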
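The arithmetic above can be checked with a small standalone sketch. It is not code from the repository: the names pad_to_multiple, pyramid_pool_sizes and is_supported are hypothetical, and it assumes the input is padded up to a multiple of 32 (as the 212,87 -> 224,96 example suggests) and that h and w are the padded size divided by 32.

```python
OUTPUT_STRIDE = 32            # assumed downsampling factor at conv5_3/relu
POOL_DIVISORS = (1, 2, 3, 4)  # divisors used in the quoted avg_pool calls


def pad_to_multiple(size, multiple=OUTPUT_STRIDE):
    """Round size up to the next multiple (assumed padding rule)."""
    return ((size + multiple - 1) // multiple) * multiple


def pyramid_pool_sizes(height, width):
    """Return the (kernel_h, kernel_w) each pyramid pooling level would get."""
    h = pad_to_multiple(height) // OUTPUT_STRIDE
    w = pad_to_multiple(width) // OUTPUT_STRIDE
    # Integer division mirrors h/2, h/3, h/4 under Python 2 semantics.
    return [(h // d, w // d) for d in POOL_DIVISORS]


def is_supported(height, width):
    """True if no pooling level ends up with a zero kernel/stride."""
    return all(kh > 0 and kw > 0 for kh, kw in pyramid_pool_sizes(height, width))


if __name__ == "__main__":
    for size in [(212, 87), (360, 480), (128, 128)]:
        print(size, pyramid_pool_sizes(*size), is_supported(*size))
```

Under these assumptions the 212,87 image yields a zero width for the last level, while 360,480 and 128,128 do not.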
But you can hard-code these values to support smaller images, for example:
change .avg_pool(h/4, w/4, h/4, w/4, name='conv5_3_pool6')
into .avg_pool(3, 3, 3, 3, name='conv5_3_pool6')
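As a variation on the hard-coded values (my own assumption, not something tested against this repository), the kernel sizes could be clamped so they never drop below 1. The helper name clamped_pool_sizes is hypothetical, and feat_h / feat_w stand for the h and w used in model.py.

```python
def clamped_pool_sizes(feat_h, feat_w, divisors=(1, 2, 3, 4)):
    """Compute per-level pooling sizes, never letting one fall to 0.

    feat_h, feat_w: feature-map height and width at conv5_3/relu
    (assumed to be the padded input size divided by 32).
    """
    return [(max(1, feat_h // d), max(1, feat_w // d)) for d in divisors]


if __name__ == "__main__":
    # Feature map for a 224,96 padded input: 7 x 3.
    print(clamped_pool_sizes(7, 3))
```

In model.py this would roughly correspond to writing max(1, h // 4) in place of h/4. A 1x1 average pool with stride 1 leaves the feature map unchanged, so very small inputs lose the coarser pyramid levels and the results may differ from what the pretrained weights expect.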