ICNet-tensorflow

Image size problems

Open • ifangcheng opened this issue 7 years ago • 1 comment

  1. Image size in training: The images in my own dataset are small (e.g. h=200, w=80). When I train the model, how should I set INPUT_SIZE? Should it be INPUT_SIZE='720,720', INPUT_SIZE='480,480', or INPUT_SIZE='200,80'?

  2. Image size in inference:
    When I run inference.py on a small input image (e.g. h=212, w=87), the image is padded to 224,96 and then an error comes up, something like "stride must be >0 got 0 for conv5_3_pool6 ...". However, inference on a bigger image (e.g. 360,480) works well. So is there a limit on the input image size? Are arbitrary image sizes not supported?

ifangcheng, Feb 28 '18 12:02

Hey @ifangcheng, I think the problem occurs in model.py, lines 468-482:

        (self.feed('conv5_3/relu')
             .avg_pool(h, w, h, w, name='conv5_3_pool1')
             .resize_bilinear(shape, name='conv5_3_pool1_interp'))

        (self.feed('conv5_3/relu')
             .avg_pool(h/2, w/2, h/2, w/2, name='conv5_3_pool2')
             .resize_bilinear(shape, name='conv5_3_pool2_interp'))

        (self.feed('conv5_3/relu')
             .avg_pool(h/3, w/3, h/3, w/3, name='conv5_3_pool3')
             .resize_bilinear(shape, name='conv5_3_pool3_interp'))

        (self.feed('conv5_3/relu')
             .avg_pool(h/4, w/4, h/4, w/4, name='conv5_3_pool6')
             .resize_bilinear(shape, name='conv5_3_pool6_interp'))

So the default minimum size for input images is 128: output stride 32 * largest pooling divisor 4 = 128. With a smaller input, h/4 or w/4 presumably rounds down to 0, which would explain the "stride must be >0" error. You can hard-code these values to support smaller images, for example by changing .avg_pool(h/4, w/4, h/4, w/4, name='conv5_3_pool6') into .avg_pool(3, 3, 3, 3, name='conv5_3_pool6').
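To check in advance whether a given input size would hit this error, the arithmetic above can be sketched in plain Python. This is only a rough sketch under stated assumptions, not code from the repository: the helper names (pad_to_multiple, pool_sizes, is_supported) are hypothetical, the padding is assumed to round each side up to a multiple of 32 (consistent with 212,87 becoming 224,96), the conv5_3/relu feature map is assumed to be exactly 1/32 of the padded size, and the pooling sizes are assumed to use integer division.

```python
import math

OUTPUT_STRIDE = 32            # assumed downsampling factor at conv5_3/relu
POOL_DIVISORS = (1, 2, 3, 4)  # divisors used by conv5_3_pool1/2/3/6


def pad_to_multiple(size, multiple=OUTPUT_STRIDE):
    """Round size up to the next multiple (assumed padding rule)."""
    return int(math.ceil(size / float(multiple))) * multiple


def pool_sizes(height, width):
    """Return {divisor: (pool_h, pool_w)} for the padded input size."""
    feat_h = pad_to_multiple(height) // OUTPUT_STRIDE
    feat_w = pad_to_multiple(width) // OUTPUT_STRIDE
    # Integer division mirrors the assumed behaviour of h/4, w/4 in model.py.
    return {d: (feat_h // d, feat_w // d) for d in POOL_DIVISORS}


def is_supported(height, width):
    """True if no pooling kernel or stride collapses to 0."""
    return all(ph > 0 and pw > 0 for ph, pw in pool_sizes(height, width).values())


if __name__ == "__main__":
    print(pool_sizes(212, 87)[4])   # (1, 0): the width stride becomes 0
    print(is_supported(212, 87))    # False
    print(is_supported(360, 480))   # True
```

Under these assumptions, any side that pads to less than 128 leaves fewer than 4 feature-map cells along that side, so the divisor-4 pooling layer gets a size of 0.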

hellochick, Mar 07 '18 03:03