DBNet.pytorch

Validation images are not resized to a multiple of 32

Open nai-kon opened this issue 5 years ago • 8 comments

I found that validation images are resized by the class ResizeShortSize in data_loader/modules/augment.py. The config sets the short side to 736 px (a multiple of 32), but the long side is not guaranteed to be a multiple of 32. This may shift the detected regions and affect precision. In predict.py, the input image is resized to a multiple of 32, so validation images should be resized to a multiple of 32 as well.
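As a rough illustration of the alignment described above, the sketch below computes a target size whose two sides are both multiples of 32. The helper names `round_to_multiple` and `aligned_size` are hypothetical and are not taken from the repository; the value 736 is the config setting mentioned in the issue. The actual resize could then be done with something like cv2.resize(im, (new_w, new_h)).

```python
def round_to_multiple(value: float, multiple: int = 32) -> int:
    """Round value to the nearest multiple of `multiple` (at least one multiple)."""
    return max(multiple, int(round(value / multiple)) * multiple)


def aligned_size(height: int, width: int, short_size: int = 736, multiple: int = 32):
    """Hypothetical helper: scale so the short side is about short_size,
    then snap BOTH sides to a multiple of `multiple`."""
    scale = short_size / min(height, width)
    new_h = round_to_multiple(height * scale, multiple)
    new_w = round_to_multiple(width * scale, multiple)
    return new_h, new_w


if __name__ == "__main__":
    # A 1280x720 image: plain short-side scaling gives a long side near 1308,
    # which is not a multiple of 32; snapping moves it to 1312.
    print(aligned_size(720, 1280))
```

Because snapping changes the aspect ratio slightly, the horizontal and vertical scale factors differ a little, so ground-truth boxes would need to be scaled per axis.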

nai-kon avatar May 15 '20 02:05 nai-kon

mark

DYJNG avatar May 16 '20 08:05 DYJNG

Use this line: https://github.com/WenmuZhou/DBNet.pytorch/blob/master/data_loader/modules/augment.py#L226

to replace this line: https://github.com/WenmuZhou/DBNet.pytorch/blob/master/data_loader/modules/augment.py#L225. That should do it, but I haven't tested it.

WenmuZhou avatar Jun 07 '20 09:06 WenmuZhou

In my test (ICDAR2015), making the width and height both multiples of 32 reduces the F1 score.

WenmuZhou avatar Jun 10 '20 02:06 WenmuZhou

Thank you for your commit. But it seems that the 32 alignment is only applied when the short side of the image is less than 736 (the config setting). So I suppose some large images in your test data are not aligned to 32?

        if short_edge < self.short_size:
            # Ensure the short edge is >= short_size
            scale = self.short_size / short_edge
            im = cv2.resize(im, dsize=None, fx=scale, fy=scale)
            scale = (scale, scale)
            # im, scale = resize_image(im, self.short_size)

nai-kon avatar Jun 10 '20 10:06 nai-kon

The downsampling path of the FPN applies a stride-2 convolution five times, which reduces the input image to 1/32 (1/2^5) of its size. So if the input size is not divisible by 32, a fractional size appears and the feature maps go out of alignment. This is why most semantic segmentation networks (FPN, FCN, etc.) resize the input to a size divisible by 32.
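The arithmetic behind this can be sketched as follows. The sketch assumes that each stride-2 stage maps a side of length n to ceil(n / 2), which is what a stride-2 convolution with kernel 3 and padding 1 produces; the exact layers in a given backbone may differ. The function names are hypothetical.

```python
import math


def downsample_sizes(size: int, steps: int = 5) -> list:
    """Side length after each of `steps` stride-2 stages (assumed ceil(n / 2))."""
    sizes = [size]
    for _ in range(steps):
        sizes.append(math.ceil(sizes[-1] / 2))
    return sizes


def round_trip(size: int, steps: int = 5) -> int:
    """Side length after downsampling `steps` times and upsampling back by 2**steps."""
    return downsample_sizes(size, steps)[-1] * 2 ** steps


if __name__ == "__main__":
    for side in (1308, 1312):
        print(side, downsample_sizes(side), round_trip(side))
```

Under this assumption, a side of 1308 comes back as 1312 after the round trip, so the feature map no longer lines up with the input, while 1312 (a multiple of 32) comes back unchanged.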

nai-kon avatar Jun 10 '20 10:06 nai-kon

Comment out this line of code and the result is the same: if short_edge < self.short_size:

WenmuZhou avatar Jun 11 '20 00:06 WenmuZhou

Hmm, I don't know what happened. Maybe some other problem surfaced by the 32 alignment reduced the score. I think 32 alignment itself has no bad effect...

nai-kon avatar Jun 11 '20 11:06 nai-kon

Hello author, is there a tutorial for training with multi-point coordinates (not four points)?

PanFei748 avatar Jul 08 '20 06:07 PanFei748