DBNet.pytorch
Validation images are not resized to a multiple of 32
I found that validation images are resized in the ResizeShortSize class in data_loader/modules/augment.py.
The short side is resized to 736 px (a multiple of 32) according to the config setting, but the long side is not guaranteed to be a multiple of 32.
This may shift the detected regions and hurt precision.
In predict.py, the input image is resized to a multiple of 32.
So validation images should also be resized to multiples of 32, for example as in the sketch below.
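For illustration, here is a minimal sketch of one possible fix (the helper name and the rounding strategy are mine, not the repository's code): scale so the short side reaches the target, then round both dimensions to the nearest multiple of 32.

import cv2
import numpy as np

def resize_short_side_align32(im: np.ndarray, short_size: int = 736) -> np.ndarray:
    # Hypothetical helper: scale so the short side is at least `short_size`,
    # then round both dimensions to the nearest multiple of 32.
    h, w = im.shape[:2]
    scale = max(short_size / min(h, w), 1.0)
    new_h = max(32, int(round(h * scale / 32)) * 32)
    new_w = max(32, int(round(w * scale / 32)) * 32)
    return cv2.resize(im, (new_w, new_h))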
mark
Use this line https://github.com/WenmuZhou/DBNet.pytorch/blob/master/data_loader/modules/augment.py#L226
instead of this line https://github.com/WenmuZhou/DBNet.pytorch/blob/master/data_loader/modules/augment.py#L225, but I haven't tested it.
In my test (ICDAR2015), making both width and height multiples of 32 reduced the F1 score.
Thank you for your commit. But it seems that the 32 alignment only takes effect when the short side of the image is less than 736 (the config setting), so I suppose some large images in your test data are not aligned to 32?
if short_edge < self.short_size:
    # ensure the short side is >= short_size
    scale = self.short_size / short_edge
    im = cv2.resize(im, dsize=None, fx=scale, fy=scale)
    scale = (scale, scale)
    # im, scale = resize_image(im, self.short_size)
The down-sampling path of the FPN applies stride-2 convolutions five times, reducing the input size to 1/32 (2^5). So if the input size is not divisible by 32, the feature-map sizes become fractional and the output falls out of alignment. This is why most semantic segmentation networks (FPN, FCN, etc.) resize the input to a size divisible by 32.
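A quick check of the arithmetic (a standalone snippet, not code from this repository; 740 is just an example size):

# Five stride-2 stages divide each spatial dimension by 2**5 = 32.
for size in (736, 740):
    print(size, size / 32)  # 736 / 32 = 23.0 (exact); 740 / 32 = 23.125 (fractional)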
Commenting out this line of code gives the same result:
if short_edge < self.short_size:
Hmmm, I don't know what happened. Maybe some other problem, surfaced by the 32 alignment, reduced the score. I think the 32 alignment itself has no bad effect...
Hello author, is there a tutorial for training with multi-point coordinates (more than four points)?