
Question about input size

Open FengHaobo opened this issue 3 years ago • 1 comment

What is the input size of the network? I found a Resizer class which uses (608, 1024); is the input size also (608, 1024)? But from the code, it seems the input is not a fixed size. @yhenon

```python
import numpy as np
import skimage.transform
import torch


class Resizer(object):
    """Rescale an image and its annotations so the smaller side is
    min_side, then zero-pad each spatial dimension up to a multiple of 32."""

    def __call__(self, sample, min_side=608, max_side=1024):
        image, annots = sample['img'], sample['annot']

        rows, cols, cns = image.shape

        smallest_side = min(rows, cols)

        # rescale the image so the smallest side is min_side
        scale = min_side / smallest_side

        # check if the largest side is now greater than max_side, which can
        # happen when images have a large aspect ratio
        largest_side = max(rows, cols)

        if largest_side * scale > max_side:
            scale = max_side / largest_side

        # resize the image with the computed scale
        image = skimage.transform.resize(
            image, (int(round(rows * scale)), int(round(cols * scale))))
        rows, cols, cns = image.shape

        # pad each spatial dimension up to the next multiple of 32
        # (note: a side already divisible by 32 still gets 32 extra pixels)
        pad_w = 32 - rows % 32
        pad_h = 32 - cols % 32

        new_image = np.zeros((rows + pad_w, cols + pad_h, cns)).astype(np.float32)
        new_image[:rows, :cols, :] = image.astype(np.float32)

        # scale the box coordinates to match the resized image
        annots[:, :4] *= scale

        return {'img': torch.from_numpy(new_image),
                'annot': torch.from_numpy(annots),
                'scale': scale}
```
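For a concrete feel for what this produces, here is a quick toy check (my own example, not from the repo; the image and annotation values are made up). A 480x640 image is scaled by 608/480 ≈ 1.267 to 608x811, then padded up to 640x832:

```python
import numpy as np

# hypothetical input: a 480x640 RGB image with one dummy box (x1, y1, x2, y2, class)
sample = {
    'img': np.random.rand(480, 640, 3),
    'annot': np.array([[10., 20., 100., 200., 0.]]),
}

out = Resizer()(sample)
print(out['img'].shape)  # torch.Size([640, 832, 3])
print(out['scale'])      # 1.2666...
```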

FengHaobo commented Apr 25 '22

I think the input size is not fixed; it only has to be a multiple of 32, and every image in the same batch is padded to the same size.
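If it helps, here is a minimal sketch of that per-batch padding idea (my own illustration, not necessarily the repo's actual collater): every image is zero-padded to the largest height and width in the batch, so all images in one batch share a size while different batches can differ.

```python
import torch

def pad_batch(images):
    """Zero-pad a list of HxWxC image tensors to a common batch size.
    A minimal sketch of per-batch padding, not the repo's collater."""
    max_h = max(img.shape[0] for img in images)
    max_w = max(img.shape[1] for img in images)
    batch = torch.zeros(len(images), max_h, max_w, images[0].shape[2])
    for i, img in enumerate(images):
        batch[i, :img.shape[0], :img.shape[1], :] = img
    return batch
```

Since the Resizer already pads each image to a multiple of 32, the common batch size stays a multiple of 32 as well.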

junhai0428 commented Jun 30 '22