pytorch-retinanet Question about input size

Open FengHaobo opened this issue 3 years ago • 1 comments

What is the input size of the network? I found a Resizer class that uses (608, 1024). Is the input size also (608, 1024)? From the code, though, it seems that the input is not a fixed size. @yhenon

```python
class Resizer(object):
    """Convert ndarrays in sample to Tensors."""

    def __call__(self, sample, min_side=608, max_side=1024):
        image, annots = sample['img'], sample['annot']

        rows, cols, cns = image.shape

        smallest_side = min(rows, cols)

        # rescale the image so the smallest side is min_side
        scale = min_side / smallest_side

        # check if the largest side is now greater than max_side, which can happen
        # when images have a large aspect ratio
        largest_side = max(rows, cols)

        if largest_side * scale > max_side:
            scale = max_side / largest_side

        # resize the image with the computed scale
        image = skimage.transform.resize(image, (int(round(rows*scale)), int(round((cols*scale)))))
        rows, cols, cns = image.shape

        pad_w = 32 - rows%32
        pad_h = 32 - cols%32

        new_image = np.zeros((rows + pad_w, cols + pad_h, cns)).astype(np.float32)
        new_image[:rows, :cols, :] = image.astype(np.float32)

        annots[:, :4] *= scale

        return {'img': torch.from_numpy(new_image), 'annot': torch.from_numpy(annots), 'scale': scale}
```

Apr 25 '22 09:04 FengHaobo

I think the input size is not fixed. Each side must be a multiple of 32, and every image in the same batch is padded to the same size.
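
To make that concrete, here is a rough sketch that repeats the arithmetic from the Resizer code quoted in the question and prints the resulting sizes for two example images. The helper names `resized_shape` and `batch_shape` are made up for illustration and are not part of the repository. The batching rule (pad every image to the largest height and width in the batch) is my assumption about how the padding is done.

```python
def resized_shape(rows, cols, min_side=608, max_side=1024, stride=32):
    """Return (height, width) after the scaling and padding in the quoted Resizer."""
    # Scale so the smallest side becomes min_side ...
    scale = min_side / min(rows, cols)
    # ... unless that would push the largest side past max_side.
    if max(rows, cols) * scale > max_side:
        scale = max_side / max(rows, cols)

    new_rows = int(round(rows * scale))
    new_cols = int(round(cols * scale))

    # Same padding arithmetic as the quoted code: a side that is already a
    # multiple of 32 still receives a full extra 32 pixels.
    pad_rows = stride - new_rows % stride
    pad_cols = stride - new_cols % stride
    return new_rows + pad_rows, new_cols + pad_cols


def batch_shape(shapes):
    """Assumed batching rule: pad every image to the largest height and width."""
    return max(h for h, _ in shapes), max(w for _, w in shapes)


# Two hypothetical input images, given as (rows, cols).
shapes = [resized_shape(480, 640), resized_shape(375, 1242)]
print(shapes)               # [(640, 832), (320, 1056)]
print(batch_shape(shapes))  # (640, 1056)
```

So (608, 1024) are only the bounds used for scaling. The tensor the network actually sees depends on each image's aspect ratio, and on the other images in the batch if they are padded to a common size.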

Jun 30 '22 06:06 junhai0428
