Very strange processing sequence in stage1

Open hcl14 opened this issue 2 years ago • 0 comments

This is the loop in your stage1, which invokes the first model repeatedly:

def stage1(img, imgPad, net, thres):
    row = (imgPad.shape[0] - img.shape[0]) // 2
    col = (imgPad.shape[1] - img.shape[1]) // 2
    winlist = []
    netSize = 24
    curScale = minFace_ / netSize
    img_resized = resize_img(img, curScale)
    while min(img_resized.shape[:2]) >= netSize:
        img_resized = preprocess_img(img_resized)   <---------------- !!!
        # net forward
        net_input = set_input(img_resized)
        with torch.no_grad():
            net.eval()
            cls_prob, rotate, bbox = net(net_input)

        w = netSize * curScale
        for i in range(cls_prob.shape[2]): # cls_prob[2]->height
            for j in range(cls_prob.shape[3]): # cls_prob[3]->width
                if cls_prob[0, 1, i, j].item() > thres:
                    sn = bbox[0, 0, i, j].item()
                    xn = bbox[0, 1, i, j].item()
                    yn = bbox[0, 2, i, j].item()
                    rx = int(j * curScale * stride_ - 0.5 * sn * w + sn * xn * w + 0.5 * w) + col
                    ry = int(i * curScale * stride_ - 0.5 * sn * w + sn * yn * w + 0.5 * w) + row
                    rw = int(w * sn)
                    if legal(rx, ry, imgPad) and legal(rx + rw - 1, ry + rw -1, imgPad):
                        if rotate[0, 1, i, j].item() > 0.5:
                            winlist.append(Window2(rx, ry, rw, rw, 0, curScale, cls_prob[0, 1, i, j].item()))
                        else:
                            winlist.append(Window2(rx, ry, rw, rw, 180, curScale, cls_prob[0, 1, i, j].item()))
        img_resized = resize_img(img_resized, scale_)  <---------------- !!!
        curScale = img.shape[0] / img_resized.shape[0]
    return winlist

I have marked the repeated application of preprocess_img to img_resized. Because img_resized is reassigned inside the loop, the mean is subtracted again on every iteration, so you can check that the image mean shifts towards large negative numbers:

def preprocess_img(img, dim=None):
    if dim:
        img = cv2.resize(img, (dim, dim), interpolation=cv2.INTER_NEAREST)
    return img - np.array([104, 117, 123])
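
To make the drift concrete, here is a minimal sketch that applies the same subtraction several times to a constant test image. It assumes that the resize between iterations roughly preserves the channel means, so the resize is left out; channel_means_over_iterations and MEAN_BGR are hypothetical names used only for this illustration.

```python
import numpy as np

# Same per-channel constants as in preprocess_img above.
MEAN_BGR = np.array([104, 117, 123])

def preprocess_img(img):
    # Same subtraction as the original, without the optional resize branch.
    return img - MEAN_BGR

def channel_means_over_iterations(img, n_iter):
    """Hypothetical helper: record the channel means after each repeated call."""
    means = []
    cur = img.astype(np.float64)
    for _ in range(n_iter):
        cur = preprocess_img(cur)  # mirrors the reassignment inside the while loop
        means.append(cur.mean(axis=(0, 1)))
    return np.array(means)

# Constant gray image, so the expected means are easy to verify by hand.
img = np.full((48, 48, 3), 128, dtype=np.uint8)
drift = channel_means_over_iterations(img, 3)
print(drift)
# iteration 1: [  24.   11.    5.]
# iteration 2: [ -80. -106. -118.]
# iteration 3: [-184. -223. -241.]
```

Each pass lowers every channel by its mean constant again, so the coarser pyramid levels reach the network with progressively more negative inputs.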

But every iteration runs inference through the same network, so it cannot be intended that each scale receives a differently normalized input.

I moved the preprocessing out of the loop, and the results are still OK. Is this a bug?
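
The change I mean can be sketched roughly as follows: subtract the mean once, then only resize inside the loop. This is an illustration under assumptions, not the repository code. resize_nearest is a hypothetical NumPy stand-in for resize_img (assumed to shrink the image by dividing its size by the scale), pyramid_levels is a hypothetical name, and the default scale of 1.414 is an assumed value for scale_.

```python
import numpy as np

MEAN_BGR = np.array([104.0, 117.0, 123.0])

def resize_nearest(img, scale):
    """Hypothetical stand-in for resize_img: shrink by `scale` (> 1), nearest neighbour."""
    h, w = img.shape[:2]
    new_h, new_w = int(h / scale), int(w / scale)
    rows = (np.arange(new_h) * scale).astype(int)
    cols = (np.arange(new_w) * scale).astype(int)
    return img[rows][:, cols]

def pyramid_levels(img, net_size=24, scale=1.414):
    """Return the inputs for each pyramid level, normalized exactly once."""
    levels = []
    cur = img.astype(np.float64) - MEAN_BGR  # preprocessing moved out of the loop
    while min(cur.shape[:2]) >= net_size:
        levels.append(cur)                   # this is what would be fed to the net
        cur = resize_nearest(cur, scale)     # only the size changes between levels
    return levels

img = np.full((96, 96, 3), 128, dtype=np.uint8)
levels = pyramid_levels(img)
for lvl in levels:
    print(lvl.shape[:2], lvl.mean(axis=(0, 1)))
```

With this ordering every level has the same channel means, so the network sees consistently normalized inputs at all scales.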

hcl14, May 12 '22 11:05