Help on preprocessing

Open wlouhichi opened this issue 5 years ago • 0 comments

Dear Cedric, thank you so much for this repo; it has been very helpful. I am using a ResNet18 model for binary classification in my Android app, and I changed the Java code to capture a still image instead of a preview frame. I am stuck at preprocessing the images. In my original PyTorch code I had:

from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

In the C++ code I see the means b_mean = 104.00, g_mean = 116.66 and r_mean = 122.67. I'm not sure what these means are or what processing is happening in the snippet below. I need help preprocessing the images; for now my output doesn't make sense at all. Thank you so much for your help.

            float b_mean = 104.00698793f;
            float g_mean = 116.66876762f;
            float r_mean = 122.67891434f;

            auto b_i = 0 * IMG_H * IMG_W + j * IMG_W + i;
            auto g_i = 1 * IMG_H * IMG_W + j * IMG_W + i;
            auto r_i = 2 * IMG_H * IMG_W + j * IMG_W + i;

            if (infer_HWC) {
                b_i = (j * IMG_W + i) * IMG_C;
                g_i = (j * IMG_W + i) * IMG_C + 1;
                r_i = (j * IMG_W + i) * IMG_C + 2;
            }

            // R = Y + 1.402 (V-128)
            // G = Y - 0.34414 (U-128) - 0.71414 (V-128)
            // B = Y + 1.772 (U-V)
            input_data[r_i] = -r_mean + (float) ((float) min(255., max(0., (float) (y + 1.402 * (v - 128)))));
            input_data[g_i] = -g_mean + (float) ((float) min(255., max(0., (float) (y - 0.34414 * (u - 128) - 0.71414 * (v - 128)))));
            input_data[b_i] = -b_mean + (float) ((float) min(255., max(0., (float) (y + 1.772 * (u - v)))));

wlouhichi, Jan 28 '19 16:01