pytorch-android
Help on preprocessing
Dear Cedric, thank you so much for this repo; it has been very helpful. I am using a ResNet18 model for binary classification in my Android app, and I also changed the Java code to capture a still image instead of a preview. I am stuck at preprocessing the images. In my original PyTorch code I had:
transforms.Resize(256)
transforms.CenterCrop(224)
transforms.ToTensor()
transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
I see in the cpp code some means: b_mean = 104.00, g_mean = 116.66 and r_mean = 122.67. I'm not sure what these means are or what processing is happening in the snippet below. I need help preprocessing the images; for now my output doesn't make sense at all. Thank you so much for your help.
float b_mean = 104.00698793f;
float g_mean = 116.66876762f;
float r_mean = 122.67891434f;
auto b_i = 0 * IMG_H * IMG_W + j * IMG_W + i;
auto g_i = 1 * IMG_H * IMG_W + j * IMG_W + i;
auto r_i = 2 * IMG_H * IMG_W + j * IMG_W + i;
if (infer_HWC) {
b_i = (j * IMG_W + i) * IMG_C;
g_i = (j * IMG_W + i) * IMG_C + 1;
r_i = (j * IMG_W + i) * IMG_C + 2;
}
//R = Y + 1.402 (V-128)
//G = Y - 0.34414 (U-128) - 0.71414 (V-128)
// B = Y + 1.772 (U-V)
input_data[r_i] = -r_mean + (float) ((float) min(255., max(0., (float) (y + 1.402 * (v - 128)))));
input_data[g_i] = -g_mean + (float) ((float) min(255., max(0., (float) (y - 0.34414 * (u - 128) - 0.71414 * (v - 128)))));
input_data[b_i] = -b_mean + (float) ((float) min(255., max(0., (float) (y + 1.772 * (u - v)))));