Replicate PyTorch transforms like transforms.ToTensor() and transforms.Normalize() in OpenCV C++
Issue description
In my development code I apply PyTorch transforms (torchvision.transforms) to my test dataset. I have exported the model to ONNX and want to replicate these transforms in OpenCV C++. However, when I make predictions with PyTorch (using the exported ONNX model), I get different results than when making predictions with OpenCV C++.
PyTorch transforms
test_transforms = transforms.Compose([
    transforms.Resize(pretrained_size),
    transforms.CenterCrop(pretrained_size),
    transforms.ToTensor(),
    transforms.Normalize(mean=pretrained_means,
                         std=pretrained_stds)
])
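For reference, here is a minimal C++ sketch of how the same pipeline could be expressed with fixed training statistics (the mean/std values below are common ImageNet placeholders and would have to be replaced with the actual pretrained_means / pretrained_stds; the square resize is only an approximation of Resize followed by CenterCrop):

// Sketch only: approximate Resize + CenterCrop + ToTensor + Normalize in OpenCV C++.
#include <opencv2/opencv.hpp>
#include <opencv2/dnn.hpp>

cv::Mat preprocess(const cv::Mat& bgr, int size /* pretrained_size */)
{
    cv::Mat rgb, resized, f32;
    cv::cvtColor(bgr, rgb, cv::COLOR_BGR2RGB);        // PIL loads images as RGB
    cv::resize(rgb, resized, cv::Size(size, size));   // approximation of Resize(size) + CenterCrop(size)
    resized.convertTo(f32, CV_32FC3, 1.0 / 255.0);    // ToTensor(): scale to [0, 1]

    // Normalize(): (x - mean) / std per channel, with fixed training statistics
    cv::Scalar mean(0.485, 0.456, 0.406);             // placeholder for pretrained_means
    cv::Scalar stdev(0.229, 0.224, 0.225);            // placeholder for pretrained_stds
    cv::subtract(f32, mean, f32);
    cv::divide(f32, stdev, f32);

    // HWC float image -> NCHW blob; no extra scaling or mean subtraction here
    return cv::dnn::blobFromImage(f32);
}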
- Output of Inference with PyTorch:
pytorch image type: torch.float32
detected class: cross-over
confidence: 0.8553193807601929
output probabilities of softmax layer: [[2.9558592e-05 8.5531938e-01 6.2924426e-04 1.1440608e-02 5.4786936e-04
4.9833752e-02 2.7969838e-04 8.1919849e-02]]
My Inference with OpenCV C++
py::dict VClassdict;
// Convert input Image from Python Code to Numpy Array
Mat frame = nparray_to_mat(img);
// Color Conversion to RGB
Mat Img;
//cvtColor(frame, Img, COLOR_BGR2RGB);
// Resize Image to ResNet Input size
cv::resize(frame, Img, Size(ClassinpWidth, ClassinpWidth));
Mat normImg;
Img.convertTo(normImg, CV_32FC3, 1.f / 255);
Scalar mean;
Scalar std;
cv::meanStdDev(normImg, mean, std);
mean[0] *= 255.0;
mean[1] *= 255.0;
mean[2] *= 255.0;
double scale_factor = 0.003921569; // equivalent to 1/255
// Scalar mean = Scalar(117.8865, 128.52, 137.0115); // each channel mean is multiplied by 255.0
//Scalar std = Scalar(0.2275, 0.2110, 0.2140);
bool swapRB = true;
bool crop = false;
Mat blob;
cv::dnn::blobFromImage(Img, blob, scale_factor, Size(ClassinpWidth, ClassinpWidth), mean, swapRB, crop);
if (std.val[0] != 0.0 && std.val[1] != 0.0 && std.val[2] != 0.0)
{
    // Divide the blob by the per-channel std.
    divide(blob, std, blob);
}
VClassResNet_Net.setInput(blob);
// predict
Mat prob = VClassResNet_Net.forward();
cout << "output probabilities of softmax layer: " << prob;
cout << endl;
// extract prediction with highest confidence
Point classIdPoint;
double confidence;
minMaxLoc(prob.reshape(1, 1), 0, &confidence, 0, &classIdPoint);
int classId = classIdPoint.x;
// Setup Dict returned to Python code for detections
VClassdict["ObjectClass"] = VClassResNet_classes[classId].c_str(); // Detected Class
VClassdict["Confidence"] = confidence; // Confedence Level
- Output of OpenCV C++ Inference:
detected class: cross-over
confidence: 0.9028045535087585
output probabilities of softmax layer: [2.5416075e-05, 0.90280455, 0.0031091773, 0.0042484328, 0.00012638989, 0.05069441, 9.7391217e-05, 0.038894258]
I've followed the OpenCV documentation: https://docs.opencv.org/4.x/dd/d55/pytorch_cls_c_tutorial_dnn_conversion.html
System Info
- PyTorch version: 1.10.2+cu113
- OpenCV version: 4.5.1
- Python version: 3.9.7
- OS: Windows
I have run into the same issue. It makes validating my OpenCV-based inference code a real hassle.
We are not the first people running into this. It seems like the resizing algorithm in blobFromImage is slightly different from the one used in PyTorch and others. 😫
https://answers.opencv.org/question/231345/cv2resize-is-different-from-scikit-image-resize/
It really sucks.
I have a similar issue. Besides, I found that when I run my model in opencv-python the predicted results are correct. However, when it is run in C++, the result is incorrect.
Found some more info here: https://github.com/python-pillow/Pillow/issues/2718#issuecomment-333669547
Just for fun I tried to use resize before blobFromImage with cubic interpolation. With cubic I get very close to my PyTorch inference scores; OpenCV is off by 0.005-0.01 instead of 0.02-0.04 on the max label.
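For context, a rough sketch of that workaround (inputSize and mean are placeholders; blobFromImage is given the same target size so it does not resize again):

#include <opencv2/opencv.hpp>
#include <opencv2/dnn.hpp>

// Sketch: resize explicitly with INTER_CUBIC so that blobFromImage's default
// bilinear resize is bypassed.
cv::Mat makeBlobCubic(const cv::Mat& img, int inputSize, const cv::Scalar& mean)
{
    cv::Mat resized, blob;
    cv::resize(img, resized, cv::Size(inputSize, inputSize), 0, 0, cv::INTER_CUBIC);
    cv::dnn::blobFromImage(resized, blob, 1.0 / 255.0, cv::Size(inputSize, inputSize),
                           mean, /*swapRB=*/true, /*crop=*/false);
    return blob;
}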
Just to make it clear: I only want to match this 100% to confirm that my production inference code is correct. A 2-5% confidence error by itself is not the issue.
Do you have a conclusion on this? I found that a model trained with PIL and evaluated with OpenCV has low accuracy. It seems that the performance drop comes from the interpolation. Do you know how to make OpenCV's behavior align with PIL?
I started testing on a different model of mine and it all fell apart again. This model is not normalized, so it should be even easier to get the same result. The accuracy was way off, with a ton of mispredictions. This model is much tighter and the images have almost no color.
After spending way too many hours on this problem I figured out that the model was trained with anti-aliasing on (the default for the PyTorch transform). OpenCV does not use anti-aliasing by default. Something close can be achieved with INTER_AREA.
Example:
PyTorch after resize transform: [image]
OpenCV output after cv::resize: [image]
OpenCV output with interpolation INTER_AREA: [image]
It's not perfect; there are still >0.05 prediction differences on roughly 2-3% of images. Some are as high as 0.2!
Example PyTorch default resize: [image]
OpenCV INTER_AREA: [image]
So the differences are quite large even visually.
The file size for the PyTorch PNG is 84 KB while the OpenCV one is 101 KB! So the PyTorch image is way softer.
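As a side note, rather than comparing PNG file sizes, the two resize outputs can be compared directly; a small sketch (the file paths are placeholders for the two dumped images):

#include <iostream>
#include <opencv2/opencv.hpp>

int main()
{
    // Sketch: mean absolute per-pixel difference between a PyTorch/PIL resize
    // and an OpenCV resize, both saved to disk beforehand (paths are placeholders).
    cv::Mat a = cv::imread("pytorch_resized.png");
    cv::Mat b = cv::imread("opencv_resized.png");
    cv::Mat diff;
    cv::absdiff(a, b, diff);
    std::cout << "mean abs diff per channel: " << cv::mean(diff) << std::endl;
    return 0;
}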