
Replicate PyTorch transforms like transforms.ToTensor() and transforms.Normalize() in OpenCV C++

Open Ahmed-Fayed opened this issue 3 years ago • 7 comments

Issue description

In my development code I apply torchvision.transforms to my test dataset. I have exported the model to ONNX and want to replicate these transforms in OpenCV C++. However, when I make predictions with PyTorch (using the exported ONNX model), I get different results than when making predictions with OpenCV C++.

PyTorch transforms


test_transforms = transforms.Compose([
                           transforms.Resize(pretrained_size),
                           transforms.CenterCrop(pretrained_size),
                           transforms.ToTensor(),
                           transforms.Normalize(mean = pretrained_means, 
                                                std = pretrained_stds)
                       ])

  • Output of Inference with PyTorch:
pytorch image type: torch.float32
detected class: cross-over
confidence: 0.8553193807601929
output probabilities of softmax layer: [[2.9558592e-05 8.5531938e-01 6.2924426e-04 1.1440608e-02 5.4786936e-04
  4.9833752e-02 2.7969838e-04 8.1919849e-02]]

My inference with OpenCV C++

    py::dict VClassdict;

    // Convert input Image from Python Code to Numpy Array
    Mat frame = nparray_to_mat(img);

    // Color Conversion to RGB
    Mat Img;
    //cvtColor(frame, Img, COLOR_BGR2RGB);

    // Resize Image to ResNet Input size
    cv::resize(frame, Img, Size(ClassinpWidth, ClassinpWidth));

    Mat normImg;
    Img.convertTo(normImg, CV_32FC3, 1.f / 255);
    Scalar mean;
    Scalar std;
    cv::meanStdDev(normImg, mean, std);

    mean[0] *= 255.0;
    mean[1] *= 255.0;
    mean[2] *= 255.0;

    double scale_factor = 0.003921569;  // equivalent to 1/255
   // Scalar mean = Scalar(117.8865, 128.52, 137.0115); // each channel mean is multiplied by 255.0
    //Scalar std = Scalar(0.2275, 0.2110, 0.2140);
    bool swapRB = true;
    bool crop = false;

    Mat blob;
    cv::dnn::blobFromImage(Img, blob, scale_factor, Size(ClassinpWidth, ClassinpWidth), mean, swapRB, crop);

    if (std.val[0] != 0.0 && std.val[1] != 0.0 && std.val[2] != 0.0)
    {
        // Divide blob by std.
        divide(blob, std, blob);
    }

    VClassResNet_Net.setInput(blob);

    // predict
    Mat prob = VClassResNet_Net.forward();

    cout << "output probabilities of softmax layer: " << prob;
    cout << endl;

    // extract prediction with highest confidence
    Point classIdPoint;
    double confidence;
    minMaxLoc(prob.reshape(1, 1), 0, &confidence, 0, &classIdPoint);
    int classId = classIdPoint.x;

    // Setup Dict returned to Python code for detections
    VClassdict["ObjectClass"] = VClassResNet_classes[classId].c_str(); // Detected Class
    VClassdict["Confidence"] = confidence; // Confedence Level
  • Output of OpenCV c++ Inference:
detected class: cross-over
confidence: 0.9028045535087585
output probabilities of softmax layer: [2.5416075e-05, 0.90280455, 0.0031091773, 0.0042484328, 0.00012638989, 0.05069441, 9.7391217e-05, 0.038894258]

I've followed the OpenCV documentation: https://docs.opencv.org/4.x/dd/d55/pytorch_cls_c_tutorial_dnn_conversion.html
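For reference, here is a minimal sketch (not my production code) of how I understand Resize + CenterCrop + ToTensor + Normalize map onto plain OpenCV calls, with blobFromImage used only for the HWC-to-NCHW packing at the end. The function name and the ImageNet mean/std values are placeholders; substitute pretrained_size, pretrained_means and pretrained_stds. Note that Normalize uses fixed dataset statistics rather than per-image values from meanStdDev, and that torchvision's Resize(int) scales the shorter side before CenterCrop cuts the square patch; PIL's resize is also antialiased, so even this will not be bit-exact against the PyTorch pipeline.

    #include <algorithm>
    #include <opencv2/opencv.hpp>
    #include <opencv2/dnn.hpp>
    using namespace cv;

    // Sketch only: replicate Resize(s) + CenterCrop(s) + ToTensor() + Normalize()
    // with explicit OpenCV operations (assumes pretrained_size is a single int s).
    Mat preprocess(const Mat& bgr, int s)
    {
        // PIL loads images as RGB, OpenCV as BGR.
        Mat rgb;
        cvtColor(bgr, rgb, COLOR_BGR2RGB);

        // transforms.Resize(s) with an int scales the *shorter* side to s
        // (bilinear by default); CenterCrop(s) then takes the central s x s patch.
        double scale = double(s) / std::min(rgb.cols, rgb.rows);
        Mat resized;
        resize(rgb, resized, Size(), scale, scale, INTER_LINEAR);
        Rect roi((resized.cols - s) / 2, (resized.rows - s) / 2, s, s);
        Mat img = resized(roi);

        // ToTensor(): uint8 [0,255] -> float32 [0,1]
        Mat f;
        img.convertTo(f, CV_32FC3, 1.0 / 255.0);

        // Normalize(): per-channel (x - mean) / std with *fixed* dataset statistics.
        // The ImageNet values below are placeholders for pretrained_means/stds.
        f -= Scalar(0.485, 0.456, 0.406);
        divide(f, Scalar(0.229, 0.224, 0.225), f);

        // With default arguments blobFromImage only repacks HWC -> NCHW here.
        return dnn::blobFromImage(f);
    }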

System Info

  • PyTorch version: 1.10.2+cu113
  • OpenCV version: 4.5.1
  • Python version: 3.9.7
  • OS: Windows

Ahmed-Fayed avatar Jun 20 '22 13:06 Ahmed-Fayed

I have run into the same issue. It makes validating my OpenCV-based inference code a real hassle.

We are not the first people running into this. It seems like the resizing algorithm in blobFromImage is slightly different from the one used in PyTorch and others. 😫

https://answers.opencv.org/question/231345/cv2resize-is-different-from-scikit-image-resize/

It really sucks.

thoj avatar Jun 26 '22 20:06 thoj

I have a similar issue. Besides, I found that when I run my model with opencv-python the predicted results are correct; however, when it is run in C++, the results are incorrect.

pitaohc avatar Jul 06 '22 02:07 pitaohc

Found some more info here: https://github.com/python-pillow/Pillow/issues/2718#issuecomment-333669547

thoj avatar Jul 07 '22 21:07 thoj

Just for fun I tried calling resize with cubic interpolation before blobFromImage. With cubic I get very close to my PyTorch inference scores: OpenCV is off by 0.005-0.01 instead of 0.02-0.04 on the top label.

Just to make it clear: I only want to match this 100% to confirm that my production inference code is correct. A 2-5% confidence error by itself is not the issue.
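Roughly what I mean (netWidth/netHeight and the helper name are placeholders, and the mean/std handling is left out): do the resize yourself with the interpolation you want, then hand blobFromImage an image that is already at the network size, so its internal INTER_LINEAR resize never runs.

    #include <opencv2/opencv.hpp>
    #include <opencv2/dnn.hpp>
    using namespace cv;

    // Sketch only: explicit cubic resize, then let blobFromImage do the scaling,
    // channel swap and NCHW packing. blobFromImage only resizes when the target
    // size differs from the input size, so no second (bilinear) resample happens.
    Mat makeBlobCubic(const Mat& frame, int netWidth, int netHeight)
    {
        Mat resized;
        resize(frame, resized, Size(netWidth, netHeight), 0, 0, INTER_CUBIC);

        return dnn::blobFromImage(resized, 1.0 / 255.0, Size(netWidth, netHeight),
                                  Scalar(), /*swapRB=*/true, /*crop=*/false);
    }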

thoj avatar Jul 07 '22 22:07 thoj

Do you have a conclusion on this? I found that a model trained with PIL and evaluated with OpenCV has low accuracy. It seems that the performance drop is caused by the interpolation. Do you know how to make OpenCV's behaviour align with PIL's?

CoinCheung avatar Jul 23 '22 14:07 CoinCheung

I started testing on a different model of mine and it all fell apart again. This model is not normalized, so it should be even easier to get the same result. The accuracy was way off, with a ton of mispredictions. This model is much tighter and the images have almost no color.

After spending way too many hours on this problem I figured out that the model was trained with anti-aliasing on (the default for the PyTorch resize transform). OpenCV does not use anti-aliasing by default. Something close can be achieved with INTER_AREA.

Example images attached in the original comment:
  • PyTorch output after the resize transform
  • OpenCV output after resize with the default interpolation
  • OpenCV output with INTER_AREA interpolation

It's not perfect: there are still >0.05 prediction differences on roughly 2-3% of images, some as high as 0.2!
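For completeness, the INTER_AREA variant looks the same as the cubic sketch above except for the interpolation flag (again placeholder names; INTER_AREA box-averages the source pixels when shrinking, which only approximates PIL's antialiased filter, hence the remaining differences):

    #include <opencv2/opencv.hpp>
    #include <opencv2/dnn.hpp>
    using namespace cv;

    // Sketch only: INTER_AREA downscale as a rough stand-in for an antialiased
    // resize, then blobFromImage for scaling, channel swap and NCHW packing.
    Mat makeBlobArea(const Mat& frame, int netSize)
    {
        Mat resized;
        resize(frame, resized, Size(netSize, netSize), 0, 0, INTER_AREA);

        // Image is already at the target size, so blobFromImage will not resample it again.
        return dnn::blobFromImage(resized, 1.0 / 255.0, Size(netSize, netSize),
                                  Scalar(), /*swapRB=*/true, /*crop=*/false);
    }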

thoj avatar Aug 03 '22 09:08 thoj

More example images attached in the original comment:
  • PyTorch default resize
  • OpenCV INTER_AREA

So the differences are quite large even visually.

The file size of the PyTorch PNG is 84 kB while the OpenCV one is 101 kB, so the PyTorch image is noticeably softer.

thoj avatar Aug 03 '22 09:08 thoj