Replicate PyTorch transforms like transforms.ToTensor() and transforms.Normalize() in OpenCV C++
Issue description
In my development code I apply PyTorch transforms (torchvision.transforms) to my test dataset. I have exported the model to ONNX and want to replicate these transforms in OpenCV C++. However, when I make predictions with PyTorch (using the exported ONNX model), I get different results than when making predictions with OpenCV C++.
PyTorch transforms
test_transforms = transforms.Compose([
    transforms.Resize(pretrained_size),
    transforms.CenterCrop(pretrained_size),
    transforms.ToTensor(),
    transforms.Normalize(mean=pretrained_means,
                         std=pretrained_stds)
])
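For reference, here is a minimal C++ sketch of how the same pipeline could be expressed with fixed training statistics (the mean/std values below are common ImageNet placeholders and would have to be replaced with the actual pretrained_means / pretrained_stds; the square resize is only an approximation of Resize followed by CenterCrop):

// Sketch only: approximate Resize + CenterCrop + ToTensor + Normalize in OpenCV C++.
#include <opencv2/opencv.hpp>
#include <opencv2/dnn.hpp>

cv::Mat preprocess(const cv::Mat& bgr, int size /* pretrained_size */)
{
    cv::Mat rgb, resized, f32;
    cv::cvtColor(bgr, rgb, cv::COLOR_BGR2RGB);        // PIL loads images as RGB
    cv::resize(rgb, resized, cv::Size(size, size));   // approximation of Resize(size) + CenterCrop(size)
    resized.convertTo(f32, CV_32FC3, 1.0 / 255.0);    // ToTensor(): scale to [0, 1]

    // Normalize(): (x - mean) / std per channel, with fixed training statistics
    cv::Scalar mean(0.485, 0.456, 0.406);             // placeholder for pretrained_means
    cv::Scalar stdev(0.229, 0.224, 0.225);            // placeholder for pretrained_stds
    cv::subtract(f32, mean, f32);
    cv::divide(f32, stdev, f32);

    // HWC float image -> NCHW blob; no extra scaling or mean subtraction here
    return cv::dnn::blobFromImage(f32);
}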
- Output of Inference with PyTorch:
pytorch image type: torch.float32
detected class: cross-over
confidence: 0.8553193807601929
output probabilities of softmax layer: [[2.9558592e-05 8.5531938e-01 6.2924426e-04 1.1440608e-02 5.4786936e-04
4.9833752e-02 2.7969838e-04 8.1919849e-02]]
My Inference with OpenCV C++
py::dict VClassdict;
// Convert input Image from Python Code to Numpy Array
Mat frame = nparray_to_mat(img);
// Color Conversion to RGB
Mat Img;
//cvtColor(frame, Img, COLOR_BGR2RGB);
// Resize Image to ResNet Input size
cv::resize(frame, Img, Size(ClassinpWidth, ClassinpWidth));
Mat normImg;
Img.convertTo(normImg, CV_32FC3, 1.f / 255);
Scalar mean;
Scalar std;
cv::meanStdDev(normImg, mean, std);
mean[0] *= 255.0;
mean[1] *= 255.0;
mean[2] *= 255.0;
double scale_factor = 0.003921569; // equivalent to 1/255
// Scalar mean = Scalar(117.8865, 128.52, 137.0115); // each channel mean is multiplied by 255.0
//Scalar std = Scalar(0.2275, 0.2110, 0.2140);
bool swapRB = true;
bool crop = false;
Mat blob;
cv::dnn::blobFromImage(Img, blob, scale_factor, Size(ClassinpWidth, ClassinpWidth), mean, swapRB, crop);
if (std.val[0] != 0.0 && std.val[1] != 0.0 && std.val[2] != 0.0)
{
    // Divide the blob by the per-channel std.
    divide(blob, std, blob);
}
VClassResNet_Net.setInput(blob);
// predict
Mat prob = VClassResNet_Net.forward();
cout << "output probabilities of softmax layer: " << prob;
cout << endl;
// extract prediction with highest confidence
Point classIdPoint;
double confidence;
minMaxLoc(prob.reshape(1, 1), 0, &confidence, 0, &classIdPoint);
int classId = classIdPoint.x;
// Setup Dict returned to Python code for detections
VClassdict["ObjectClass"] = VClassResNet_classes[classId].c_str(); // Detected Class
VClassdict["Confidence"] = confidence; // Confedence Level
- Output of OpenCV C++ Inference:
detected class: cross-over
confidence: 0.9028045535087585
output probabilities of softmax layer: [2.5416075e-05, 0.90280455, 0.0031091773, 0.0042484328, 0.00012638989, 0.05069441, 9.7391217e-05, 0.038894258]
I've followed the OpenCV documentation: https://docs.opencv.org/4.x/dd/d55/pytorch_cls_c_tutorial_dnn_conversion.html
System Info
- PyTorch version: 1.10.2+cu113
- OpenCV version: 4.5.1
- Python version: 3.9.7
- OS: Windows
I have run into the same issue. It makes validating my OpenCV-based inference code a real hassle.
We are not the first people running into this. It seems like the resizing algorithm in blobFromImage is slightly different from the one used in PyTorch and others. 😫
https://answers.opencv.org/question/231345/cv2resize-is-different-from-scikit-image-resize/
It really sucks.
I have a similar issue. Besides, I found that when I run my model in opencv-python the predicted results are correct. However, when it is run in C++, the result is incorrect.
Found some more info here: https://github.com/python-pillow/Pillow/issues/2718#issuecomment-333669547
Just for fun I tried to use resize before blobFromImage with cubic interpolation. With cubic I get very close to my PyTorch inference scores; OpenCV is off by 0.005-0.01 instead of 0.02-0.04 on the max label.
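For context, a rough sketch of that workaround (inputSize and mean are placeholders; blobFromImage is given the same target size so it does not resize again):

#include <opencv2/opencv.hpp>
#include <opencv2/dnn.hpp>

// Sketch: resize explicitly with INTER_CUBIC so that blobFromImage's default
// bilinear resize is bypassed.
cv::Mat makeBlobCubic(const cv::Mat& img, int inputSize, const cv::Scalar& mean)
{
    cv::Mat resized, blob;
    cv::resize(img, resized, cv::Size(inputSize, inputSize), 0, 0, cv::INTER_CUBIC);
    cv::dnn::blobFromImage(resized, blob, 1.0 / 255.0, cv::Size(inputSize, inputSize),
                           mean, /*swapRB=*/true, /*crop=*/false);
    return blob;
}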
Just to make it clear: I only want to match this 100% to confirm that my production inference code is correct. A 2-5% confidence error by itself is not the issue.
Do you have a conclusion on this? I found that a model trained with PIL and evaluated with OpenCV has low accuracy. It seems that the performance drop comes from the interpolation. Do you know how to make OpenCV's behavior align with PIL?
I started testing on a different model of mine and it all fell apart again. This model is not normalized, so it should be even easier to get the same result. The accuracy was way off, with a ton of mispredictions. This model is much tighter and the images have almost no color.
After spending way too many hours on this problem I figured out that the model was trained with anti-aliasing on (the default for the PyTorch transform). OpenCV does not use anti-aliasing by default. Something close can be achieved with INTER_AREA.
Example:
PyTorch after resize transform: [image]
OpenCV output after cv::resize: [image]
OpenCV output with interpolation INTER_AREA: [image]
It's not perfect; there are still >0.05 prediction differences on roughly 2-3% of images. Some are as high as 0.2!
Example PyTorch default resize: [image]
OpenCV INTER_AREA: [image]
So the differences are quite large even visually.
The file size for the PyTorch PNG is 84 KB while the OpenCV one is 101 KB! So the PyTorch image is way softer.
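As a side note, rather than comparing PNG file sizes, the two resize outputs can be compared directly; a small sketch (the file paths are placeholders for the two dumped images):

#include <iostream>
#include <opencv2/opencv.hpp>

int main()
{
    // Sketch: mean absolute per-pixel difference between a PyTorch/PIL resize
    // and an OpenCV resize, both saved to disk beforehand (paths are placeholders).
    cv::Mat a = cv::imread("pytorch_resized.png");
    cv::Mat b = cv::imread("opencv_resized.png");
    cv::Mat diff;
    cv::absdiff(a, b, diff);
    std::cout << "mean abs diff per channel: " << cv::mean(diff) << std::endl;
    return 0;
}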