Model Output Disparity
Model Output Disparity for MNN model converted from ONNX
I converted mosaic.onnx to mosaic.mnn using the following command:
mnnconvert -f ONNX --modelFile mosaic.onnx --MNNModel mosaic.mnn
Test script:
import MNN

# Run the MNN model on an all-ones input
mnn_model = 'mosaic.mnn'
net = MNN.nn.load_module_from_file(mnn_model, [], [])
input = MNN.numpy.ones([1, 3, 224, 224], dtype=MNN.numpy.float32)
input = MNN.expr.convert(input, MNN.expr.NC4HW4)
output = net.forward(input)
output = MNN.expr.convert(output, MNN.expr.NCHW)
print("######################## MNN Output##############################")
print(output)
print('#################################################################')
print("")
import onnxruntime as ort
import numpy as np

# Run the original ONNX model on the same all-ones input
onnx_model = 'mosaic.onnx'
model = ort.InferenceSession(onnx_model)
output = model.run(None, {'input1': np.ones([1, 3, 224, 224], dtype=np.float32)})
print("######################## ONNX Output##############################")
print(output)
print('#################################################################')
The outputs of the two models differ when the input is all ones.
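To quantify the mismatch instead of eyeballing the printed arrays, something like the sketch below can be used. It assumes pymnn's Var exposes read() to return a NumPy array; if your build differs, dump both outputs and compare them offline instead.
import MNN
import numpy as np
import onnxruntime as ort

# MNN side (same calls as the script above)
net = MNN.nn.load_module_from_file('mosaic.mnn', [], [])
mnn_in = MNN.numpy.ones([1, 3, 224, 224], dtype=MNN.numpy.float32)
mnn_in = MNN.expr.convert(mnn_in, MNN.expr.NC4HW4)
mnn_out = MNN.expr.convert(net.forward(mnn_in), MNN.expr.NCHW).read()  # assumes Var.read()

# ONNX side
sess = ort.InferenceSession('mosaic.onnx')
onnx_out = sess.run(None, {'input1': np.ones([1, 3, 224, 224], dtype=np.float32)})[0]

diff = np.abs(np.asarray(mnn_out) - onnx_out)
print("max abs diff:", diff.max(), "mean abs diff:", diff.mean())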
Test setup: MNN version 2.8.1, onnxruntime version 1.17.0, x86_64 Linux machine, Python 3.10.12.
Could someone help me understand why these outputs differ? I am trying to move from ONNX to MNN for the performance as well as binary size benefits.
Attaching both models and the above script as a zip for reference: mosaic.zip
Please test with testMNNFromOnnx.py first.
I have run testMNNFromOnnx.py. The test passes with random input, np.random.uniform(0, 12, shapes). When I edit the script to use np.zeros(shapes) as input, the test fails.
On a side note, this mosaic model is a standard model taken from https://github.com/onnx/models/tree/main/validated/vision/style_transfer/fast_neural_style
Hey @jxt1234, is someone looking into this? I would really appreciate the help :)
A TEST_SUCCESS from testMNNFromOnnx.py means the MNN result is correct. You can convert the model with --keepInputFormat, and then you don't need to call MNN.expr.convert(input, MNN.expr.NC4HW4).
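For reference, here is a sketch of that suggestion as I understand it: reconvert with --keepInputFormat so the module keeps the original NCHW layout, then feed the tensor directly. The output file name mosaic_keepfmt.mnn is only illustrative.
# Reconvert keeping the original (NCHW) input layout:
#   mnnconvert -f ONNX --modelFile mosaic.onnx --MNNModel mosaic_keepfmt.mnn --keepInputFormat
import MNN

net = MNN.nn.load_module_from_file('mosaic_keepfmt.mnn', [], [])
input = MNN.numpy.ones([1, 3, 224, 224], dtype=MNN.numpy.float32)  # no NC4HW4 conversion
output = MNN.expr.convert(net.forward(input), MNN.expr.NCHW)
print(output)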
I feel we are misunderstanding each other. What exactly do you mean by "testMNNFromOnnx.py TEST_SUCCESS means MNN is right"?
I am giving you an example, mosaic.onnx and mosaic.mnn, where the outputs do not match for a particular input.
The outputs still do not match even after incorporating your suggestion to convert with --keepInputFormat and drop the MNN.expr.convert(input, MNN.expr.NC4HW4) call.
Steps to reproduce the issue:
- testMNNFromOnnx.py feeds random input to the model; with that input the test passes and reports TEST_SUCCESS.
- When the script is edited to take np.zeros(shapes) as input, the test fails.
@jxt1234, could you try to reproduce the issue using the two steps above? Attaching the edited testMNNFromOnnx.py for easy reproducibility; the change is marked with a # EDITED HERE comment on line 153, where the script is edited to take zeros as input (sketched below).
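For clarity, the edit amounts to the following. The variable name and the hard-coded shape are illustrative only; the real script derives the shape from the ONNX graph, and only the right-hand side of the assignment changed in my copy.
import numpy as np

shapes = [1, 3, 224, 224]                         # illustrative; the script computes this itself
# input_data = np.random.uniform(0, 12, shapes)   # original random input (test passes)
input_data = np.zeros(shapes)                     # EDITED HERE: all-zeros input (test fails)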
In your case np.zeros / np.ones makes every input value identical, so the variance computed inside the model is very small. That makes the sqrt in the normalization numerically unstable, which causes the disparity. This is not a common case; please test with real input data.
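A minimal numeric sketch (not MNN's actual kernel) of why this happens: with an all-equal input the per-channel variance of an InstanceNorm-style layer is ~0, so the denominator is ~sqrt(eps), and any tiny disagreement between two backends in how the mean is computed gets divided by that tiny denominator. The 1e-12 epsilon and the simulated 1-ulp mean difference are assumptions for illustration only.
import numpy as np

def norm(x, mean, eps=np.float32(1e-12)):
    # (x - mean) / sqrt(var + eps), the core of InstanceNorm without scale/bias
    var = np.mean((x - mean) ** 2, dtype=np.float32)
    return (x - mean) / np.sqrt(var + eps)

x = np.ones(224 * 224, dtype=np.float32)
mean_a = np.float32(1.0)                                 # hypothetical backend A's mean
mean_b = np.nextafter(np.float32(1.0), np.float32(2.0))  # hypothetical backend B: off by 1 ulp

# Constant input: a 1-ulp mean difference already shifts the output by ~0.1
print(np.abs(norm(x, mean_a) - norm(x, mean_b)).max())

# Well-spread input: the same 1-ulp perturbation is harmless
y = np.random.default_rng(0).uniform(0, 12, 224 * 224).astype(np.float32)
print(np.abs(norm(y, y.mean()) - norm(y, np.nextafter(y.mean(), np.float32(100)))).max())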
There is another disparity we are observing, @jxt1234: the shared test script gives very different results on Mac and Linux machines for np.ones and np.zeros inputs.
Test setup of Linux machine: MNN version 2.8.1, onnxruntime version 1.17.0, x86_64 Linux machine, Python 3.10.12.
Test setup of Mac machine: MNN version 2.8.1, onnxruntime version 1.17.0, Mac M1 chip (2020), Python 3.7.16.
Output on Linux machine
######################## MNN Output##############################
array([[[[ 1.7789851e+02, 1.7789816e+02, 1.7789783e+02, ...,
1.7790385e+02, 1.7790039e+02, 1.7790286e+02],
[ 1.7789778e+02, 1.7789758e+02, 1.7789714e+02, ...,
1.7790488e+02, 1.7790128e+02, 1.7790410e+02],
[ 1.7789806e+02, 1.7789786e+02, 1.7789740e+02, ...,
1.7790530e+02, 1.7790167e+02, 1.7790437e+02],
...,
[ 1.7787030e+02, 1.7786987e+02, 1.7786964e+02, ...,
-1.8242943e+01, -3.4454234e+00, 9.5861053e+00],
[ 1.7786783e+02, 1.7786771e+02, 1.7786725e+02, ...,
-3.3775089e+01, -2.9153913e+01, -3.9820961e+01],
[ 1.7787071e+02, 1.7787071e+02, 1.7787039e+02, ...,
-7.8822029e+01, -7.9405579e+01, -9.0173950e+01]],
[[ 1.6287909e+02, 1.6287906e+02, 1.6287897e+02, ...,
1.6287405e+02, 1.6287401e+02, 1.6287613e+02],
[ 1.6287944e+02, 1.6287935e+02, 1.6287930e+02, ...,
1.6287408e+02, 1.6287430e+02, 1.6287608e+02],
[ 1.6287997e+02, 1.6288005e+02, 1.6287979e+02, ...,
1.6287399e+02, 1.6287418e+02, 1.6287605e+02],
...,
[ 1.6285625e+02, 1.6285625e+02, 1.6285588e+02, ...,
-4.3152927e+01, -2.9389851e+01, -3.3526914e+00],
[ 1.6285176e+02, 1.6285170e+02, 1.6285165e+02, ...,
2.2175953e+01, 2.8807076e+01, 3.5963924e+01],
[ 1.6285019e+02, 1.6284982e+02, 1.6284982e+02, ...,
2.5776017e+00, -6.1696470e-03, 5.3866558e+00]],
[[ 1.5353537e+02, 1.5353502e+02, 1.5353499e+02, ...,
1.5352097e+02, 1.5351585e+02, 1.5351753e+02],
[ 1.5353632e+02, 1.5353571e+02, 1.5353595e+02, ...,
1.5352025e+02, 1.5351535e+02, 1.5351712e+02],
[ 1.5353517e+02, 1.5353476e+02, 1.5353500e+02, ...,
1.5352109e+02, 1.5351575e+02, 1.5351735e+02],
...,
[ 1.5354117e+02, 1.5354056e+02, 1.5354077e+02, ...,
2.3128716e+01, 4.9280975e+01, 6.8965607e+01],
[ 1.5353847e+02, 1.5353783e+02, 1.5353815e+02, ...,
7.4308746e+01, 8.9900993e+01, 8.4358002e+01],
[ 1.5353415e+02, 1.5353368e+02, 1.5353375e+02, ...,
2.3612188e+01, 2.1621935e+01, 1.5537018e+01]]]],
dtype=float32)
#################################################################
######################## ONNX Output##############################
[array([[[[171.34421, 171.34421, 171.34421, ..., 171.34421, 171.34421,
171.34421],
[171.34421, 171.34421, 171.34421, ..., 171.34421, 171.34421,
171.34421],
[171.34421, 171.34421, 171.34421, ..., 171.34421, 171.34421,
171.34421],
...,
[171.34421, 171.34421, 171.34421, ..., 171.34421, 171.34421,
171.34421],
[171.34421, 171.34421, 171.34421, ..., 171.34421, 171.34421,
171.34421],
[171.34421, 171.34421, 171.34421, ..., 171.34421, 171.34421,
171.34421]],
[[157.65192, 157.65192, 157.65192, ..., 157.65192, 157.65192,
157.65192],
[157.65192, 157.65192, 157.65192, ..., 157.65192, 157.65192,
157.65192],
[157.65192, 157.65192, 157.65192, ..., 157.65192, 157.65192,
157.65192],
...,
[157.65192, 157.65192, 157.65192, ..., 157.65192, 157.65192,
157.65192],
[157.65192, 157.65192, 157.65192, ..., 157.65192, 157.65192,
157.65192],
[157.65192, 157.65192, 157.65192, ..., 157.65192, 157.65192,
157.65192]],
[[147.90988, 147.90988, 147.90988, ..., 147.90988, 147.90988,
147.90988],
[147.90988, 147.90988, 147.90988, ..., 147.90988, 147.90988,
147.90988],
[147.90988, 147.90988, 147.90988, ..., 147.90988, 147.90988,
147.90988],
...,
[147.90988, 147.90988, 147.90988, ..., 147.90988, 147.90988,
147.90988],
[147.90988, 147.90988, 147.90988, ..., 147.90988, 147.90988,
147.90988],
[147.90988, 147.90988, 147.90988, ..., 147.90988, 147.90988,
147.90988]]]], dtype=float32)]
#################################################################
Output on Mac Machine
######################## MNN Output##############################
array([[[[250.21507 , 250.16147 , 249.76387 , ..., 58.233986,
59.66561 , 51.944107],
[249.54218 , 249.3817 , 248.97392 , ..., 58.83722 ,
64.62019 , 58.525745],
[249.11377 , 249.11868 , 248.49596 , ..., 53.54505 ,
62.351784, 56.883274],
...,
[ 90.75204 , 85.10422 , 77.92067 , ..., 58.6197 ,
88.46058 , 74.92694 ],
[139.21458 , 137.08766 , 132.7679 , ..., 123.57399 ,
149.02847 , 110.15517 ],
[135.13843 , 134.24225 , 134.49554 , ..., 114.151 ,
144.84058 , 116.4156 ]],
[[250.7712 , 251.45166 , 251.51967 , ..., 116.97559 ,
107.31926 , 111.61564 ],
[249.24142 , 249.67822 , 249.77678 , ..., 116.61871 ,
110.15776 , 116.610756],
[247.57831 , 248.08623 , 247.88712 , ..., 116.43434 ,
113.84082 , 120.039795],
...,
[129.85591 , 127.12337 , 119.98896 , ..., 83.45971 ,
100.48431 , 93.10752 ],
[130.522 , 124.07818 , 122.81633 , ..., 79.169044,
96.53803 , 79.26343 ],
[124.85134 , 121.22441 , 123.81856 , ..., 93.23835 ,
112.23832 , 103.228325]],
[[303.78763 , 303.88348 , 303.10953 , ..., 171.72421 ,
168.28552 , 164.99461 ],
[301.4203 , 301.25974 , 300.55664 , ..., 172.78957 ,
172.58192 , 170.90668 ],
[298.97623 , 298.9938 , 298.28656 , ..., 169.06041 ,
172.94447 , 174.88399 ],
...,
[120.82464 , 112.478584, 100.644356, ..., 90.14767 ,
119.52492 , 128.43115 ],
[144.41287 , 137.94812 , 133.33185 , ..., 128.82146 ,
163.61607 , 151.649 ],
[153.434 , 149.53357 , 152.06706 , ..., 156.95384 ,
189.02585 , 174.62895 ]]]], dtype=float32)
#################################################################
######################## ONNX Output##############################
[array([[[[170.95201, 170.95201, 170.95201, ..., 170.95201, 170.95201,
170.95201],
[170.95201, 170.95201, 170.95201, ..., 170.95201, 170.95201,
170.95201],
[170.95201, 170.95201, 170.95201, ..., 170.95201, 170.95201,
170.95201],
...,
[170.95201, 170.95201, 170.95201, ..., 170.95201, 170.95201,
170.95201],
[170.95201, 170.95201, 170.95201, ..., 170.95201, 170.95201,
170.95201],
[170.95201, 170.95201, 170.95201, ..., 170.95201, 170.95201,
170.95201]],
[[157.96027, 157.96027, 157.96027, ..., 157.96027, 157.96027,
157.96027],
[157.96027, 157.96027, 157.96027, ..., 157.96027, 157.96027,
157.96027],
[157.96027, 157.96027, 157.96027, ..., 157.96027, 157.96027,
157.96027],
...,
[157.96027, 157.96027, 157.96027, ..., 157.96027, 157.96027,
157.96027],
[157.96027, 157.96027, 157.96027, ..., 157.96027, 157.96027,
157.96027],
[157.96027, 157.96027, 157.96027, ..., 157.96027, 157.96027,
157.96027]],
[[151.1252 , 151.1252 , 151.1252 , ..., 151.1252 , 151.1252 ,
151.1252 ],
[151.1252 , 151.1252 , 151.1252 , ..., 151.1252 , 151.1252 ,
151.1252 ],
[151.1252 , 151.1252 , 151.1252 , ..., 151.1252 , 151.1252 ,
151.1252 ],
...,
[151.1252 , 151.1252 , 151.1252 , ..., 151.1252 , 151.1252 ,
151.1252 ],
[151.1252 , 151.1252 , 151.1252 , ..., 151.1252 , 151.1252 ,
151.1252 ],
[151.1252 , 151.1252 , 151.1252 , ..., 151.1252 , 151.1252 ,
151.1252 ]]]], dtype=float32)]
#################################################################
The difference between the ONNX outputs on the two devices could be attributed to the sqrt disparity, but the difference between the MNN outputs is huge. Do we attribute such a large change to the sqrt function as well?
Is the MNN runtime implementation so different between ARM and x86 that such behaviour is expected? The numbers differ so much that it feels like either I am not using it correctly or there is a bug. Attaching the test script again for reference:
import MNN

# Run the MNN model on an all-zeros input
mnn_model = 'mosaic.mnn'
net = MNN.nn.load_module_from_file(mnn_model, [], [])
input = MNN.numpy.zeros([1, 3, 224, 224], dtype=MNN.numpy.float32)
# input = MNN.expr.convert(input, MNN.expr.NC4HW4)  # skipped, as suggested above
output = net.forward(input)
output = MNN.expr.convert(output, MNN.expr.NCHW)
print("######################## MNN Output##############################")
print(output)
print('#################################################################')
print("")
import onnxruntime as ort
import numpy as np

# Run the original ONNX model on the same all-zeros input
onnx_model = 'mosaic.onnx'
model = ort.InferenceSession(onnx_model)
output = model.run(None, {'input1': np.zeros([1, 3, 224, 224], dtype=np.float32)})
print("######################## ONNX Output##############################")
print(output)
print('#################################################################')
np.zeros() and np.ones() are not valid inputs for this model because of InstanceNorm. They make the model compute sqrt(0.0 + 0.000000000001), which causes the disparity.
Marking as stale. No activity in 60 days.