tensorflow-onnx
tflite vs ONNXRuntime accuracy
Describe the issue
Given a benchmarked tflite model (taking MobileNetV2 as an example), I tried converting it to ONNX using the tf2onnx converter. Conversion-wise the model seems to be fine, but running a simple forward pass on both the tflite and ONNX models gives a high MSE at the output layer. Am I doing something wrong during conversion / comparison? Attaching the model and the steps I followed to convert, plus the comparison script.
tflite model: mobilenet_v2
convert to onnx: python3 -m tf2onnx.convert --opset 13 --tflite mobilenet_v2.tflite --output q_mobilenet_v2.onnx
To reproduce
Comparison script:
#!/bin/env python3
import numpy as np
import onnxruntime
import tensorflow as tf

TFLITE_FILE_PATH = 'mobilenet_v2.tflite'
ONNX_FILE_PATH = 'q_mobilenet_v2.onnx'

if __name__ == '__main__':
    # Load the tflite model and query its input/output details
    interpreter = tf.lite.Interpreter(TFLITE_FILE_PATH)
    interpreter.allocate_tensors()
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()
    input_shape = input_details[0]['shape']

    # Load the converted ONNX model with graph optimizations disabled
    sess_options = onnxruntime.SessionOptions()
    sess_options.graph_optimization_level = onnxruntime.GraphOptimizationLevel.ORT_DISABLE_ALL
    session = onnxruntime.InferenceSession(ONNX_FILE_PATH, sess_options)
    output_name = [n.name for n in session.get_outputs()]
    input_node_name = [n.name for n in session.get_inputs()]

    # compare on 10 inputs
    for i in range(10):
        input_data = np.ones(shape=input_shape, dtype=np.uint8)  # or: np.random.randint(0, 255, size=input_shape, dtype=np.uint8)
        print(input_details[0]['index'])
        interpreter.set_tensor(input_details[0]['index'], input_data)
        interpreter.invoke()
        o_tfl = interpreter.get_tensor(output_details[0]['index'])

        o_onnx = session.run(output_name, {input_node_name[0]: input_data})[0]

        # cast to float before subtracting so integer outputs don't wrap around
        diff = np.abs(o_tfl.astype(np.float32) - o_onnx.astype(np.float32))
        MSE = np.square(diff).mean()
        E_AMAX = diff.max()
        E_PAMX = E_AMAX / np.abs(o_onnx.astype(np.float32)).mean()
        print(f"output_tfl: {o_tfl}\noutput_onnx: {o_onnx}\nMSE: {MSE}\nE_AMAX: {E_AMAX}\nE_PAMX: {E_PAMX}\n")
        print('___________________________\n')
Urgency I'm looking at the accuracy of a QAT PyTorch model -> ONNX and how much accuracy I will be able to retain.
Platform Linux
OS Version 4.18.0
ONNX Runtime Installation Released Package
ONNX Runtime Version or Commit ID 1.12.1
ONNX Runtime API Python
Architecture X86
Execution Provider Default CPU
We usually use the code below to compare the results between tf and onnxruntime:
np.testing.assert_allclose(o_onnx, o_tfl, rtol=1e-04, atol=1e-03)
This seems like a bug and will take further investigation.
@fatcat-z If possible, I can help debug this further. If it's a good first issue, do you have any pointers / suggestions as to what the underlying issue might be or where to look?
Looking into it, it seems to be related to the quantization of the input.
@adityat77 ,
The given tflite model is a little different from the normal one. We didn't find a Quantize op in it; the input tensor itself carries the quantization details.
So far, tf2onnx doesn't support this kind of model well yet. Could you please try another way to convert the original TF model to a tflite one?
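For anyone hitting the same symptom, the quantization details mentioned above can be inspected directly from the tflite Interpreter. A minimal sketch, using the model attached to this issue:

import tensorflow as tf

# Inspect where the quantization parameters live in the tflite model.
interpreter = tf.lite.Interpreter('mobilenet_v2.tflite')
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# 'quantization' is a (scale, zero_point) tuple. In this model it is attached
# to the input tensor itself rather than expressed via a separate Quantize op.
print(inp['dtype'], inp['quantization'])
print(out['dtype'], out['quantization'])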
@fatcat-z The model I'm trying is from the tflite model zoo.
From my understanding, the issue is that the inputs are already in uint8 format, whereas if they came in as float32, the usual path we're referring to here would work fine.
Please let me know if my understanding of the problem is correct. Or can you point me to some way to convert a TensorFlow model to tflite?
Thanks!
Yes, that's correct.
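For reference, a common way to produce a quantized tflite model whose interface stays in float32 (so the usual comparison path applies) is standard post-training quantization. A minimal sketch, assuming a Keras model and a calibration generator you would supply yourself:

import numpy as np
import tensorflow as tf

def representative_dataset():
    # Yield a handful of calibration samples matching the model's input shape.
    for _ in range(100):
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

model = tf.keras.applications.MobileNetV2(weights='imagenet')  # example model

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Leaving inference_input_type / inference_output_type unset keeps the model's
# interface in float32 while weights and activations are quantized internally.
tflite_model = converter.convert()

with open('mobilenet_v2_ptq.tflite', 'wb') as f:
    f.write(tflite_model)

Setting converter.inference_input_type / inference_output_type to tf.uint8 instead would reproduce the integer-interface style of the model discussed earlier in this thread.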
Thanks for the help! Closing this issue. If possible, and if this feature is in the pipeline, you can list what needs to be done, open a FR, and assign it to me. I'll be happy to contribute.
Reopening this. Context:
On exporting a TF MobileNetV1 model to tflite and then converting it to ONNX, I'm seeing a high MSE in the layer output.
Inputs are in float32 format, the output is int8. Edit: attached the model I used for comparison. The comparison script remains the same.
MSE: 0.01098901098901099
E_AMAX: 255
E_PAMX: 1492.719298245614
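One thing worth ruling out with an int8 output tensor is that the error above is computed on raw integer codes. A minimal sketch that maps the outputs back to float using the tflite output's scale and zero-point before comparing; it assumes the converted ONNX model keeps the same output quantization parameters, so the same values can be applied to both outputs:

import numpy as np

def dequantize(tensor, quant_params):
    # quant_params is the (scale, zero_point) tuple from get_output_details().
    scale, zero_point = quant_params
    return (tensor.astype(np.float32) - zero_point) * scale

# Example usage inside the comparison loop above:
# o_tfl_f  = dequantize(o_tfl,  output_details[0]['quantization'])
# o_onnx_f = dequantize(o_onnx, output_details[0]['quantization'])
# diff = np.abs(o_tfl_f - o_onnx_f)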
In your latest model, I noticed the last node is Quantize, and there is no Dequantize after that. Is this a requirement of your model or just a test for something?
In tf2onnx, we always handle Quantize and Dequantize as a pair, so this case is unusual.
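As a rough illustration of why the converter expects the pair (a sketch of the usual affine quantize/dequantize math, not of tf2onnx internals):

import numpy as np

def quantize(x, scale, zero_point, dtype=np.int8):
    info = np.iinfo(dtype)
    q = np.round(x / scale) + zero_point
    return np.clip(q, info.min, info.max).astype(dtype)

def dequantize(q, scale, zero_point):
    return (q.astype(np.float32) - zero_point) * scale

# A Quantize immediately followed by a Dequantize is (almost) the identity,
# which is why the converter handles them as a pair; a trailing Quantize with
# no matching Dequantize leaves the graph output as raw integer codes.
x = np.array([0.1, -0.3, 1.2], dtype=np.float32)
print(dequantize(quantize(x, scale=0.02, zero_point=0), scale=0.02, zero_point=0))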
No, I was just testing this flow; it's not a strong requirement as such. Thanks for looking into it!