tensorflow-onnx
tflite vs ONNXRuntime accuracy
Describe the issue
Given a benchmarked tflite model (taking MobileNetV2 as an example), I tried converting it to ONNX using the tf2onnx converter. Conversion-wise the model seems to be fine, but running a simple forward pass on both the tflite and ONNX models gives a high MSE at the output layer. Am I doing something wrong during conversion / comparison? Attaching the model and the steps I followed to convert, plus the comparison script.
tflite model: mobilenet_v2
convert to onnx: python3 -m tf2onnx.convert --opset 13 --tflite mobilenet_v2.tflite --output q_mobilenet_v2.onnx
To reproduce
Comparison script:
#!/bin/env python3
import numpy as np
import onnxruntime
import tensorflow as tf

TFLITE_FILE_PATH = 'mobilenet_v2.tflite'
ONNX_FILE_PATH = 'q_mobilenet_v2.onnx'

if __name__ == '__main__':
    # Load the tflite model and query its input/output details
    interpreter = tf.lite.Interpreter(TFLITE_FILE_PATH)
    interpreter.allocate_tensors()
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()
    input_shape = input_details[0]['shape']

    # Load the converted ONNX model with graph optimizations disabled
    sess_options = onnxruntime.SessionOptions()
    sess_options.graph_optimization_level = onnxruntime.GraphOptimizationLevel.ORT_DISABLE_ALL
    session = onnxruntime.InferenceSession(ONNX_FILE_PATH, sess_options)
    output_name = [n.name for n in session.get_outputs()]
    input_node_name = [n.name for n in session.get_inputs()]

    # compare on 10 inputs
    for i in range(10):
        input_data = np.ones(shape=input_shape, dtype=np.uint8)  # or: np.random.randint(0, 255, size=input_shape, dtype=np.uint8)
        print(input_details[0]['index'])
        interpreter.set_tensor(input_details[0]['index'], input_data)
        interpreter.invoke()
        o_tfl = interpreter.get_tensor(output_details[0]['index'])

        o_onnx = session.run(output_name, {input_node_name[0]: input_data})[0]

        # cast to float before subtracting so integer outputs don't wrap around
        diff = np.abs(o_tfl.astype(np.float32) - o_onnx.astype(np.float32))
        MSE = np.square(diff).mean()
        E_AMAX = diff.max()
        E_PAMX = E_AMAX / np.abs(o_onnx.astype(np.float32)).mean()
        print(f"output_tfl: {o_tfl}\noutput_onnx: {o_onnx}\nMSE: {MSE}\nE_AMAX: {E_AMAX}\nE_PAMX: {E_PAMX}\n")
        print('___________________________\n')
Urgency I'm looking at the accuracy of a QAT PyTorch model -> ONNX and how much accuracy I will be able to retain.
Platform Linux
OS Version 4.18.0
ONNX Runtime Installation Released Package
ONNX Runtime Version or Commit ID 1.12.1
ONNX Runtime API Python
Architecture X86
Execution Provider Default CPU
We usually use the code below to compare the results between tf and onnxruntime:
np.testing.assert_allclose(o_onnx, o_tfl, rtol=1e-04, atol=1e-03)
This seems like a bug and will take further investigation.
@fatcat-z If possible, I can help debug this further. If it's a good first issue, do you have any pointers / suggestions as to what the underlying issue might be or where to look?
Looking into it, it seems to be related to the quantization of the input.
@adityat77 ,
The given tflite model is a little different from the normal one. We didn't find a Quantize op in it; the input tensor itself carries the quantization details.
So far, tf2onnx doesn't support this kind of model well yet. Could you please try another way to convert the original TF model to a tflite one?
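For anyone hitting the same symptom, the quantization details mentioned above can be inspected directly from the tflite Interpreter. A minimal sketch, using the model attached to this issue:

import tensorflow as tf

# Inspect where the quantization parameters live in the tflite model.
interpreter = tf.lite.Interpreter('mobilenet_v2.tflite')
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# 'quantization' is a (scale, zero_point) tuple. In this model it is attached
# to the input tensor itself rather than expressed via a separate Quantize op.
print(inp['dtype'], inp['quantization'])
print(out['dtype'], out['quantization'])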
@fatcat-z The model I'm trying is from the tflite model zoo.
From my understanding, the issue is that the inputs are already in uint8 format, whereas if they came in as float32, the usual path we're referring to here would work fine.
Please let me know if my understanding of the problem is correct. Or can you point me to some way to convert a TensorFlow model to tflite?
Thanks!
Yes, that's correct.
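For reference, a common way to produce a quantized tflite model whose interface stays in float32 (so the usual comparison path applies) is standard post-training quantization. A minimal sketch, assuming a Keras model and a calibration generator you would supply yourself:

import numpy as np
import tensorflow as tf

def representative_dataset():
    # Yield a handful of calibration samples matching the model's input shape.
    for _ in range(100):
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

model = tf.keras.applications.MobileNetV2(weights='imagenet')  # example model

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Leaving inference_input_type / inference_output_type unset keeps the model's
# interface in float32 while weights and activations are quantized internally.
tflite_model = converter.convert()

with open('mobilenet_v2_ptq.tflite', 'wb') as f:
    f.write(tflite_model)

Setting converter.inference_input_type / inference_output_type to tf.uint8 instead would reproduce the integer-interface style of the model discussed earlier in this thread.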
Thanks for the help! Closing this issue. If possible, and if this feature is in the pipeline, you can list what needs to be done, open a FR, and assign it to me. I'll be happy to contribute.
Reopening this. Context:
On exporting a TF MobileNetV1 model to tflite and then converting it to ONNX, I'm seeing a high MSE in the layer output.
Inputs are in float32 format, the output is int8. Edit: attached the model I used for comparison. The comparison script remains the same.
MSE: 0.01098901098901099
E_AMAX: 255
E_PAMX: 1492.719298245614
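One thing worth ruling out with an int8 output tensor is that the error above is computed on raw integer codes. A minimal sketch that maps the outputs back to float using the tflite output's scale and zero-point before comparing; it assumes the converted ONNX model keeps the same output quantization parameters, so the same values can be applied to both outputs:

import numpy as np

def dequantize(tensor, quant_params):
    # quant_params is the (scale, zero_point) tuple from get_output_details().
    scale, zero_point = quant_params
    return (tensor.astype(np.float32) - zero_point) * scale

# Example usage inside the comparison loop above:
# o_tfl_f  = dequantize(o_tfl,  output_details[0]['quantization'])
# o_onnx_f = dequantize(o_onnx, output_details[0]['quantization'])
# diff = np.abs(o_tfl_f - o_onnx_f)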
In your latest model, I noticed the last node is Quantize, and there is no Dequantize after that. Is this a requirement of your model or just a test for something?
In tf2onnx, we always handle Quantize and Dequantize as a pair, so this case is unusual.
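As a rough illustration of why the converter expects the pair (a sketch of the usual affine quantize/dequantize math, not of tf2onnx internals):

import numpy as np

def quantize(x, scale, zero_point, dtype=np.int8):
    info = np.iinfo(dtype)
    q = np.round(x / scale) + zero_point
    return np.clip(q, info.min, info.max).astype(dtype)

def dequantize(q, scale, zero_point):
    return (q.astype(np.float32) - zero_point) * scale

# A Quantize immediately followed by a Dequantize is (almost) the identity,
# which is why the converter handles them as a pair; a trailing Quantize with
# no matching Dequantize leaves the graph output as raw integer codes.
x = np.array([0.1, -0.3, 1.2], dtype=np.float32)
print(dequantize(quantize(x, scale=0.02, zero_point=0), scale=0.02, zero_point=0))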
No, I was just testing this flow; it's not a strong requirement as such. Thanks for looking into it!