
loc("model/FPN/FPN/FPN//Concatenate_p_1/concat"): error: 'tfl.concatenation' op quantization parameters violate the same scale constraint: !quant.uniform<i8:f32, 9.9999999999999995E-7> vs. !quant.uniform<i8:f32, 4.9130543629871681E-5:-128>


1. System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 20.04
  • TensorFlow installation (pip package or built from source): pip
  • TensorFlow library (version, if pip package or github SHA, if built from source): 2.5.1

2. Code and initial problem

I am trying to quantize a model whose code I cannot share; however, I can share a non-quantized TFLite model. The node where the error occurs is node 89.

import tensorflow as tf


# create_model(...) and IMAGE_SIZE stand in for the model code I cannot share.
model = create_model(...)
model.compile()

def representative_dataset_generator():
    # Yield a few random inputs in the model's [-1, 1) input range for calibration.
    for _ in range(20):
        yield [tf.random.uniform(shape=(1, IMAGE_SIZE, IMAGE_SIZE, 3), minval=-1, maxval=1,
                                 dtype=tf.float32)]


converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]

# If I comment out these five lines, i.e. if I do not quantize, the conversion
# goes through and produces the shared model.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.target_spec.supported_types = [tf.int8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
converter.representative_dataset = representative_dataset_generator

tflite_quant_model = converter.convert()

with open('model.tflite', 'wb') as file:
    file.write(tflite_quant_model)
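A minimal sketch for sanity-checking the converted file on the desktop before profiling it on-device, assuming the model.tflite written above (the dummy input is an assumption for illustration, not part of the original report):

import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path='model.tflite')
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]

# inference_input_type was set to tf.uint8, so the input tensor expects uint8 data.
dummy_input = np.random.randint(0, 256, size=input_details['shape'], dtype=np.uint8)
interpreter.set_tensor(input_details['index'], dummy_input)
interpreter.invoke()
print(interpreter.get_tensor(output_details['index']).shape)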

3. Using the old converter and failure after conversion

If I use the old converter and old quantizer by using the additional lines below, the conversion goes through and I get this TFLite model.

converter.experimental_new_converter = False
converter.experimental_new_quantizer = False

But when I run the TFLite benchmark tool on a phone with a DSP using --use_nnapi=true, I get:

ERROR: NN API returned error ANEURALNETWORKS_OP_FAILED at line 4453 while running computation.
ERROR: Node number 177 (TfLiteNnapiDelegate) failed to invoke.
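For reference, the benchmark invocation was along these lines (the on-device binary and model paths here are assumptions, not from the original logs):

adb shell /data/local/tmp/benchmark_model \
  --graph=/data/local/tmp/model.tflite \
  --use_nnapi=true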

I tried to share as much relevant information as I could, but don't hesitate to ask for more. Looking forward to hearing from you. Thanks in advance!

YannPourcenoux (Aug 17 '21)

Hi, there was a recent fix to the quantizer that prevents some issues with the same-scale constraint.

Could you try the conversion with tf-nightly?
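For example, in a pip-managed environment:

pip install --upgrade tf-nightly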

daverim (Aug 23 '21)

Hi, thank you for your answer. When I convert using tf-nightly==2.7.0-dev20210822, the conversion goes through and gives this TFLite model. However, when running the benchmark I encounter the following error:

ERROR: NN API returned error ANEURALNETWORKS_OP_FAILED at line 4453 while running computation.
ERROR: Node number 161 (TfLiteNnapiDelegate) failed to invoke.

Do you have any idea what could be causing this problem?

YannPourcenoux (Aug 23 '21)

I see the same error messages under the same conditions running tf-nightly==2.8.0-dev20211001.

However, if I do the same process without loading weights, it works correctly.

I am using an eager-mode graph trained WITHOUT keras.fit.

lolz0r (Oct 02 '21)

Hello, I have the same problem. Is there any way to solve it?

zhaoxin111 (Feb 17 '22)

Just to be clear, you should use the new quantizer and the new converter with tf-nightly for best results.

You seem to be running afoul of NNAPI. You should try running the benchmark with --use_nnapi=false, as I'm not sure whether NNAPI supports quantized LSTM or other ops in your model. If you are running on mobile hardware, it is possible that you have an outdated or buggy NNAPI implementation.
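One way to see which ops the quantized model actually contains, so they can be cross-checked against the device's NNAPI support, is the model analyzer API; a minimal sketch, assuming TF 2.7+ and the model.tflite produced above:

import tensorflow as tf

# Print a per-operator summary of the TFLite model; the listed ops can be
# compared against what the device's NNAPI version supports.
tf.lite.experimental.Analyzer.analyze(model_path='model.tflite')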

daverim (Feb 18 '22)