
Quantized (int8) tflite for conformer model

Open nyadla-sys opened this issue 3 years ago • 3 comments

I have added the code below to ~/TensorFlowASR/examples/conformer/inference/gen_tflite_model.py (the `converter` object and the numpy/tensorflow imports already exist earlier in the script):

print("Started tflite quant conversion")
def representative_dataset():
    for _ in range(100):
      data = np.random.rand(1, 2,1, 320)
      yield [data.astype(np.float32)]

converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [
  tf.lite.OpsSet.TFLITE_BUILTINS, # enable TensorFlow Lite ops.
  tf.lite.OpsSet.SELECT_TF_OPS # enable TensorFlow ops.
]
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8  # or tf.uint8
converter.inference_output_type = tf.int8  # or tf.uint8
tflite_quant_model = converter.convert()      
with open('/content/test/test_quant.tflite', "wb") as tflite_out:
    tflite_out.write(tflite_quant_model)

I then ran the command below on a Colab setup:

!python3 /content/TensorFlowASR/examples/conformer/inference/gen_tflite_model.py --subwords --config /content/test/config.yml --h5 /content/test/latest.h5 --output /content/test/test.tflite

The float model test.tflite is generated successfully; however, the quantized (int8) tflite conversion fails with the errors below:

Started tflite quant conversion
2022-01-12 00:48:42.533268: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:357] Ignored output_format.
2022-01-12 00:48:42.533347: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:360] Ignored drop_control_dependency.
loc(fused["Einsum:", callsite("conformer_encoder/conformer_encoder/conformer_encoder_block_0/conformer_encoder_block_0_mhsa_module/conformer_encoder_block_0_mhsa_module_mhsa/einsum/Einsum"("/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py":1082:0) at callsite("/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/traceback_utils.py":150:0 at callsite("/content/TensorFlowASR/tensorflow_asr/models/layers/multihead_attention.py":118:0 at callsite("/content/TensorFlowASR/tensorflow_asr/models/layers/multihead_attention.py":283:0 at callsite("/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py":92:0 at callsite("/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py":1096:0 at callsite("/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py":64:0 at callsite("/content/TensorFlowASR/tensorflow_asr/models/encoders/conformer.py":139:0 at callsite("/usr/local/lib/python3.7/dist-packages/tensorflow/python/autograph/operators/control_flow.py":1374:0 at "/usr/local/lib/python3.7/dist-packages/tensorflow/python/autograph/operators/control_flow.py":1321:0)))))))))]): error: 'tf.Einsum' op is neither a custom op nor a flex op
loc(fused["Einsum:", 
....
<unknown>:0: error: failed while converting: 'main': 
Some ops are not supported by the native TFLite runtime, you can enable TF kernels fallback using TF Select. See instructions: https://www.tensorflow.org/lite/guide/ops_select 
TF Select ops: Einsum
Details:
	tf.Einsum(tensor<?x4x?x?xf32>, tensor<?x?x4x36xf32>) -> (tensor<?x?x4x36xf32>) : {device = "", equation = "...HNM,...MHI->...NHI"}
	tf.Einsum(tensor<?x?x144xf32>, tensor<4x144x36xf32>) -> (tensor<?x?x4x36xf32>) : {device = "", equation = "...MI,HIO->...MHO"}
	tf.Einsum(tensor<?x?x144xf32>, tensor<4x144x36xf32>) -> (tensor<?x?x4x36xf32>) : {device = "", equation = "...NI,HIO->...NHO"}
	tf.Einsum(tensor<?x?x4x36xf32>, tensor<1x?x4x36xf32>) -> (tensor<?x4x?x?xf32>) : {device = "", equation = "...NHO,...MHO->...HNM"}
	tf.Einsum(tensor<?x?x4x36xf32>, tensor<4x36x144xf32>) -> (tensor<?x?x144xf32>) : {device = "", equation = "...NHI,HIO->...NO"}
	tf.Einsum(tensor<?x?x4x36xf32>, tensor<?x?x4x36xf32>) -> (tensor<?x4x?x?xf32>) : {device = "", equation = "...NHO,...MHO->...HNM"}
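As the error message suggests, one option is to re-enable the TF Select fallback alongside the int8 builtins. A minimal sketch of that variant, reusing the converter and representative_dataset from the snippet above (these are standard tf.lite flags, but whether this converts cleanly for this model is not guaranteed; the Einsum ops would run as float flex ops, so the model is only partially quantized):

# Sketch: quantize what the int8 builtins cover, but keep the TF Select
# (flex) fallback so tf.Einsum can still be lowered to a flex op.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS_INT8,  # int8 kernels where available
    tf.lite.OpsSet.TFLITE_BUILTINS,       # float builtins for the rest
    tf.lite.OpsSet.SELECT_TF_OPS,         # flex fallback for tf.Einsum
]
# Leave inference_input_type/inference_output_type at the float default;
# forcing int8 I/O only makes sense when the whole graph quantizes.
tflite_quant_model = converter.convert()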

nyadla-sys avatar Jan 12 '22 01:01 nyadla-sys

@usimarit Please help me with this. Thanks in advance.

nyadla-sys avatar Jan 12 '22 04:01 nyadla-sys

@usimarit It may also be a good idea to add the pretrained models to GitHub (e.g., the .h5 Keras model, the .tflite model, and the other files required to run them from a Python script).

nyadla-sys avatar Jan 12 '22 04:01 nyadla-sys

> @usimarit It may also be a good idea to add the pretrained models to GitHub (e.g., the .h5 Keras model, the .tflite model, and the other files required to run them from a Python script).

Yes, this is my request too: please add the Keras/SavedModel weights. @usimarit @monatis @vaibhav016

neso613 avatar Feb 21 '22 16:02 neso613

@neso613 @nyadla-sys GitHub only stores the code; I shared the pretrained models on Google Drive.

@nyadla-sys Some ops are not supported for int8 quantization, so we have to accept that (or wait until TF supports them). You can try converting to tflite with whatever options you want, but I can't guarantee the conversion will succeed (except with the default tflite conversion options).
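For example, here is a sketch of dynamic-range quantization, a less aggressive option that usually coexists with flex ops (weights stored as int8, compute in float). This assumes a fresh TFLiteConverter for the same model; the flags are standard tf.lite API, but conversion success for this model is still not guaranteed:

# Sketch: dynamic-range quantization. No representative dataset is needed,
# and ops without int8 kernels (e.g. tf.Einsum) are unaffected because
# activations stay in float.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,
    tf.lite.OpsSet.SELECT_TF_OPS,  # flex fallback for tf.Einsum
]
tflite_dynrange_model = converter.convert()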

I’ll close the issue here. Feel free to reopen if you have further questions.

nglehuy avatar Sep 02 '22 05:09 nglehuy