
Quantized (int8) tflite for conformer model

Open nyadla-sys opened this issue 3 years ago • 3 comments

I have added the code below to ~/TensorFlowASR/examples/conformer/inference/gen_tflite_model.py (the `converter` object and the numpy/tensorflow imports already exist earlier in the script):

print("Started tflite quant conversion")
def representative_dataset():
    for _ in range(100):
      data = np.random.rand(1, 2,1, 320)
      yield [data.astype(np.float32)]

converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [
  tf.lite.OpsSet.TFLITE_BUILTINS, # enable TensorFlow Lite ops.
  tf.lite.OpsSet.SELECT_TF_OPS # enable TensorFlow ops.
]
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8  # or tf.uint8
converter.inference_output_type = tf.int8  # or tf.uint8
tflite_quant_model = converter.convert()      
with open('/content/test/test_quant.tflite', "wb") as tflite_out:
    tflite_out.write(tflite_quant_model)

I then ran the command below on a Colab setup:

!python3 /content/TensorFlowASR/examples/conformer/inference/gen_tflite_model.py --subwords --config /content/test/config.yml --h5 /content/test/latest.h5 --output /content/test/test.tflite

The float model test.tflite is generated successfully; however, the quantized (int8) tflite conversion fails with the errors below:

Started tflite quant conversion
2022-01-12 00:48:42.533268: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:357] Ignored output_format.
2022-01-12 00:48:42.533347: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:360] Ignored drop_control_dependency.
loc(fused["Einsum:", callsite("conformer_encoder/conformer_encoder/conformer_encoder_block_0/conformer_encoder_block_0_mhsa_module/conformer_encoder_block_0_mhsa_module_mhsa/einsum/Einsum"("/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py":1082:0) at callsite("/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/traceback_utils.py":150:0 at callsite("/content/TensorFlowASR/tensorflow_asr/models/layers/multihead_attention.py":118:0 at callsite("/content/TensorFlowASR/tensorflow_asr/models/layers/multihead_attention.py":283:0 at callsite("/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py":92:0 at callsite("/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py":1096:0 at callsite("/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py":64:0 at callsite("/content/TensorFlowASR/tensorflow_asr/models/encoders/conformer.py":139:0 at callsite("/usr/local/lib/python3.7/dist-packages/tensorflow/python/autograph/operators/control_flow.py":1374:0 at "/usr/local/lib/python3.7/dist-packages/tensorflow/python/autograph/operators/control_flow.py":1321:0)))))))))]): error: 'tf.Einsum' op is neither a custom op nor a flex op
loc(fused["Einsum:", 
....
<unknown>:0: error: failed while converting: 'main': 
Some ops are not supported by the native TFLite runtime, you can enable TF kernels fallback using TF Select. See instructions: https://www.tensorflow.org/lite/guide/ops_select 
TF Select ops: Einsum
Details:
	tf.Einsum(tensor<?x4x?x?xf32>, tensor<?x?x4x36xf32>) -> (tensor<?x?x4x36xf32>) : {device = "", equation = "...HNM,...MHI->...NHI"}
	tf.Einsum(tensor<?x?x144xf32>, tensor<4x144x36xf32>) -> (tensor<?x?x4x36xf32>) : {device = "", equation = "...MI,HIO->...MHO"}
	tf.Einsum(tensor<?x?x144xf32>, tensor<4x144x36xf32>) -> (tensor<?x?x4x36xf32>) : {device = "", equation = "...NI,HIO->...NHO"}
	tf.Einsum(tensor<?x?x4x36xf32>, tensor<1x?x4x36xf32>) -> (tensor<?x4x?x?xf32>) : {device = "", equation = "...NHO,...MHO->...HNM"}
	tf.Einsum(tensor<?x?x4x36xf32>, tensor<4x36x144xf32>) -> (tensor<?x?x144xf32>) : {device = "", equation = "...NHI,HIO->...NO"}
	tf.Einsum(tensor<?x?x4x36xf32>, tensor<?x?x4x36xf32>) -> (tensor<?x4x?x?xf32>) : {device = "", equation = "...NHO,...MHO->...HNM"}
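As the error message suggests, one option is to re-enable the TF Select fallback alongside the int8 builtins. A minimal sketch of that variant, reusing the converter and representative_dataset from the snippet above (these are standard tf.lite flags, but whether this converts cleanly for this model is not guaranteed; the Einsum ops would run as float flex ops, so the model is only partially quantized):

# Sketch: quantize what the int8 builtins cover, but keep the TF Select
# (flex) fallback so tf.Einsum can still be lowered to a flex op.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS_INT8,  # int8 kernels where available
    tf.lite.OpsSet.TFLITE_BUILTINS,       # float builtins for the rest
    tf.lite.OpsSet.SELECT_TF_OPS,         # flex fallback for tf.Einsum
]
# Leave inference_input_type/inference_output_type at the float default;
# forcing int8 I/O only makes sense when the whole graph quantizes.
tflite_quant_model = converter.convert()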

nyadla-sys avatar Jan 12 '22 01:01 nyadla-sys

@usimarit Please help me with this. Thanks in advance.

nyadla-sys avatar Jan 12 '22 04:01 nyadla-sys

@usimarit It may also be a good idea to add the pretrained models to GitHub (e.g., the .h5 Keras model, the .tflite model, and the other files required to run them from a Python script).

nyadla-sys avatar Jan 12 '22 04:01 nyadla-sys

> @usimarit It may also be a good idea to add the pretrained models to GitHub (e.g., the .h5 Keras model, the .tflite model, and the other files required to run them from a Python script).

Yes, this is my request too: please add the Keras/SavedModel weights. @usimarit @monatis @vaibhav016

neso613 avatar Feb 21 '22 16:02 neso613

@neso613 @nyadla-sys GitHub only stores the code; I shared the pretrained models on Google Drive.

@nyadla-sys Some ops are not supported for int8 quantization, so we have to accept that (or wait until TF supports them). You can try converting to tflite with whatever options you want, but I can't guarantee the conversion will succeed (except with the default tflite conversion options).
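For example, here is a sketch of dynamic-range quantization, a less aggressive option that usually coexists with flex ops (weights stored as int8, compute in float). This assumes a fresh TFLiteConverter for the same model; the flags are standard tf.lite API, but conversion success for this model is still not guaranteed:

# Sketch: dynamic-range quantization. No representative dataset is needed,
# and ops without int8 kernels (e.g. tf.Einsum) are unaffected because
# activations stay in float.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,
    tf.lite.OpsSet.SELECT_TF_OPS,  # flex fallback for tf.Einsum
]
tflite_dynrange_model = converter.convert()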

I’ll close the issue here. Feel free to reopen if you have further questions.

nglehuy avatar Sep 02 '22 05:09 nglehuy