Passing custom/additional data to kernels
Hello, I have the following use case that I would like to cover.
How can I provide an efficient kernel for a layer without breaking the compatibility of the model?
Consider:
- a layer of a NN model (e.g. a linear layer) that could be accelerated by a dedicated hardware module, e.g. imagine you have M cores and you split the linear layer into M chunks
- the custom hardware module could benefit from additional information (e.g. how many elements each core should process); a rough sketch of this is given below
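For concreteness, a small sketch of how the per-core element counts could be derived; the layer shape and core count below are made up, not taken from a real model:

```python
import numpy as np

# Made-up example: split the output rows of a linear (fully connected) layer
# across M accelerator cores and count how many weight elements each core
# would process. Shapes and M are placeholders.
M = 4
weights = np.zeros((96, 128), dtype=np.float32)  # (out_features, in_features)

row_chunks = np.array_split(weights, M, axis=0)
elements_per_core = [chunk.size for chunk in row_chunks]
print(elements_per_core)  # [3072, 3072, 3072, 3072]
```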
Here are some ideas:
- Use a custom operator:
  - for each layer that you want to accelerate, create a custom operator
  - modify the NN model, replacing the original layer with a custom operator that is functionally equivalent to the original one
  - retrain the NN model
  - convert the model to a TensorFlow Lite model, adding the new custom operator (and possibly passing additional info to the new custom layer through the `custom_options` field of the `Operator` table of the flatbuffer); see the conversion sketch right after this list
  - provide an implementation of the operator to the interpreter that reads the `custom_options` field and executes the layer accordingly
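A rough sketch of what the conversion step of this route could look like. The model here is only a placeholder (a plain `Dense` layer stands in for the custom layer), and `allow_custom_ops` simply tells the converter not to reject ops it does not recognize:

```python
import tensorflow as tf

# Placeholder model: in the real flow this would be the retrained network in
# which the original linear layer was replaced by the custom operator.
model = tf.keras.Sequential([tf.keras.layers.Dense(96)])
model.build(input_shape=(None, 128))

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.allow_custom_ops = True  # keep custom ops instead of failing the conversion
tflite_model = converter.convert()

with open("mlp_custom.tflite", "wb") as f:
    f.write(tflite_model)
```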
- Keep the same (builtin) operator and use a custom conversion tool that adds information in the `custom_options` field. In this way I can avoid the first three steps of the previous list and do the following:
  - convert the model to a TensorFlow Lite model as usual, then pass the additional info to the existing layer by writing it into the `custom_options` field of its `Operator` in the flatbuffer, e.g. like this:
```python
import numpy as np
import tensorflow as tf
from tensorflow.lite.python import schema_py_generated as schema_fb


def load_model(save_path: str):
    with open(save_path, "rb") as f:
        return f.read()


tflite_quantized = load_model("models/mlp_int8.tflite")
aModel = schema_fb.ModelT.InitFromPackedBuf(tflite_quantized, 0)


def BuiltinCodeToName(code):
    """Converts a builtin op code enum to a readable name."""
    for name, value in schema_fb.BuiltinOperator.__dict__.items():
        if value == code:
            return name
    return None


# List the operators of the first subgraph to find the one to annotate.
for i, op in enumerate(aModel.subgraphs[0].operators):
    op_code = aModel.operatorCodes[op.opcodeIndex].builtinCode
    print(f"[{i}] : {BuiltinCodeToName(op_code)} ({op_code})")

### FROM HERE
custo = np.ones(10, dtype=np.uint8)
aModel.subgraphs[0].operators[4].customOptions = custo
### TO HERE

from tflite_support import flatbuffers

b = flatbuffers.Builder(0)
b.Finish(aModel.Pack(b))
model_buff = b.Output()


def save_tflite_model(tflite_model, save_dir, model_name):
    """Save the converted tflite model.

    Args:
        tflite_model (binary): the converted model in serialized format.
        save_dir (str): the save directory
        model_name (str): model name to be saved
    """
    import os
    if not os.path.exists(save_dir):
        os.makedirs(save_dir)
    save_path = os.path.join(save_dir, model_name)
    with open(save_path, "wb") as f:
        f.write(tflite_model)


save_tflite_model(model_buff, "MLP_models", "mlp_int8.tflite")
```
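As a follow-up to the script, a sketch of what a more meaningful payload than `np.ones(10)` could look like, plus a read-back check of the annotated file. The int32 layout of the per-core counts is just an assumption that the accelerated kernel would have to mirror when parsing `custom_options` (TFLite's own custom ops usually encode `custom_options` as a FlexBuffer, which would work here as well):

```python
import numpy as np
from tensorflow.lite.python import schema_py_generated as schema_fb

# A payload carrying the per-core element counts, packed as little-endian
# int32 and viewed as uint8 (customOptions is a byte array). The layout is
# an assumption; the kernel reading custom_options must agree on it.
per_core_elements = np.asarray([3072, 3072, 3072, 3072], dtype="<i4")
blob = np.frombuffer(per_core_elements.tobytes(), dtype=np.uint8)

# Read-back check: re-open the file written above and confirm that the
# annotated operator now carries the custom_options bytes.
with open("MLP_models/mlp_int8.tflite", "rb") as f:
    annotated = schema_fb.ModelT.InitFromPackedBuf(f.read(), 0)

print(list(annotated.subgraphs[0].operators[4].customOptions))
```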
  - I will have to modify the behaviour of the micro interpreter so that builtin operators are allowed to carry `custom_options`
  - provide a suitable implementation of the operator that uses the operator's `custom_options` to do something smart
With the second approach I see the following advantages:
- I do not have to modify the model: I can just write a Python script that looks at the model and "annotates the layers with custom_options" when needed.
- I keep compatibility with the original model and can switch between accelerated and non-accelerated kernels (e.g. in certain cases, due to the fixed cost of starting the dedicated hardware module, the reference implementation or another operator implementation is better suited); a quick check of this is sketched after this list.
- I lower the complexity of accelerating operators
- No need to retrain the model from scratch
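Regarding the compatibility point, a quick sanity check (file name taken from the script above): the annotated model should still run on the stock TFLite interpreter, since the reference runtime ignores `custom_options` on builtin operators, which is also why the micro interpreter change mentioned earlier is needed on the accelerated path:

```python
import numpy as np
import tensorflow as tf

# Run the annotated model with the stock interpreter; the extra
# custom_options bytes on the builtin operator are simply ignored.
interpreter = tf.lite.Interpreter(model_path="MLP_models/mlp_int8.tflite")
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
interpreter.set_tensor(inp["index"], np.zeros(inp["shape"], dtype=inp["dtype"]))
interpreter.invoke()

out = interpreter.get_output_details()[0]
print(interpreter.get_tensor(out["index"]))
```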