Choosing the layer to retrieve embeddings from: allocate_tensors error when using TensorFlow 2.17.0
**Describe the bug** When I use the tflite `Interpreter` object, I used to be able to get both embeddings and class logits in the same way as done in this code base:
**To Reproduce** I have TensorFlow 2.17. I also tried TensorFlow 2.15 and it worked fine, but I need TensorFlow 2.16 or 2.17 for compatibility with my CUDA version 12.3, per this table: https://www.tensorflow.org/install/source#gpu
```python
from tensorflow import lite as tflite
import numpy as np

# first download checkpoint https://github.com/kahst/BirdNET-Analyzer/blob/v1.5.0/birdnet_analyzer/checkpoints/V2.4/BirdNET_GLOBAL_6K_V2.4_Model_FP16.tflite
tf_model = tflite.Interpreter(
    model_path=model_path, num_threads=num_tflite_threads
)
input_details = tf_model.get_input_details()[0]
input_layer_idx = input_details["index"]

# choose which layer should be retrieved
output_details = tf_model.get_output_details()[0]
embedding_idx = output_details["index"] - 1
class_logits_idx = output_details["index"]

# forward pass
# reshape the expected input tensor of the TF model to include the batch dimension
tf_model.resize_tensor_input(
    input_layer_idx, [len(batch_data), *batch_data[0].shape]
)
tf_model.allocate_tensors()
tf_model.set_tensor(input_layer_idx, np.float32(batch_data))
tf_model.invoke()  # forward pass

# get results
batch_logits = tf_model.get_tensor(class_logits_idx)  # no error, returns logits
batch_embeddings = tf_model.get_tensor(embedding_idx)  # ValueError: Tensor data is null. Run allocate_tensors() first
```
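As an aside, assuming the embedding tensor always sits at `output index - 1` is fragile across converter versions. A minimal sketch of locating a tensor by name via `get_tensor_details()` instead; the name fragment `"GLOBAL_AVG_POOL"` is purely illustrative, not taken from the actual BirdNET model, so inspect your model's tensor details for the real name:

```python
# Hedged sketch: locate an intermediate tensor by a fragment of its name
# instead of assuming it is at output index - 1. "GLOBAL_AVG_POOL" below
# is a placeholder name for illustration only.

def find_tensor_index(tensor_details, name_fragment):
    """Return the index of the first tensor whose name contains name_fragment."""
    for detail in tensor_details:
        if name_fragment in detail["name"]:
            return detail["index"]
    raise KeyError(f"no tensor name contains {name_fragment!r}")

# With a live interpreter this would be used as:
#   embedding_idx = find_tensor_index(tf_model.get_tensor_details(), "GLOBAL_AVG_POOL")
```

`get_tensor_details()` returns one dict per tensor (with `"name"`, `"index"`, `"shape"`, …), so this also lets you sanity-check the shape of the tensor you picked.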
**Expected behavior** The embedding (feature) values can be retrieved from the `output_details['index'] - 1` layer using `tf_model.get_tensor()`.
How do I need to modify the code so that the tflite interpreter can be used with TensorFlow 2.17? Thanks!
You answered your own question: we currently only support TF 2.15.x. There are breaking changes in TF 2.16 that affect our model loading and saving. So yes, changes are needed to support TF 2.17, and it is on our roadmap for next year, but if you implement a solution, feel free to create a PR.
I would like to help solve the problem, as I'm having the same issue while extending the API, but downgrading my Linux development environment from Python 3.12, which doesn't support this outdated TF version, seems quite complicated. Unfortunately, I have no idea where to start, so perhaps we should meet next year and talk about a solution, Josef.
Happy new year. :)
I'm back to work and played around with TensorFlow for a while. Since we only use TensorFlow Lite, I did some research and discovered it has been renamed to AI Edge LiteRT [1]. Unfortunately, only the nightly build [2] supports Python 3.12, but I was able to make it run, and basically nothing changed, i.e. the error still persists. By digging through the API documentation, I finally found a possible solution:
Just set the `Interpreter` parameter [3] `experimental_preserve_all_tensors` to `True` and the embeddings can be extracted as before.
Since this parameter has existed (with a default value of `False`) since TensorFlow 2.5 [4], I'm wondering why it worked until 2.15. There may be other incompatibilities, but it works for me, so give it a try.
[1] https://ai.google.dev/edge/litert
[2] https://pypi.org/project/ai-edge-litert-nightly/
[3] https://www.tensorflow.org/api_docs/python/tf/lite/Interpreter
[4] https://www.tensorflow.org/versions/r2.5/api_docs/python/tf/lite/Interpreter
> Just set the `Interpreter` parameter [3] `experimental_preserve_all_tensors` to `True` and the embeddings can be extracted as before.
Thanks! This works for me in TensorFlow 2.17 on Linux.
For future reference, in context this looks like:
```python
tf_model = tflite.Interpreter(
    model_path=model_path,
    num_threads=num_tflite_threads,
    experimental_preserve_all_tensors=True,
)
tf_model.allocate_tensors()
```
Recently, I did some more research and tested different Python packages (tflite-runtime, tensorflow-cpu and ai-edge-litert) in different versions to finally answer my open questions from the beginning of the year.
The documentation of the previously found `Interpreter` parameter `experimental_preserve_all_tensors` hints that it also switches the operation resolver from `OpResolverType.BUILTIN` (which the default `OpResolverType.AUTO` resolves to) to `OpResolverType.BUILTIN_WITHOUT_DEFAULT_DELEGATES`, to avoid the delegate invalidating the memory of intermediate tensors.
So I tried setting the `Interpreter` parameter `experimental_op_resolver_type` to `OpResolverType.BUILTIN_WITHOUT_DEFAULT_DELEGATES` while leaving the other parameter `experimental_preserve_all_tensors` at its default value `False`. This makes it possible to retrieve the embeddings in all tested versions, and it saves memory compared to the other option.
The reason old versions of the TensorFlow Lite API can retrieve the embeddings without deactivating the default delegate via this parameter is that those versions don't have a delegate that works with our BirdNET model.
Taking your example code, this looks like:
```python
import ai_edge_litert.interpreter as tflite

tf_model = tflite.Interpreter(
    model_path=model_path,
    num_threads=num_tflite_threads,
    experimental_op_resolver_type=tflite.OpResolverType.BUILTIN_WITHOUT_DEFAULT_DELEGATES,
)
tf_model.allocate_tensors()
```
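For completeness, the batched forward pass from the original report can be wrapped in a small helper. This is only a sketch under the assumptions above: it works with any interpreter built with `experimental_op_resolver_type=BUILTIN_WITHOUT_DEFAULT_DELEGATES` (or `experimental_preserve_all_tensors=True`), since `get_tensor()` on the intermediate embedding tensor is only valid then; the function itself is interpreter-agnostic and uses only the standard `Interpreter` methods already shown in this thread.

```python
import numpy as np

def run_batch(interpreter, batch_data, embedding_idx, logits_idx):
    """Resize for the batch, run inference, and return (embeddings, logits).

    Assumes the interpreter preserves intermediate tensors (see the
    resolver / preserve_all_tensors discussion above); otherwise
    get_tensor(embedding_idx) raises "Tensor data is null".
    """
    input_idx = interpreter.get_input_details()[0]["index"]
    batch = np.asarray(batch_data, dtype=np.float32)
    # include the batch dimension in the expected input tensor shape
    interpreter.resize_tensor_input(input_idx, list(batch.shape))
    interpreter.allocate_tensors()
    interpreter.set_tensor(input_idx, batch)
    interpreter.invoke()
    return interpreter.get_tensor(embedding_idx), interpreter.get_tensor(logits_idx)
```

Keeping the resize/allocate/invoke sequence in one place avoids the easy mistake of calling `set_tensor` before `allocate_tensors` after a resize.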
Thanks, this works on my end.