tflite-micro
MNIST_LSTM example case does not generate proper operators
The MNIST LSTM example no longer converts to a TFLite model containing the fused UnidirectionalSequenceLSTM operator for TFLite Micro. Instead, the converter generates a large graph of many smaller operators.
Versions:
- Python 3.12.9 (also tested with 3.11.1)
- TensorFlow 2.18.0 (also tested with 2.16.1)
- Numpy 2.2.2
- Absl 2.1.0
- (Windows 10 22H2 and Ubuntu 20.04 LTS through Google Colab)
The problem arises when running the train.py script with newer versions of TensorFlow. Whereas older versions (Python 3.10.x and TensorFlow 2.15.x) would generate the correct TFLite Micro operators, the newer versions output a whole graph of operators (see images below). This change means many more operators must be resolved than before (additional ops: Concatenation, Gather, Less, LogicalAnd, Logistic, Mul, Slice, Split, Tanh and While), which also makes the final binary to be deployed larger.
| Original Graph | New Graph |
|---|---|
| (screenshot: single fused UnidirectionalSequenceLSTM graph) | (screenshot: expanded graph with many operators) |
According to the new LiteRT RNN conversion page, the TFLite converter should "provide native support for standard TensorFlow RNN APIs like Keras LSTM". This sadly does not seem to work at the moment.
Is there any TensorFlow syntax missing from the example code, or can the example be updated to pin the required library and Python versions? Alternatively, could the example code be updated to support the newer versions of the TensorFlow libraries and Python? Thanks in advance for any help!
After trying many combinations of Python and library versions, the problem appears to start with TensorFlow 2.16.0. The combinations of Python 3.10.x/3.11.x with TensorFlow 2.15.0/2.15.1 do produce the expected model structure.
The release notes for TensorFlow 2.16.0 do not list any breaking changes to the TensorFlow Lite converter.
"Keras 3 will be the default Keras version." is the breaking change.
Adding this to the train.py file, before TensorFlow is imported, will use Keras 2:
import os; os.environ["TF_USE_LEGACY_KERAS"] = "1"
These are my requirement settings:
tensorflow==2.16.1
tf-keras>=2.16.0,<2.17.0
... and this is the graph:
@choas Thanks for the reply and suggested fix!
I hadn't spotted that change yet, but it explains a lot.
The updated code should then include (snippet from LSTM example train.py):
# [...]
import os
# Compatibility: Keras 3 is the default from TensorFlow 2.16 onwards; this reverts to Keras 2
os.environ["TF_USE_LEGACY_KERAS"] = "1"
from absl import app
from absl import flags
from absl import logging
import numpy as np
import tensorflow as tf
# [...]
This requires the tf-keras package to be installed.
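A quick sanity check that the legacy Keras is actually being picked up (assumption: this runs before any other TensorFlow import in the process, and tf-keras is installed):

```python
import os

# Must be set before the first TensorFlow import to take effect.
os.environ["TF_USE_LEGACY_KERAS"] = "1"

import tensorflow as tf

# With tf-keras active, the reported Keras version should start with "2.";
# without the env var, TF 2.16+ reports a Keras 3 version instead.
print(tf.keras.__version__)
```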
Now I just need some time to see whether the whole script can be updated to the new Keras 3 API, as that would fix the issue properly.
[internal notes: script update for legacy Keras -or- test training with concrete function conversion instead of Keras TOCO conversion]
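For the second option in the notes above, a hedged sketch of what conversion via a concrete function could look like (model shape and function name are illustrative; whether this route restores the fused UNIDIRECTIONAL_SEQUENCE_LSTM op under Keras 3 is exactly what still needs testing):

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28)),
    tf.keras.layers.LSTM(20),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Wrap inference in a tf.function with a fixed input signature, then
# convert that concrete function instead of the Keras model directly.
@tf.function(input_signature=[tf.TensorSpec([1, 28, 28], tf.float32)])
def infer(x):
    return model(x)

converter = tf.lite.TFLiteConverter.from_concrete_functions(
    [infer.get_concrete_function()], model)
tflite_model = converter.convert()
```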