LSTM Size Constraints
Description
It seems like there is a limit on supported LSTM sizes. I have a model that uses an LSTM on MobileNetV2 features extracted from video frames. The LSTM input shape is (5, 1280). The model converts to TFLite and quantizes without problems, but the edgetpu_compiler fails without reporting a reason.
A toy example:
import numpy as np
import tensorflow as tf

X = 100  # feature dimension of the LSTM input

model_input = tf.keras.Input((5, X), dtype=tf.float32)
lstm = tf.keras.layers.LSTM(32)
dense = tf.keras.layers.Dense(3, activation='softmax')
x = lstm(model_input)
x = dense(x)
model = tf.keras.Model(model_input, x)
# Fix the batch dimension to 1 before conversion
model.input.set_shape((1,) + model.input.shape[1:])
model.summary()

def representative_dataset():
    # Random calibration samples for post-training quantization
    for i in range(10):
        yield [np.random.random((1, 5, X)).astype(np.float32)]

# Full-integer quantization with uint8 input and output
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
tflite_model = converter.convert()

with open('./model.tflite', 'wb') as f:
    f.write(tflite_model)
When the value of X is below 900 (ish), the edgetpu compiler succeeds. Above 1000, the compiler fails.
Is there a size limit for LSTM? What is the best next step forward?
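To narrow down where between 900 and 1000 the compiler starts failing, the value of X can be bisected. The following is a minimal sketch, not a tested tool: `largest_passing` and `compiles_on_edgetpu` are hypothetical helper names, the search assumes there is a single pass/fail boundary, and failure is detected by looking for the "Compilation failed!" line that appears in the compiler output for this model.

```python
import subprocess
from typing import Callable


def largest_passing(lo: int, hi: int, passes: Callable[[int], bool]) -> int:
    """Return the largest x in [lo, hi) for which passes(x) is True.

    Assumes passes(lo) is True, passes(hi) is False, and that there is a
    single pass/fail boundary in between (an assumption, not a documented
    property of the compiler).
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if passes(mid):
            lo = mid
        else:
            hi = mid
    return lo


def compiles_on_edgetpu(tflite_path: str) -> bool:
    """Run edgetpu_compiler on a model and check its output for the failure banner.

    Hypothetical helper: it relies on the "Compilation failed" text seen in
    the log for this issue rather than on the process exit code.
    """
    result = subprocess.run(
        ["edgetpu_compiler", tflite_path],
        capture_output=True,
        text=True,
    )
    return "Compilation failed" not in (result.stdout + result.stderr)
```

A call would look like `largest_passing(900, 1000, lambda x: compiles_on_edgetpu(build_model(x)))`, where `build_model` is a hypothetical wrapper around the toy example that writes the quantized model for a given X and returns its path.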
Output and edgetpu compiler version:
➜ edgetpu_compiler model.tflite
Edge TPU Compiler version 16.0.384591198
Started a compilation timeout timer of 180 seconds.
Compilation child process completed within timeout period.
Compilation failed!
Issue Type: Bug
Operating System: Ubuntu
Coral Device: Dev Board
Other Devices: No response
Programming Language: Python 3.9
Relevant Log Output: No response
Hello @gworkman, yes, the LSTM operation seems to have a size limitation.
Please try one of the options below:
- Reduce the model input size.
- Map the LSTM op to the CPU and the other ops to the TPU using the intermediate tensors flag. In this case, please compare the latency between the uncompiled TFLite model and the Edge TPU model.
! edgetpu_compiler -s -a -i "tfl.quantize" model.tflite
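The latency comparison suggested for the second option could be sketched as below. This is an illustration under assumptions rather than an official benchmark: `median_latency_ms` and `make_invoke` are hypothetical helper names, the input is an all-zero tensor, and `libedgetpu.so.1` is the Edge TPU delegate library name on Linux.

```python
import statistics
import time
from typing import Callable


def median_latency_ms(invoke: Callable[[], None], warmup: int = 5, runs: int = 50) -> float:
    """Return the median wall-clock time of invoke() in milliseconds."""
    for _ in range(warmup):
        invoke()  # warm-up calls are not timed
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        invoke()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def make_invoke(model_path: str, use_edgetpu: bool) -> Callable[[], None]:
    """Build an interpreter for model_path and return its invoke method."""
    # Imported lazily so the timing helper can be used without these packages.
    import numpy as np
    from tflite_runtime.interpreter import Interpreter, load_delegate

    delegates = [load_delegate("libedgetpu.so.1")] if use_edgetpu else []
    interpreter = Interpreter(model_path=model_path, experimental_delegates=delegates)
    interpreter.allocate_tensors()
    detail = interpreter.get_input_details()[0]
    # All-zero dummy input with the model's own input shape and dtype
    dummy = np.zeros(tuple(detail["shape"]), dtype=detail["dtype"])
    interpreter.set_tensor(detail["index"], dummy)
    return interpreter.invoke
```

On the device, `median_latency_ms(make_invoke("model.tflite", False))` can then be compared with `median_latency_ms(make_invoke("model_edgetpu.tflite", True))`, assuming the compiler wrote its output under the default `model_edgetpu.tflite` name.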
Thanks for the quick response! I went for option 1 here, since that made more sense for my model. It would help others who come across this limitation if the compiler printed an error message, and maybe if the docs mentioned the limitation. I'd be happy to provide a quick PR, if you can point me to where I should take a look.
Thanks again! 😄
Closing this issue, as we haven't found any concrete limits on the input shape or the number of LSTM units that we could document. Thanks!