LSTM Size Constraints

Open gworkman opened this issue 3 years ago • 2 comments

Description

It seems like there is a limit on supported LSTM sizes. I have a model that uses an LSTM on MobileNetV2 features extracted from video frames. The LSTM input shape is (5, 1280). The model converts to TFLite and quantizes without problems, but edgetpu_compiler fails without reporting any reason.

A toy example:

import numpy as np
import tensorflow as tf

X = 100  # number of features per timestep fed to the LSTM

# Small classifier: LSTM over 5 timesteps of X features, then a softmax layer
model_input = tf.keras.Input((5, X), dtype=tf.float32)
lstm = tf.keras.layers.LSTM(32)
dense = tf.keras.layers.Dense(3, activation='softmax')

x = lstm(model_input)
x = dense(x)

model = tf.keras.Model(model_input, x)

# Fix the batch dimension to 1 so the converted model has static shapes
model.input.set_shape((1,) + model.input.shape[1:])
model.summary()

# Random calibration samples for post-training quantization
def representative_dataset():
    for _ in range(10):
        yield [np.random.random((1, 5, X)).astype(np.float32)]

# Full-integer quantization with uint8 input and output
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
tflite_model = converter.convert()

with open('./model.tflite', 'wb') as f:
    f.write(tflite_model)

When X is below roughly 900, the Edge TPU compiler succeeds. When X is above 1000, the compiler fails.
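To put the sizes mentioned here on a common scale, the sketch below computes the parameter count of a standard Keras LSTM layer (four gates, each with an input kernel, a recurrent kernel, and a bias) for a few values of X with the 32 units used in the toy example. The helper name `lstm_param_count` is made up for illustration, and whether the compiler failure is actually tied to weight size is an assumption, not something confirmed in this thread.

```python
def lstm_param_count(input_dim: int, units: int, use_bias: bool = True) -> int:
    """Trainable parameters of a standard Keras LSTM layer.

    Each of the four gates has an input kernel (input_dim x units),
    a recurrent kernel (units x units) and, optionally, a bias (units).
    """
    per_gate = input_dim * units + units * units + (units if use_bias else 0)
    return 4 * per_gate


def describe(x: int, units: int = 32) -> str:
    """One summary line for a given feature size X."""
    params = lstm_param_count(x, units)
    # Rough estimate only: assumes about one byte per weight after int8
    # quantization and ignores that biases are stored at higher precision.
    return f"X={x}: {params} LSTM parameters (~{params / 1024:.0f} KiB as int8)"


if __name__ == "__main__":
    # Sizes taken from this issue: a passing case, the approximate
    # boundary region, and the original 1280-feature model.
    for x in (100, 900, 1000, 1280):
        print(describe(x))
```

The count grows linearly in X, so the step from 900 to 1000 features adds 12,800 parameters with 32 units. That gives a concrete number to compare across models, but it does not by itself explain the failure.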

Is there a size limit for LSTM? What is the best next step forward?

Output and edgetpu compiler version:

➜ edgetpu_compiler model.tflite
Edge TPU Compiler version 16.0.384591198
Started a compilation timeout timer of 180 seconds.
Compilation child process completed within timeout period.
Compilation failed!
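Because the compiler prints only "Compilation failed!", one way to narrow down the failing size is to script the compile step and bisect over X. The sketch below is a rough outline under stated assumptions: `compile_failed`, `run_edgetpu_compiler` and `largest_passing` are hypothetical helper names, the failure check relies on the "Compilation failed!" line shown in the output above, and the bisection assumes that every size below the threshold compiles and every size above it fails, which this thread does not confirm.

```python
import subprocess
from typing import Callable


def compile_failed(compiler_output: str) -> bool:
    """True if the output contains the failure line shown in this issue."""
    return "Compilation failed!" in compiler_output


def run_edgetpu_compiler(model_path: str) -> bool:
    """Run edgetpu_compiler on a .tflite file and report success.

    Assumption: a failed compile either returns a non-zero exit code or
    prints "Compilation failed!"; both are checked to be safe.
    """
    result = subprocess.run(
        ["edgetpu_compiler", model_path],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0 and not compile_failed(result.stdout + result.stderr)


def largest_passing(lo: int, hi: int, compiles: Callable[[int], bool]) -> int:
    """Largest size that compiles, given compiles(lo) is True and compiles(hi) is False.

    Plain bisection; only meaningful if success is monotonic in the size.
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if compiles(mid):
            lo = mid
        else:
            hi = mid
    return lo


# Example wiring (export_model is a hypothetical function that builds the toy
# model for a given X, writes it to disk and returns the .tflite path):
#
#   threshold = largest_passing(
#       900, 1000, lambda x: run_edgetpu_compiler(export_model(x)))
```

Each probe costs one model conversion plus one compile, so bisecting between 900 and 1000 needs about seven runs rather than a hundred.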

Issue Type

Bug

Operating System

Ubuntu

Coral Device

Dev Board

Other Devices

No response

Programming Language

Python 3.9

Relevant Log Output

No response

gworkman (May 28 '22 03:05)

Hello @gworkman. Yes, the LSTM operation seems to have a size limitation.

Please try one of the options below:

  1. Reduce the model input size.
  2. Map the LSTM op to the CPU and the other ops to the Edge TPU using the intermediate tensors flag. In this case, please compare the latency between the uncompiled TFLite model and the Edge TPU model.

! edgetpu_compiler -s -a -i "tfl.quantize" model.tflite
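
For the latency comparison in option 2, a minimal timing sketch could look like the following. It assumes the `tflite_runtime` package and the Edge TPU runtime (`libedgetpu.so.1` on Linux) are installed, and that the compiled model was written as `model_edgetpu.tflite`. The helper names `median_latency_ms`, `make_interpreter` and `compare` are illustrative, not part of any Coral API.

```python
import time
from statistics import median
from typing import Callable


def median_latency_ms(invoke: Callable[[], None], runs: int = 50, warmup: int = 5) -> float:
    """Median wall-clock time of invoke() in milliseconds, after warm-up calls."""
    for _ in range(warmup):
        invoke()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        invoke()
        samples.append((time.perf_counter() - start) * 1000.0)
    return median(samples)


def make_interpreter(model_path: str, use_edgetpu: bool):
    """Create a TFLite interpreter, optionally with the Edge TPU delegate."""
    # Imported lazily so the timing helper can be used without tflite_runtime.
    from tflite_runtime.interpreter import Interpreter, load_delegate

    delegates = [load_delegate("libedgetpu.so.1")] if use_edgetpu else None
    interpreter = Interpreter(model_path=model_path, experimental_delegates=delegates)
    interpreter.allocate_tensors()
    return interpreter


def compare(cpu_model: str = "model.tflite", tpu_model: str = "model_edgetpu.tflite") -> None:
    """Print median latency for the uncompiled and the compiled model."""
    cpu = make_interpreter(cpu_model, use_edgetpu=False)
    tpu = make_interpreter(tpu_model, use_edgetpu=True)
    # Input tensors are left at their initial contents; that is enough for timing.
    print(f"CPU only : {median_latency_ms(cpu.invoke):.2f} ms")
    print(f"Edge TPU : {median_latency_ms(tpu.invoke):.2f} ms")
```

Calling `compare()` on the device prints both medians. If the LSTM stays on the CPU and dominates the runtime, the two numbers may end up close.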

hjonnala (May 30 '22 04:05)

Thanks for the quick response! I went for option 1 here, since that made more sense for my model, but it would be helpful for others who come across this limitation to add some error messages to the output, and maybe a mention of the limitation in the docs. I'd be happy to provide a quick PR, if you can point me to where I should take a look.

Thanks again! 😄

gworkman (May 30 '22 19:05)

Closing this issue, as we haven't found any concrete limits on the input shape or the number of LSTM units that we could document. Thanks!

hjonnala (Sep 12 '22 16:09)
