coremltools
coremltools does not export LSTM state?
❓Question
I am trying to convert an LSTM-based model from TensorFlow to Core ML to be used in a macOS application. Despite looking at hundreds of examples in the documentation and spending two days going through every reference to a similar problem, I can't find a solution. Here is a compact example that replicates the problem.
First we define a very simple LSTM model and export it to CoreML using coremltools:
import os
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow import keras
import coremltools
model = keras.Sequential()
model.add(layers.Input(dtype="float32", batch_input_shape=(1,1,1)))
model.add(layers.LSTM(3))
model.add(layers.Dense(1))
model.summary()
coreml_model_file = './test.mlpackage'
mlmodel = coremltools.convert(model,source="tensorflow")
mlmodel.save(coreml_model_file)
The tensorflow version is 2.13.0, the coremltools version is 6.3.0, and the Python version is 3.11. I also tried tensorflow 2.12.0; it does not change anything.
Dragging the .mlpackage into Xcode imports it without errors, but I get the following model specification: [screenshot of my model]
The documentation (https://developer.apple.com/documentation/coreml/making_predictions_with_a_sequence_of_inputs?language=objc) suggests that the states of the LSTM layer should be exported as additional inputs and outputs, so the network can be applied to a new arbitrary sequence. Whatever I do, I cannot get stateIn/stateOut to appear in the Core ML model.
What am I doing wrong? How do I export an LSTM model properly?
I tried different versions of tensorflow 2 and coremltools. A stateful LSTM does not export at all, and changing the input shape to (1, None, 1) does not help either.
I would really appreciate some help.
@vyshemirsky Core ML conversion preserves the I/O parity of your original LSTM model; i.e., if your original model has 1 input and 1 output, so does the converted Core ML model.
I ran the following code from your example:
import os
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow import keras
import coremltools
model = keras.Sequential()
model.add(layers.Input(dtype="float32", batch_input_shape=(1,1,1)))
model.add(layers.LSTM(3))
model.add(layers.Dense(1))
model.summary()
print(model.inputs)
print(model.outputs)
and I see the i/o for the keras model is 1 input and 1 output:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm (LSTM) (1, 3) 60
dense (Dense) (1, 1) 4
=================================================================
Total params: 64
Trainable params: 64
Non-trainable params: 0
_________________________________________________________________
[<KerasTensor: shape=(1, 1, 1) dtype=float32 (created by layer 'input_1')>]
[<KerasTensor: shape=(1, 1) dtype=float32 (created by layer 'dense')>]
So I think what you are observing is the correct behavior.
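Given that parity rule, one way to get state in/out tensors in the converted model is to make the LSTM states explicit inputs and outputs of the Keras model itself, using the functional API with `return_state=True` and `initial_state`. A minimal sketch (the tensor names and the unit count of 3 are illustrative, and I have not verified this against every coremltools version):

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

units = 3

# The sequence input plus the hidden and cell states as explicit model inputs
x_in = layers.Input(shape=(1, 1), batch_size=1, dtype="float32", name="x")
h_in = layers.Input(shape=(units,), batch_size=1, dtype="float32", name="h_in")
c_in = layers.Input(shape=(units,), batch_size=1, dtype="float32", name="c_in")

# return_state=True exposes the updated (h, c) alongside the layer output
lstm_out, h_out, c_out = layers.LSTM(units, return_state=True)(
    x_in, initial_state=[h_in, c_in]
)
y = layers.Dense(1)(lstm_out)

model = keras.Model(inputs=[x_in, h_in, c_in], outputs=[y, h_out, c_out])
print(model.inputs)   # three inputs: x, h_in, c_in
print(model.outputs)  # three outputs: y, h_out, c_out
```

Since this Keras model has three inputs and three outputs, converting it with `coremltools.convert(model, source="tensorflow")` should, by the same parity rule, produce a Core ML model whose state tensors you can feed back in between calls.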