
Unable to use LSTM/GRU/SimpleRNN with mask of dynamic shape in OnnxRuntime/TensorRT

AndreyOrb opened this issue · 0 comments

Describe the issue

I have the following TF/Keras model:

def build_model():
    image_input = Input(shape=(None, 1), name='image', dtype='float32')
    img_width_input = Input(shape=(), name='width', dtype='int32')

    max_width = tf.reduce_max(img_width_input)
    mask = tf.sequence_mask(img_width_input, max_width)

    lstm_out = LSTM(3)(image_input, mask=mask)

    _model = Model(inputs=[image_input, img_width_input], outputs=lstm_out)

    return _model
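For reference, `tf.sequence_mask(lengths, maxlen)` produces a boolean mask where entry `[i, j]` is true iff `j < lengths[i]`. A minimal pure-Python sketch of that semantics (an illustration, not the TF implementation):

```python
def sequence_mask(lengths, maxlen=None):
    """Pure-Python analogue of tf.sequence_mask: mask[i][j] is True iff j < lengths[i]."""
    if maxlen is None:
        # Dynamic maxlen, like tf.reduce_max(img_width_input) in the model above.
        maxlen = max(lengths)
    return [[j < n for j in range(maxlen)] for n in lengths]


# With dynamic maxlen the mask width follows the longest sequence in the batch:
print(sequence_mask([2, 3]))     # [[True, True, False], [True, True, True]]
# With an explicit maxlen the mask width is fixed regardless of the batch:
print(sequence_mask([2, 3], 5))
```

With the dynamic form, the mask's second dimension is only known at run time, which is what the exported while-loop graph has to handle.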


I save it as a SavedModel and then convert it to ONNX format (opset 16) using tf2onnx.

It runs fine in ONNX Runtime with CUDAExecutionProvider, but returns unpadded values with TensorrtExecutionProvider.

When trying to check the model with trtexec, I get this error:

[07/25/2023-12:18:51] [E] Error[7]: [shapeMachine.cpp::nvinfer1::rt::ShapeMachineRoutine::executeContinuation::864] Error Code 7: Internal Error (while/TensorArrayV2Read_1/TensorListGetItem: cannot do non-empty gather from an empty axis Condition '<' violated: 0 >= 0. Instruction: CHECK_LESS 0 0.)

The issue disappears if I use an explicit value for the mask size.

def build_model():
    image_input = Input(shape=(None, 1), name='image', dtype='float32')
    img_width_input = Input(shape=(), name='width', dtype='int32')

    # max_width = tf.reduce_max(img_width_input)
    mask = tf.sequence_mask(img_width_input, 5)

    lstm_out = LSTM(3)(image_input, mask=mask)

    _model = Model(inputs=[image_input, img_width_input], outputs=lstm_out)

    return _model


From my understanding, the LSTM is exported as a while loop containing a Gather operator (named TensorListGetItem). That Gather op validates the shape of the mask and fails. Due to the nature of the project I must use a mask of dynamic shape, as the shape is calculated from the image size.

I also tried running the ONNX model itself with polygraphy and got this error:

onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Non-zero status code returned while running Loop node. Name:'StatefulPartitionedCall/model/lstm/PartitionedCall/while_loop' Status Message: Non-zero status code returned while running Gather node. Name:'while/TensorArrayV2Read_1/TensorListGetItem' Status Message: indices element out of data bounds, idx=0 must be within the inclusive range [0,-1]
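The bounds check in that message follows standard Gather semantics: a valid index must lie in the inclusive range [0, n-1] for an axis of length n, so an empty axis (n = 0) yields the impossible range [0, -1] and any index fails. A hypothetical sketch of that check (illustrative only, not the actual onnxruntime code):

```python
def gather_check(data_len, idx):
    """Validate a Gather index the way the error message describes:
    idx must be within the inclusive range [0, data_len - 1]."""
    if not (0 <= idx <= data_len - 1):
        raise IndexError(
            f"indices element out of data bounds, idx={idx} "
            f"must be within the inclusive range [0,{data_len - 1}]"
        )


gather_check(5, 0)  # fine: valid range is [0,4]
try:
    # Empty axis: valid range is [0,-1], mirroring the TensorListGetItem failure.
    gather_check(0, 0)
except IndexError as e:
    print(e)
```

This suggests the loop body sees a tensor list whose length was resolved to 0, i.e. the dynamic mask length was lost somewhere between the converter and the runtime.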

Exactly the same error is raised if I replace LSTM with GRU or SimpleRNN.

It could be related to:
https://github.com/microsoft/onnxruntime/issues/8504
https://github.com/onnx/tensorflow-onnx/issues/1635
https://github.com/huggingface/transformers/issues/4523

I created a similar issue in the onnxruntime repository, but it seems that the issue is actually with the converter: https://github.com/microsoft/onnxruntime/issues/16885

To reproduce

  1. Create a model and save it as a SavedModel:
from keras.models import Model
from keras.layers import Input, LSTM
import tensorflow as tf


def build_model():
    image_input = Input(shape=(None, 1), name='image', dtype='float32')
    img_width_input = Input(shape=(), name='width', dtype='int32')

    max_width = tf.reduce_max(img_width_input)
    mask = tf.sequence_mask(img_width_input, max_width)

    lstm_out = LSTM(3)(image_input, mask=mask)

    _model = Model(inputs=[image_input, img_width_input], outputs=lstm_out)

    return _model


model = build_model()


# Display model summary
# model.summary(line_length=130, expand_nested=True)

model.save('test_model.sm', overwrite=True)
  2. Convert to ONNX format:
  python -m tf2onnx.convert --saved-model test_model.sm --output test_model.onnx --opset 16

  3. Verify the model with trtexec:
  trtexec.exe --onnx=test_model.onnx

  4. Verify the model with polygraphy:
  polygraphy run --onnxrt test_model.onnx
  polygraphy run --trt test_model.onnx
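For context on what the two model inputs look like at inference time: 'image' is a batch padded to a common width and 'width' holds each sample's original length. A minimal stdlib sketch of assembling such a batch (the helper name and pad value are illustrative, not from tf2onnx or onnxruntime):

```python
def pad_batch(sequences, pad_value=0.0):
    """Pad variable-length 1-D sequences to a common width and record each
    original length -- the two inputs ('image' and 'width') the model expects."""
    widths = [len(s) for s in sequences]
    max_w = max(widths)
    padded = [list(s) + [pad_value] * (max_w - len(s)) for s in sequences]
    return padded, widths


batch, widths = pad_batch([[0.1, 0.2], [0.3, 0.4, 0.5]])
# batch  -> [[0.1, 0.2, 0.0], [0.3, 0.4, 0.5]]
# widths -> [2, 3]
```

The mask length the model derives via tf.reduce_max(widths) always equals the padded width here, which is why a fixed maxlen masks the same elements for any batch padded to that width.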

Urgency

I need to come up with a fix or a workaround for this issue by mid-August.

Platform

Windows

OS Version

10

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

Latest main branch: SHA-1: eeef15788804b0217d46ad6d1c2744585ad98d0c

ONNX Runtime API

C++

Architecture

X64

Execution Provider

Default CPU, CUDA, TensorRT

Execution Provider Library Version

CUDA 11.8, CUDNN 8.9, TensorRT 8.6

AndreyOrb avatar Aug 17 '23 19:08 AndreyOrb