
Issue while loading the model using TIS (Triton Inference Server): For the model to support batching the shape should have at least 1 dimension and the first dimension must be -1

Vaishnvi opened this issue 1 year ago • 2 comments

I exported the model using the following code:

```python
torch.onnx.export(
    crnn.module,                # use .module to unwrap the model
    example_input,              # model input (or a tuple for multiple inputs)
    "bhaasha_model.onnx",       # where to save the model (file or file-like object)
    export_params=True,         # store the trained parameter weights inside the model file
    verbose=True,
    opset_version=17,           # the ONNX opset version to export the model to
    do_constant_folding=True,   # whether to execute constant folding for optimization
    input_names=['input'],      # the model's input names
    output_names=['onnx::Shape_261', 'input.79', 'inp'],  # the model's output names
    dynamic_axes={              # variable-length axes
        'input': {0: 'batch_size'},
        'onnx::Shape_261': {0: 'batch_size'},
        'input.79': {0: 'batch_size'},
        'inp': {0: 'batch_size'},
    },
)
```
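One quick sanity check on an export call like this (a minimal standalone sketch, not torch internals): `torch.onnx.export` warns about and then ignores any `dynamic_axes` key that does not appear in `input_names`/`output_names`, which silently leaves that tensor's first dimension fixed at the example input's batch size. The names below are copied from the export call above; here every key matches, so the call itself looks consistent.

```python
# Verify that every dynamic_axes key names a declared input or output;
# an unmatched key would be ignored by torch.onnx.export (with a warning),
# leaving that tensor's batch dimension fixed in the exported graph.
input_names = ['input']
output_names = ['onnx::Shape_261', 'input.79', 'inp']
dynamic_axes = {
    'input': {0: 'batch_size'},
    'onnx::Shape_261': {0: 'batch_size'},
    'input.79': {0: 'batch_size'},
    'inp': {0: 'batch_size'},
}

unknown = [k for k in dynamic_axes if k not in input_names + output_names]
print(unknown)  # [] -> all dynamic_axes keys are valid I/O names
```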

The exported model works fine with onnxruntime.

However, I am facing an issue while loading it with TIS (Triton Inference Server).

I get this error:

```
failed to load 'bhaasha_ocr' version 1: Invalid argument: model 'bhaasha_ocr', tensor 'inp': for the model to support batching the shape should have at least 1 dimension and the first dimension must be -1; but shape expected by the model is [64,32,246]
```

This is very unusual, because I am not getting this error for the other two outputs, only for the third one. How is this possible?

config.pbtxt:

```
name: "bhaasha_ocr"
backend: "onnxruntime"
max_batch_size: 64

input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ 1, 96, 256 ]
  }
]

output [
  {
    name: "onnx::Shape_261"
    data_type: TYPE_FP32
    dims: [ 20, 2 ]
  },
  {
    name: "input.79"
    data_type: TYPE_FP32
    dims: [ 1, 96, 256 ]
  },
  {
    name: "inp"
    data_type: TYPE_FP32
    dims: [ 32, 246 ]
  }
]

dynamic_batching {
  preferred_batch_size: [ 2, 4, 8, 16, 32, 64 ]
}
```

How can I resolve this?
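For context, the check Triton is applying here can be sketched in plain Python: when `max_batch_size > 0`, every input and output in the model graph must report a variable (symbolic or -1) first dimension so the server can manage the batch. The shapes below are copied from the error and config above; the dictionary and helper are illustrative, not Triton's actual code. A common cause of exactly this symptom is an internal op (e.g. a `Reshape` with a hardcoded batch size) that re-fixes one output's first dimension even though `dynamic_axes` was specified correctly.

```python
# Sketch of the validation Triton performs when max_batch_size > 0:
# each tensor's shape must have >= 1 dimension and a variable first dim.
# Shapes taken from the reported model; "batch_size" marks a symbolic dim.
model_io = {
    "onnx::Shape_261": ["batch_size", 20, 2],
    "input.79": ["batch_size", 1, 96, 256],
    "inp": [64, 32, 246],   # fixed first dim -> triggers the error
}

def supports_batching(shape):
    # A first dim is batchable if it is symbolic (a string) or -1.
    return len(shape) >= 1 and (isinstance(shape[0], str) or shape[0] == -1)

for name, shape in model_io.items():
    if not supports_batching(shape):
        print(f"tensor '{name}': first dimension must be -1; model has {shape}")
```

Only `inp` fails the check, which matches the error message: the other two outputs kept their symbolic batch dimension, while `inp` was exported with a fixed shape of [64, 32, 246].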

Vaishnvi avatar Jul 22 '24 10:07 Vaishnvi

I even tried max_batch_size = 0; it still gives an error:

```
failed to load 'bhaasha_ocr' version 1: Invalid argument: model 'bhaasha_ocr', tensor 'inp': the model expects 3 dimensions (shape [64,32,246]) but the model configuration specifies 3 dimensions (shape [-1,32,246])
```

```
name: "bhaasha_ocr"
backend: "onnxruntime"
max_batch_size: 0

input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ -1, 1, 96, 256 ]
  }
]

output [
  {
    name: "onnx::Shape_261"
    data_type: TYPE_FP32
    dims: [ -1, 20, 2 ]
  },
  {
    name: "input.79"
    data_type: TYPE_FP32
    dims: [ -1, 1, 96, 256 ]
  },
  {
    name: "inp"
    data_type: TYPE_FP32
    dims: [ -1, 32, 246 ]
  }
]
```

Vaishnvi avatar Jul 22 '24 11:07 Vaishnvi

Hi @Vaishnvi, thanks for sharing such detailed info. Since this is an ONNX model, and the ORT backend supports full config auto-complete, can you try to load the model without any config.pbtxt? This should generate the I/O for the config directly from the model metadata and better help us understand what might be going wrong.
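A minimal sketch of that experiment (the repository path is an assumption; the model file just needs to be renamed to `model.onnx` inside a numbered version directory):

```shell
# Assumed model-repository layout; with no config.pbtxt present, the
# onnxruntime backend auto-completes the configuration from the model's
# own metadata:
#
#   model_repository/
#   └── bhaasha_ocr/
#       └── 1/
#           └── model.onnx
tritonserver --model-repository=/path/to/model_repository

# Inspect the auto-completed configuration via the HTTP endpoint
# (default port 8000) to see what shapes Triton derived for each tensor:
curl localhost:8000/v2/models/bhaasha_ocr/config
```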

rmccorm4 avatar Jul 31 '24 22:07 rmccorm4