Issue while loading the model using TIS (Triton Inference Server) : For the model to support batching the shape should have at least 1 dimension and the first dimension must be -1
I exported the model using the following call:
torch.onnx.export(crnn.module,             # use .module to unwrap the model
                  example_input,           # model input (or a tuple for multiple inputs)
                  "bhaasha_model.onnx",    # where to save the model (can be a file or file-like object)
                  export_params=True,      # store the trained parameter weights inside the model file
                  verbose=True,
                  opset_version=17,        # the ONNX opset version to export the model to
                  do_constant_folding=True,  # whether to execute constant folding for optimization
                  input_names=['input'],   # the model's input names
                  output_names=['onnx::Shape_261', 'input.79', 'inp'],  # the model's output names
                  dynamic_axes={'input': {0: 'batch_size'},  # variable-length axes
                                'onnx::Shape_261': {0: 'batch_size'},
                                'input.79': {0: 'batch_size'},
                                'inp': {0: 'batch_size'}})
The exported model works fine with onnxruntime.
However, I am facing an issue while loading the model in TIS (Triton Inference Server).
I get this error:
failed to load 'bhaasha_ocr' version 1: Invalid argument: model 'bhaasha_ocr', tensor 'inp': for the model to support batching the shape should have at least 1 dimension and the first dimension must be -1; but shape expected by the model is [64,32,246]
This is very unusual, because I am not facing this error for the other two outputs, only for the third one. How is this possible?
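A common cause of exactly this symptom (an assumption here, since the model source isn't shown): `dynamic_axes` only labels axes, it cannot make a `Reshape` with a constant target shape dynamic. If the layer producing `inp` reshapes with a hard-coded batch size, that 64 gets baked into the exported graph while the other outputs stay dynamic. A minimal sketch of the two patterns, with made-up layer names:

```python
import torch

# Hypothetical output head; only the reshape pattern matters.
class Head(torch.nn.Module):
    def forward(self, x):                      # x: (N, 32 * 246)
        # x.reshape(64, 32, 246)               # would bake N=64 into the ONNX graph
        return x.reshape(x.size(0), 32, 246)   # keeps the batch dim symbolic on export
        # (x.reshape(-1, 32, 246) also stays dynamic)

out = Head()(torch.zeros(2, 32 * 246))
print(tuple(out.shape))  # (2, 32, 246)
```

If the original model uses the hard-coded form, re-exporting after switching to a batch-size-agnostic reshape would let the `dynamic_axes` entry for `inp` take effect.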
config.pbtxt:
name: "bhaasha_ocr"
backend: "onnxruntime"
max_batch_size: 64
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ 1, 96, 256 ]
  }
]
output [
  {
    name: "onnx::Shape_261"
    data_type: TYPE_FP32
    dims: [ 20, 2 ]
  },
  {
    name: "input.79"
    data_type: TYPE_FP32
    dims: [ 1, 96, 256 ]
  },
  {
    name: "inp"
    data_type: TYPE_FP32
    dims: [ 32, 246 ]
  }
]
dynamic_batching { preferred_batch_size: [ 2, 4, 8, 16, 32, 64 ] }
How can I resolve this?
I even tried with max_batch_size = 0, and it still gives an error: failed to load 'bhaasha_ocr' version 1: Invalid argument: model 'bhaasha_ocr', tensor 'inp': the model expects 3 dimensions (shape [64,32,246]) but the model configuration specifies 3 dimensions (shape [-1,32,246])
name: "bhaasha_ocr"
backend: "onnxruntime"
max_batch_size: 0
input [
  { name: "input" data_type: TYPE_FP32 dims: [ -1, 1, 96, 256 ] }
]
output [
  { name: "onnx::Shape_261" data_type: TYPE_FP32 dims: [ -1, 20, 2 ] },
  { name: "input.79" data_type: TYPE_FP32 dims: [ -1, 1, 96, 256 ] },
  { name: "inp" data_type: TYPE_FP32 dims: [ -1, 32, 246 ] }
]
Hi @Vaishnvi, thanks for sharing such detailed info. Since this is an ONNX model, and the ORT backend supports full config auto-complete, can you try to load the model without any config.pbtxt? This should generate the I/O for the config directly from the model metadata and better help us understand what might be going wrong.
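For anyone following along, a minimal way to try that (paths are illustrative, and the `touch` is a placeholder for the real file): remove config.pbtxt from the model directory and point Triton at the repository. On older Triton releases, auto-complete also needs `--strict-model-config=false`.

```shell
# Illustrative model-repository layout with no config.pbtxt:
mkdir -p model_repository/bhaasha_ocr/1
touch model_repository/bhaasha_ocr/1/model.onnx   # copy bhaasha_model.onnx here
ls -R model_repository
# then launch the server (not run here):
# tritonserver --model-repository=$(pwd)/model_repository --strict-model-config=false
```

Once the server loads, the generated configuration can be fetched from the model metadata/config endpoints to see exactly what I/O shapes the backend inferred from the ONNX file.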