Triton ensemble does not work as expected with reshape
Description
Hi Team,
I tried to configure my ensemble model with reshape (https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/user_guide/model_configuration.html#reshape), but it is not working as expected.
The ensemble has two models: a Python preprocessor and an ONNX model. Below are the generated config files for each, including the ensemble:
Python preprocessor
name: "pre"
backend: "python"
max_batch_size: 8
input {
  name: "text"
  data_type: TYPE_STRING
  dims: 1
  reshape {
  }
}
output {
  name: "input_ids"
  data_type: TYPE_INT64
  dims: -1
}
output {
  name: "attention_mask"
  data_type: TYPE_INT64
  dims: -1
}
output {
  name: "token_type_ids"
  data_type: TYPE_INT64
  dims: -1
}
dynamic_batching {
  max_queue_delay_microseconds: 2000
}
instance_group {
  count: 4
}
ONNX model
name: "main_app"
platform: "onnxruntime_onnx"
backend: "onnxruntime"
max_batch_size: 8
input {
  name: "input_ids"
  data_type: TYPE_INT64
  dims: -1
}
input {
  name: "attention_mask"
  data_type: TYPE_INT64
  dims: -1
}
input {
  name: "token_type_ids"
  data_type: TYPE_INT64
  dims: -1
}
output {
  name: "embedding"
  data_type: TYPE_FP32
  dims: 768
}
dynamic_batching {
  max_queue_delay_microseconds: 2000
}
instance_group {
  count: 4
}
The ensemble
name: "ensemble"
platform: "ensemble"
max_batch_size: 8
input {
  name: "text"
  data_type: TYPE_STRING
  dims: 1
  reshape {
  }
}
output {
  name: "embedding"
  data_type: TYPE_FP32
  dims: 768
}
ensemble_scheduling {
  step {
    model_name: "pre"
    model_version: -1
    input_map {
      key: "text"
      value: "text"
    }
    output_map {
      key: "token_type_ids"
      value: "token_type_ids"
    }
    output_map {
      key: "input_ids"
      value: "input_ids"
    }
    output_map {
      key: "attention_mask"
      value: "attention_mask"
    }
  }
  step {
    model_name: "main_app"
    model_version: -1
    input_map {
      key: "token_type_ids"
      value: "token_type_ids"
    }
    input_map {
      key: "input_ids"
      value: "input_ids"
    }
    input_map {
      key: "attention_mask"
      value: "attention_mask"
    }
    output_map {
      key: "embedding"
      value: "embedding"
    }
  }
}
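As a sanity check on the wiring itself, here is a rough sketch that walks the `input_map` / `output_map` entries above and looks for tensor names that nothing produces. The dictionaries are hand-copied from the ensemble config, and `unresolved_tensors` is a hypothetical helper written only for this illustration; it is not part of Triton and does not reproduce Triton's own validation.

```python
# Tensor names hand-copied from the ensemble config above.
ENSEMBLE_INPUTS = {"text"}
ENSEMBLE_OUTPUTS = {"embedding"}
STEPS = [
    {
        "model": "pre",
        "input_map": {"text": "text"},
        "output_map": {
            "token_type_ids": "token_type_ids",
            "input_ids": "input_ids",
            "attention_mask": "attention_mask",
        },
    },
    {
        "model": "main_app",
        "input_map": {
            "token_type_ids": "token_type_ids",
            "input_ids": "input_ids",
            "attention_mask": "attention_mask",
        },
        "output_map": {"embedding": "embedding"},
    },
]


def unresolved_tensors(ensemble_inputs, ensemble_outputs, steps):
    """Return (consumer, tensor) pairs for ensemble tensors that nothing produces.

    Hypothetical helper for illustration only; it checks names, not shapes.
    """
    # A tensor is available if it is an ensemble input or any step's output.
    produced = set(ensemble_inputs)
    for step in steps:
        produced.update(step["output_map"].values())

    missing = []
    for step in steps:
        for tensor in step["input_map"].values():
            if tensor not in produced:
                missing.append((step["model"], tensor))
    for tensor in sorted(ensemble_outputs):
        if tensor not in produced:
            missing.append(("ensemble", tensor))
    return missing


print(unresolved_tensors(ENSEMBLE_INPUTS, ENSEMBLE_OUTPUTS, STEPS))  # []
```

The empty result suggests the tensor names line up between the steps, so the problem appears to be limited to the shape of `text`.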
The configs show that both the ensemble and the preprocessor have reshape set. However, when I start the Triton server, I get the error below:
E0617 02:43:08.069270 106905 model_repository_manager.cc:563] Invalid argument: in ensemble ensemble, ensemble tensor text: inconsistent shape: [-1] is inferred from model ensemble while [-1,1] is inferred from model pre
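To make the two shapes in the message concrete, here is a minimal sketch of how they could arise. It assumes that the ensemble's view of `text` uses the reshaped shape while the view from `pre` uses the declared `dims`, with the batch dimension prepended because `max_batch_size` is greater than 0. That reading is inferred from the error text only and is not confirmed against Triton's source. `full_shape` is a hypothetical helper, not a Triton API.

```python
def full_shape(dims, max_batch_size, reshape=None):
    """Return the tensor shape with the batch dimension prepended.

    Hypothetical helper for illustration; not part of Triton.
    If `reshape` is given, it replaces `dims`. An empty reshape leaves
    only the batch dimension.
    """
    per_item = list(dims) if reshape is None else list(reshape)
    batch = [-1] if max_batch_size > 0 else []
    return batch + per_item


# Values taken from the configs in this issue.
# Assumption: the ensemble side applies the empty reshape ...
ensemble_view = full_shape(dims=[1], max_batch_size=8, reshape=[])
# ... while the composing model side is compared using its declared dims.
pre_view = full_shape(dims=[1], max_batch_size=8)

print(ensemble_view)              # [-1]
print(pre_view)                   # [-1, 1]
print(ensemble_view == pre_view)  # False
```

Under that assumption the two views differ by the trailing dimension of size 1, which matches the `[-1]` versus `[-1,1]` pair reported by the server.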
Could you please check and advise what the issue is and how we can address it? Thanks.
Triton Information
What version of Triton are you using? 23.01
Are you using the Triton container or did you build it yourself? Triton container
To Reproduce
Start the Triton server with the three configs shown above.