Can ragged input be used together with a stateful model?
I have a stateful PyTorch model that I have successfully served with Triton using the sequence_batching Direct scheduling strategy. To further optimize throughput, I want to use ragged batching. After modifying my model to handle ragged input, I noticed that the PyTorch backend does not generate the configured batch_input tensors and crashes when it encounters a nullptr for that batch input. After examining the documentation, I noticed that ragged input is used with dynamic batching; I assume that is why I cannot use it with the sequence batching Direct scheduling strategy. However, I wonder whether ragged inputs can be used with the Oldest scheduling strategy, since the Oldest strategy actually uses a dynamic batcher internally. Beyond that, would it be possible to add support for ragged input to the sequence batching Direct scheduling strategy? A sketch of the configuration I have in mind follows.
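For concreteness, here is a minimal sketch of the model configuration in question: a ragged input plus a batch input, combined with the Oldest strategy. The tensor names (INPUT0, ACC_COUNT) and the sizes are placeholders, and whether a given backend actually honors this combination is exactly what I am asking:

```
max_batch_size: 8
input [
  {
    name: "INPUT0"
    data_type: TYPE_FP32
    dims: [ -1 ]
    # Allow requests with different INPUT0 shapes to be batched together.
    allow_ragged_batch: true
  }
]
batch_input [
  {
    # Running total of INPUT0 elements per request, so the model can
    # recover request boundaries inside the concatenated batch.
    kind: BATCH_ACCUMULATED_ELEMENT_COUNT
    target_name: "ACC_COUNT"
    data_type: TYPE_FP32
    source_input: "INPUT0"
  }
]
sequence_batching {
  oldest {
    max_candidate_sequences: 4
  }
}
```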
According to issue https://github.com/triton-inference-server/server/issues/2158, it seems that it is the PyTorch backend that does not support ragged input. Is ragged input mainly implemented by the individual backends? Are there any restrictions on using ragged input together with sequence batching? Would it be feasible for me to write a custom backend specific to my model and add support for ragged input there? One possible route is sketched below.
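To make the custom-backend question concrete, here is a minimal sketch of one possible route, under my assumptions: Triton's Python backend, whose execute() receives the requests of a batch individually, so a variable-length input can be handled per request without the ragged-batching machinery at all. The names INPUT0/OUTPUT0 and the sum computation are placeholders for the real model:

```python
# model.py for a hypothetical Triton Python-backend model.
import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def execute(self, requests):
        # Each element of `requests` is one request; its tensors keep
        # their own shapes, so no ragged concatenation is needed here.
        responses = []
        for request in requests:
            in0 = pb_utils.get_input_tensor_by_name(request, "INPUT0")
            data = in0.as_numpy()  # e.g. shape [1, len_i], len_i per request
            # Stand-in per-request computation; replace with the real model.
            result = np.sum(data, axis=-1, keepdims=True).astype(np.float32)
            out = pb_utils.Tensor("OUTPUT0", result)
            responses.append(
                pb_utils.InferenceResponse(output_tensors=[out]))
        return responses
```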
Apologies we didn't get to this! Closing the issue due to lack of activity. Please re-open it if you would like to follow up.