Batch inference for dynamic input size
📚 The doc issue
I'm doing TTS tasks and my input size is dynamic. I want to use batch inference in TorchServe. My question is: how do I send dynamic-size inputs to the model? If padding is the answer, how should I do it? Are there any docs or examples?
Suggest a potential alternative/fix
No response
@dalvlv Can you provide some more detail on which handler you are using? I am assuming that by dynamic batch size you mean client-side batching.
Hi @maaquib, I use the handler in text_to_speech_synthesizer, and the example handler does not seem to support batch inference. The text lengths are not the same, and I want to synthesize several texts at once.
@dalvlv Dynamic input sizes are typically not supported by PyTorch natively, so you need to pad your inputs to some common size. That is what we do, for example, in our HuggingFace example: https://github.com/pytorch/serve/blob/master/examples/Huggingface_Transformers/Transformer_handler_generalized.py#L190
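As a rough sketch of the idea (not the actual text_to_speech_synthesizer handler), a handler's `preprocess` could convert each text in the batch to an ID sequence and right-pad everything to the longest sequence. The `char_to_id` mapping, `PAD_ID`, and the returned `lengths` tensor below are illustrative assumptions; your model's own tokenization and masking scheme would replace them.

```python
# Minimal sketch: pad a batch of variable-length text requests so they
# can be stacked into a single tensor for batched inference.
import torch
from torch.nn.utils.rnn import pad_sequence

# Assumed character-to-ID mapping; replace with your model's tokenization.
char_to_id = {c: i + 1 for i, c in enumerate("abcdefghijklmnopqrstuvwxyz ")}
PAD_ID = 0  # assumed padding index

def preprocess(requests):
    """Convert a batch of text requests into one padded LongTensor."""
    sequences = []
    for req in requests:
        text = req.get("data") or req.get("body")
        if isinstance(text, (bytes, bytearray)):
            text = text.decode("utf-8")
        ids = [char_to_id.get(c, PAD_ID) for c in text.lower()]
        sequences.append(torch.tensor(ids, dtype=torch.long))

    # Keep the original lengths so the model (or an attention mask)
    # can ignore the padded positions.
    lengths = torch.tensor([len(s) for s in sequences], dtype=torch.long)

    # Right-pad every sequence to the longest one in this batch.
    padded = pad_sequence(sequences, batch_first=True, padding_value=PAD_ID)
    return padded, lengths
```

In `inference` you would then run the model once on `padded` (passing `lengths` or a mask if the model needs it) and split the output back into one response per request in `postprocess`, which is the same pattern the HuggingFace handler linked above follows with its tokenizer-based padding.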