TensorRT-LLM
Increase chunk size while streaming
Is it possible to increase the number of tokens sent per chunk during streaming, and if so, how?
This could also apply when serving through triton-inference-server.
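One workaround that does not depend on any server-side option is to aggregate the streamed responses on the client side. The sketch below assumes you already have an iterator that yields one decoded token (string) per streamed response from TensorRT-LLM or Triton; `rechunk_stream` and `fake_stream` are hypothetical names used only for illustration, not part of either library's API.

```python
# Hypothetical client-side workaround: aggregate single-token stream
# responses into larger chunks before handing them to downstream code.
# `token_stream` is assumed to be any iterator that yields one decoded
# token (string) per streamed response from the server.

from typing import Iterable, Iterator


def rechunk_stream(token_stream: Iterable[str], chunk_size: int = 8) -> Iterator[str]:
    """Yield concatenated chunks of `chunk_size` tokens from a token stream."""
    buffer = []
    for token in token_stream:
        buffer.append(token)
        if len(buffer) >= chunk_size:
            yield "".join(buffer)
            buffer.clear()
    if buffer:  # flush whatever is left at the end of generation
        yield "".join(buffer)


# Example usage with a placeholder stream; replace `fake_stream` with the
# real per-token iterator from your TensorRT-LLM / Triton streaming client.
if __name__ == "__main__":
    fake_stream = iter(["Hel", "lo", ",", " wor", "ld", "!"])
    for chunk in rechunk_stream(fake_stream, chunk_size=4):
        print(repr(chunk))
```

This keeps the server streaming as it normally does while letting the client control how many tokens are delivered per chunk downstream.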