
increase chunk size for streaming with tensorrtllm_backend

Open avianion opened this issue 9 months ago • 0 comments

Is it possible to increase the number of tokens sent per chunk during streaming, and if so, how?

A solution on the triton-inference-server side would also work.
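To illustrate the behaviour being asked about (several tokens per streamed chunk instead of one), here is a minimal client-side sketch in Python. It does not use any tensorrtllm_backend or triton-inference-server setting; `rechunk` and `tokens_per_chunk` are hypothetical names, and the input is assumed to be any iterable of already-decoded token strings.

```python
from typing import Iterable, Iterator, List


def rechunk(tokens: Iterable[str], tokens_per_chunk: int) -> Iterator[List[str]]:
    """Group a token stream into lists of up to `tokens_per_chunk` tokens.

    Hypothetical helper: it only re-batches on the receiving side and
    does not change how the server emits tokens.
    """
    if tokens_per_chunk < 1:
        raise ValueError("tokens_per_chunk must be at least 1")

    buffer: List[str] = []
    for token in tokens:
        buffer.append(token)
        # Emit a chunk as soon as the buffer is full.
        if len(buffer) == tokens_per_chunk:
            yield buffer
            buffer = []

    # Flush whatever is left when the stream ends.
    if buffer:
        yield buffer


if __name__ == "__main__":
    # Stand-in for a real token stream.
    stream = iter(["Hello", ",", " world", "!", " Bye"])
    for chunk in rechunk(stream, tokens_per_chunk=2):
        print("".join(chunk))
```

This only reduces the number of chunks the consumer handles; the server still sends one message per token, so it does not reduce network overhead.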

avianion, May 17 '24 13:05