HUIJONG JEONG issues

Results: 1 issue by HUIJONG JEONG

In-flight batching and mixed batch

![Image](https://github.com/user-attachments/assets/8daf8c7b-718e-4290-a021-5da9a8619cb1)

According to the [documentation](https://nvidia.github.io/TensorRT-LLM/advanced/gpt-attention.html#in-flight-batching), packed and mixed batching appears to be the default behavior of TensorRT-LLM. So I've conducted an experiment to see the effect of mixed...