HUIJONG JEONG
According to the [documentation](https://nvidia.github.io/TensorRT-LLM/advanced/gpt-attention.html#in-flight-batching), packed and mixed batching seems to be the default behavior of TensorRT-LLM, so I conducted an experiment to see the effect of mixed...