TensorRT-LLM feat: Add pp support for hybrid attn/mamba model

May 15 '25 13:05 yuxianq

/bot run --disable-fail-fast --add-multi-gpu-test

May 16 '25 05:05 yuxianq

PR_Github #5461 [ run ] triggered by Bot

May 16 '25 05:05 tensorrt-cicd

PR_Github #5461 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #3984 completed with status: 'SUCCESS'

May 16 '25 10:05 tensorrt-cicd

Thanks @yuxianq, this looks great.

It would be good, however, to check whether quality changes with this feature. We have another PR, #4147, that adds a test for this.

@suyoggupta

May 16 '25 18:05 vegaluisjose

Could you please add a brief PR description?

May 16 '25 19:05 suyoggupta

Added Copilot to also review this PR.

May 16 '25 21:05 suyoggupta

@vegaluisjose I have cherry-picked https://github.com/NVIDIA/TensorRT-LLM/pull/4147 into this PR, and the new test passes locally.

May 19 '25 06:05 yuxianq

> Could you please add a brief PR description?

@suyoggupta Added.

May 19 '25 06:05 yuxianq

/bot reuse-pipeline

May 19 '25 06:05 yuxianq

PR_Github #5681 [ reuse-pipeline ] triggered by Bot

May 19 '25 06:05 tensorrt-cicd

PR_Github #5681 [ reuse-pipeline ] completed with state SUCCESS
Reusing PR_Github #5461 for commit ba33051

May 19 '25 06:05 tensorrt-cicd