TensorRT-LLM icon indicating copy to clipboard operation
TensorRT-LLM copied to clipboard

feat: Add support of chat completion in PD

Open Shunkangz opened this issue 9 months ago • 7 comments

Add support of chat completion in PD and fix the include_usage option.

Shunkangz avatar Mar 23 '25 14:03 Shunkangz

/bot run

Shunkangz avatar Mar 23 '25 14:03 Shunkangz

@chuangz0 @xiaoweiw-nv @pcastonguay pls help review this MR.

Thanks June

juney-nvidia avatar Mar 23 '25 22:03 juney-nvidia

/bot run

Shixiaowei02 avatar Mar 24 '25 08:03 Shixiaowei02

PR_Github #269 [ run ] triggered by Bot

niukuo avatar Mar 24 '25 08:03 niukuo

PR_Github #269 [ run ] completed with state FAILURE /LLM/main/L0_MergeRequest_PR pipeline #259 completed with status: 'FAILURE'

niukuo avatar Mar 24 '25 12:03 niukuo

There are no tests for the chat endpoint. Could we add one please? Thanks.

Sure. I think that the modification of chat endpoint should be orthogonal to the parallel strategy of context and generation workers. Do we need to double all existing tests for chat or just add single test should be enough? @pcastonguay

Shunkangz avatar Mar 26 '25 06:03 Shunkangz

There are no tests for the chat endpoint. Could we add one please? Thanks.

Sure. I think that the modification of chat endpoint should be orthogonal to the parallel strategy of context and generation workers. Do we need to double all existing tests for chat or just add single test should be enough? @pcastonguay

Single test should be enough.

pcastonguay avatar Mar 26 '25 13:03 pcastonguay

/bot run --add-multi-gpu-test

Shunkangz avatar Apr 08 '25 09:04 Shunkangz

PR_Github #1438 [ run ] triggered by Bot

tensorrt-cicd avatar Apr 08 '25 09:04 tensorrt-cicd

PR_Github #1438 [ run ] completed with state SUCCESS /LLM/main/L0_MergeRequest_PR pipeline #1079 completed with status: 'FAILURE'

tensorrt-cicd avatar Apr 08 '25 11:04 tensorrt-cicd

/bot run --add-multi-gpu-test

Shunkangz avatar Apr 09 '25 14:04 Shunkangz

PR_Github #1629 [ run ] triggered by Bot

tensorrt-cicd avatar Apr 09 '25 15:04 tensorrt-cicd

/bot run --add-multi-gpu-test

Shunkangz avatar Apr 10 '25 01:04 Shunkangz

PR_Github #1668 [ run ] triggered by Bot

tensorrt-cicd avatar Apr 10 '25 01:04 tensorrt-cicd

PR_Github #1629 [ run ] completed with state ABORTED /LLM/main/L0_MergeRequest_PR pipeline #1219 completed with status: 'FAILURE'

tensorrt-cicd avatar Apr 10 '25 01:04 tensorrt-cicd

/bot run --add-multi-gpu-test

Shunkangz avatar Apr 10 '25 01:04 Shunkangz

PR_Github #1669 [ run ] triggered by Bot

tensorrt-cicd avatar Apr 10 '25 01:04 tensorrt-cicd

PR_Github #1668 [ run ] completed with state ABORTED

tensorrt-cicd avatar Apr 10 '25 01:04 tensorrt-cicd

/bot run --add-multi-gpu-test

Shunkangz avatar Apr 10 '25 01:04 Shunkangz

PR_Github #1671 [ run ] triggered by Bot

tensorrt-cicd avatar Apr 10 '25 01:04 tensorrt-cicd

PR_Github #1669 [ run ] completed with state ABORTED

tensorrt-cicd avatar Apr 10 '25 01:04 tensorrt-cicd

PR_Github #1671 [ run ] completed with state SUCCESS /LLM/main/L0_MergeRequest_PR pipeline #1249 completed with status: 'FAILURE'

tensorrt-cicd avatar Apr 10 '25 05:04 tensorrt-cicd

/bot run --only-multi-gpu-test

Shunkangz avatar Apr 10 '25 07:04 Shunkangz

PR_Github #1726 [ run ] triggered by Bot

tensorrt-cicd avatar Apr 10 '25 07:04 tensorrt-cicd

PR_Github #1726 [ run ] completed with state SUCCESS /LLM/main/L0_MergeRequest_PR pipeline #1288 (Partly Tested) completed with status: 'FAILURE'

tensorrt-cicd avatar Apr 10 '25 13:04 tensorrt-cicd

/bot run --stage-list "DGX_H100-4_GPUs-PyTorch-1, DGX_H100-4_GPUs-PyTorch-2, DGX_H100-4_GPUs-TensorRT-1, DGX_H100-4_GPUs-TensorRT-2"

byshiue avatar Apr 10 '25 14:04 byshiue

PR_Github #1785 [ run ] triggered by Bot

tensorrt-cicd avatar Apr 10 '25 14:04 tensorrt-cicd

PR_Github #1785 [ run ] completed with state SUCCESS /LLM/main/L0_MergeRequest_PR pipeline #1323 (Partly Tested) completed with status: 'FAILURE'

tensorrt-cicd avatar Apr 11 '25 02:04 tensorrt-cicd

/bot run --stage-list "DGX_H100-4_GPUs-PyTorch-1, DGX_H100-4_GPUs-PyTorch-2"

Shunkangz avatar Apr 11 '25 02:04 Shunkangz

PR_Github #1838 [ run ] triggered by Bot

tensorrt-cicd avatar Apr 11 '25 02:04 tensorrt-cicd