feat: Add chat completion support in PD
Add chat completion support in PD and fix the include_usage option.
/bot run
@chuangz0 @xiaoweiw-nv @pcastonguay please help review this MR.
Thanks, June
/bot run
PR_Github #269 [ run ] triggered by Bot
PR_Github #269 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #259 completed with status: 'FAILURE'
There are no tests for the chat endpoint. Could we add one please? Thanks.
Sure. I think the chat endpoint change should be orthogonal to the parallel strategy of the context and generation workers. Do we need to duplicate all existing tests for chat, or would a single test be enough? @pcastonguay
Single test should be enough.
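For reference, a rough sketch of what a single chat test could assert for the include_usage fix. It is not the test added in this MR. It assumes the endpoint follows the OpenAI streaming convention: with `stream_options.include_usage` set, every content chunk carries `usage: null` and one final chunk before `[DONE]` carries an empty `choices` list plus the usage totals. The helper names (`parse_sse_chunks`, `check_include_usage`) and `SAMPLE_STREAM` are hypothetical, and the stream is hand-written mock data rather than real server output.

```python
import json

# Hand-written mock of an SSE stream (assumption: OpenAI-style chunk layout).
SAMPLE_STREAM = [
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}], "usage": null}',
    'data: {"choices": [{"index": 0, "delta": {"content": "lo"}}], "usage": null}',
    'data: {"choices": [], "usage": {"prompt_tokens": 5, "completion_tokens": 2, "total_tokens": 7}}',
    "data: [DONE]",
]


def parse_sse_chunks(lines):
    """Parse 'data: ...' SSE lines into JSON chunks, stopping at [DONE]."""
    chunks = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunks.append(json.loads(payload))
    return chunks


def check_include_usage(chunks):
    """Assert the include_usage convention holds and return the usage dict."""
    assert chunks, "stream produced no chunks"
    *content_chunks, last = chunks
    for chunk in content_chunks:
        # Content chunks must not report usage and must carry a choice.
        assert chunk.get("usage") is None
        assert chunk["choices"]
    # The final chunk has no choices, only the usage totals.
    assert last["choices"] == []
    usage = last["usage"]
    assert usage["total_tokens"] == (
        usage["prompt_tokens"] + usage["completion_tokens"]
    )
    return usage


if __name__ == "__main__":
    print(check_include_usage(parse_sse_chunks(SAMPLE_STREAM)))
```

In a real test, the mock lines would be replaced by the lines read from the streaming response of the chat endpoint; the checks themselves would not depend on how the context and generation workers are parallelized.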
/bot run --add-multi-gpu-test
PR_Github #1438 [ run ] triggered by Bot
PR_Github #1438 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1079 completed with status: 'FAILURE'
/bot run --add-multi-gpu-test
PR_Github #1629 [ run ] triggered by Bot
/bot run --add-multi-gpu-test
PR_Github #1668 [ run ] triggered by Bot
PR_Github #1629 [ run ] completed with state ABORTED
/LLM/main/L0_MergeRequest_PR pipeline #1219 completed with status: 'FAILURE'
/bot run --add-multi-gpu-test
PR_Github #1669 [ run ] triggered by Bot
PR_Github #1668 [ run ] completed with state ABORTED
/bot run --add-multi-gpu-test
PR_Github #1671 [ run ] triggered by Bot
PR_Github #1669 [ run ] completed with state ABORTED
PR_Github #1671 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1249 completed with status: 'FAILURE'
/bot run --only-multi-gpu-test
PR_Github #1726 [ run ] triggered by Bot
PR_Github #1726 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1288 (Partly Tested) completed with status: 'FAILURE'
/bot run --stage-list "DGX_H100-4_GPUs-PyTorch-1, DGX_H100-4_GPUs-PyTorch-2, DGX_H100-4_GPUs-TensorRT-1, DGX_H100-4_GPUs-TensorRT-2"
PR_Github #1785 [ run ] triggered by Bot
PR_Github #1785 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1323 (Partly Tested) completed with status: 'FAILURE'
/bot run --stage-list "DGX_H100-4_GPUs-PyTorch-1, DGX_H100-4_GPUs-PyTorch-2"
PR_Github #1838 [ run ] triggered by Bot