
feat: Adding UCX support for cacheTransceiver

Open RoeyAzran1992 opened this issue 9 months ago • 10 comments

Support for KV cache transfer over the UCXX backend instead of MPI. To enable the UCX backend, the following environment variable needs to be set: TRTLLM_USE_UCX_KVCACHE=1
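For reference, a minimal sketch of enabling the flag before launching the server; the variable name is taken from this PR, while the launch command below is only an illustrative placeholder:

```shell
# Switch the cacheTransceiver to the UCX backend (instead of MPI)
# for KV cache transfer, per this PR.
export TRTLLM_USE_UCX_KVCACHE=1

# Sanity-check that the flag is visible to child processes.
printenv TRTLLM_USE_UCX_KVCACHE

# Then start the disaggregated serving processes as usual, e.g.:
# <your-trtllm-serve-command> ...   # placeholder, not a real invocation
```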

RoeyAzran1992 avatar Mar 25 '25 11:03 RoeyAzran1992

Also tagging @pcastonguay @schetlur-nv for visibility on this UCX backend support MR for disaggregated serving.

Thanks June

juney-nvidia avatar Mar 25 '25 12:03 juney-nvidia

/bot run --add-multi-gpu-test

Shixiaowei02 avatar Mar 25 '25 12:03 Shixiaowei02

PR_Github #436 [ run ] triggered by Bot

niukuo avatar Mar 25 '25 12:03 niukuo

PR_Github #436 [ run ] completed with state FAILURE /LLM/main/L0_MergeRequest_PR pipeline #374 completed with status: 'FAILURE'

niukuo avatar Mar 25 '25 14:03 niukuo

/bot run --add-multi-gpu-test

chuangz0 avatar Mar 26 '25 01:03 chuangz0

PR_Github #489 [ run ] triggered by Bot

niukuo avatar Mar 26 '25 02:03 niukuo

PR_Github #489 [ run ] completed with state SUCCESS /LLM/main/L0_MergeRequest_PR pipeline #421 completed with status: 'FAILURE'

niukuo avatar Mar 26 '25 03:03 niukuo

/bot run --add-multi-gpu-test

Shunkangz avatar Mar 26 '25 03:03 Shunkangz

PR_Github #509 [ run ] triggered by Bot

niukuo avatar Mar 26 '25 04:03 niukuo

PR_Github #509 [ run ] completed with state SUCCESS /LLM/main/L0_MergeRequest_PR pipeline #436 completed with status: 'FAILURE'

niukuo avatar Mar 26 '25 08:03 niukuo

Since all the commits in this PR are already included in #3101, which has been merged, this PR will be closed. Thank you, @RoeyAzran1992!

Shixiaowei02 avatar Apr 10 '25 04:04 Shixiaowei02