
feat: Adding UCX support for cacheTransceiver

Open RoeyAzran1992 opened this issue 9 months ago • 10 comments

Support for KV cache transfer over the UCXX backend instead of MPI. To enable the UCX backend, the following environment variable needs to be set: TRTLLM_USE_UCX_KVCACHE=1
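For reference, a minimal sketch of enabling the flag before launching the server; the variable name is taken from this PR, while the launch command below is only an illustrative placeholder:

```shell
# Switch the cacheTransceiver to the UCX backend (instead of MPI)
# for KV cache transfer, per this PR.
export TRTLLM_USE_UCX_KVCACHE=1

# Sanity-check that the flag is visible to child processes.
printenv TRTLLM_USE_UCX_KVCACHE

# Then start the disaggregated serving processes as usual, e.g.:
# <your-trtllm-serve-command> ...   # placeholder, not a real invocation
```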

RoeyAzran1992 avatar Mar 25 '25 11:03 RoeyAzran1992

Also tagging @pcastonguay @schetlur-nv for visibility on this UCX backend support MR for disaggregated serving.

Thanks June

juney-nvidia avatar Mar 25 '25 12:03 juney-nvidia

/bot run --add-multi-gpu-test

Shixiaowei02 avatar Mar 25 '25 12:03 Shixiaowei02

PR_Github #436 [ run ] triggered by Bot

niukuo avatar Mar 25 '25 12:03 niukuo

PR_Github #436 [ run ] completed with state FAILURE /LLM/main/L0_MergeRequest_PR pipeline #374 completed with status: 'FAILURE'

niukuo avatar Mar 25 '25 14:03 niukuo

/bot run --add-multi-gpu-test

chuangz0 avatar Mar 26 '25 01:03 chuangz0

PR_Github #489 [ run ] triggered by Bot

niukuo avatar Mar 26 '25 02:03 niukuo

PR_Github #489 [ run ] completed with state SUCCESS /LLM/main/L0_MergeRequest_PR pipeline #421 completed with status: 'FAILURE'

niukuo avatar Mar 26 '25 03:03 niukuo

/bot run --add-multi-gpu-test

Shunkangz avatar Mar 26 '25 03:03 Shunkangz

PR_Github #509 [ run ] triggered by Bot

niukuo avatar Mar 26 '25 04:03 niukuo

PR_Github #509 [ run ] completed with state SUCCESS /LLM/main/L0_MergeRequest_PR pipeline #436 completed with status: 'FAILURE'

niukuo avatar Mar 26 '25 08:03 niukuo

Since all the commits in this PR are already included in #3101, which has been merged, this PR will be closed. Thank you, @RoeyAzran1992!

Shixiaowei02 avatar Apr 10 '25 04:04 Shixiaowei02