TensorRT-LLM

perf: Enable CUDA graphs when attention DP is used and active requests on different GPUs are uneven

Open jinyangyuan-nvidia opened this issue 9 months ago • 19 comments

This PR modifies the dummy-request handling so that CUDA graphs can still be used when attention DP is enabled and the number of active requests differs across GPUs.
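To illustrate the idea, here is a minimal sketch and not the code from this PR. A CUDA graph replays work captured for a fixed batch size, so one plausible reading is that each rank is padded with dummy requests up to a common captured size. The function name `plan_dummy_requests`, its parameters, and the padding policy are all hypothetical.

```python
from bisect import bisect_left


def plan_dummy_requests(active_per_rank, graph_batch_sizes):
    """Hypothetical helper: choose one padded batch size for all ranks.

    active_per_rank: number of active requests on each attention-DP rank.
    graph_batch_sizes: batch sizes for which a CUDA graph was captured.
    Returns (padded_batch_size, dummies_per_rank). padded_batch_size is
    None when no captured size is large enough (assumed fallback: run
    without a CUDA graph and add no dummy requests).
    """
    sizes = sorted(graph_batch_sizes)
    # Assumption: even a fully idle rank runs at least one request so it
    # can still take part in the step.
    target = max(max(active_per_rank), 1)
    idx = bisect_left(sizes, target)  # smallest captured size >= target
    if idx == len(sizes):
        return None, [0] * len(active_per_rank)
    padded = sizes[idx]
    return padded, [padded - n for n in active_per_rank]


# Four ranks with uneven load; graphs captured for batch sizes 1, 2, 4, 8.
padded, dummies = plan_dummy_requests([3, 0, 5, 2], [1, 2, 4, 8])
print(padded, dummies)  # 8 [5, 8, 3, 6]
```

In this sketch every rank ends up with the same batch size, so the same captured graph can be replayed on all GPUs; the outputs of dummy requests would simply be discarded.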

jinyangyuan-nvidia avatar Mar 24 '25 06:03 jinyangyuan-nvidia

/bot run

jinyangyuan-nvidia avatar Mar 24 '25 08:03 jinyangyuan-nvidia

PR_Github #266 [ run ] triggered by Bot

niukuo avatar Mar 24 '25 08:03 niukuo

PR_Github #266 [ run ] completed with state FAILURE /LLM/main/L0_MergeRequest_PR pipeline #256 completed with status: 'FAILURE'

niukuo avatar Mar 24 '25 12:03 niukuo

/bot run

jinyangyuan-nvidia avatar Mar 24 '25 15:03 jinyangyuan-nvidia

PR_Github #322 [ run ] triggered by Bot

niukuo avatar Mar 24 '25 15:03 niukuo

PR_Github #322 [ run ] completed with state SUCCESS /LLM/main/L0_MergeRequest_PR pipeline #300 completed with status: 'FAILURE'

niukuo avatar Mar 24 '25 17:03 niukuo

/bot run

jinyangyuan-nvidia avatar Mar 25 '25 01:03 jinyangyuan-nvidia

PR_Github #353 [ run ] triggered by Bot

niukuo avatar Mar 25 '25 02:03 niukuo

PR_Github #353 [ run ] completed with state SUCCESS /LLM/main/L0_MergeRequest_PR pipeline #323 completed with status: 'FAILURE'

niukuo avatar Mar 25 '25 04:03 niukuo

/bot run

jinyangyuan-nvidia avatar Mar 25 '25 09:03 jinyangyuan-nvidia

PR_Github #410 [ ] completed with state FAILURE

tensorrt-cicd avatar Mar 25 '25 09:03 tensorrt-cicd

PR_Github #414 [ ] completed with state FAILURE

tensorrt-cicd avatar Mar 25 '25 09:03 tensorrt-cicd

PR_Github #418 [ run ] triggered by Bot

niukuo avatar Mar 25 '25 09:03 niukuo

PR_Github #418 [ run ] completed with state FAILURE /LLM/main/L0_MergeRequest_PR pipeline #362 completed with status: 'SUCCESS'

niukuo avatar Mar 25 '25 13:03 niukuo

/bot run

jinyangyuan-nvidia avatar Mar 25 '25 16:03 jinyangyuan-nvidia

PR_Github #450 [ run ] triggered by Bot

niukuo avatar Mar 25 '25 16:03 niukuo

PR_Github #450 [ run ] completed with state SUCCESS /LLM/main/L0_MergeRequest_PR pipeline #385 completed with status: 'FAILURE'

niukuo avatar Mar 25 '25 17:03 niukuo

/bot run

jinyangyuan-nvidia avatar Mar 26 '25 01:03 jinyangyuan-nvidia

PR_Github #483 [ run ] triggered by Bot

niukuo avatar Mar 26 '25 01:03 niukuo

PR_Github #483 [ run ] completed with state SUCCESS /LLM/main/L0_MergeRequest_PR pipeline #416 completed with status: 'FAILURE'

niukuo avatar Mar 26 '25 02:03 niukuo

/bot run --disable-fail-fast

jinyangyuan-nvidia avatar Mar 26 '25 02:03 jinyangyuan-nvidia

PR_Github #494 [ run ] triggered by Bot

niukuo avatar Mar 26 '25 02:03 niukuo

/bot kill

jinyangyuan-nvidia avatar Mar 26 '25 03:03 jinyangyuan-nvidia

PR_Github #502 [ kill ] triggered by Bot

niukuo avatar Mar 26 '25 03:03 niukuo

PR_Github #502 [ kill ] completed with state SUCCESS Successfully killed previous jobs for commit 802f729

niukuo avatar Mar 26 '25 03:03 niukuo

/bot --help

jinyangyuan-nvidia avatar Mar 26 '25 03:03 jinyangyuan-nvidia

/bot run

jinyangyuan-nvidia avatar Mar 26 '25 03:03 jinyangyuan-nvidia

PR_Github #504 [ run ] triggered by Bot

niukuo avatar Mar 26 '25 03:03 niukuo

PR_Github #504 [ run ] completed with state FAILURE /LLM/main/L0_MergeRequest_PR pipeline #433 completed with status: 'FAILURE'

niukuo avatar Mar 26 '25 03:03 niukuo

/bot run

jinyangyuan-nvidia avatar Mar 26 '25 04:03 jinyangyuan-nvidia