perf: Use pinned H2D to reduce bubbles

Open jinyangyuan-nvidia opened this issue 9 months ago • 3 comments

In some cases, pageable host-to-device (H2D) copies are followed by `cudaStreamSynchronize` calls, which block the CPU thread and therefore delay subsequent kernel launches, leaving bubbles in GPU execution. This problem can be solved by changing the pageable H2D copies to pinned H2D copies.
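
For readers unfamiliar with the distinction, here is a minimal PyTorch sketch of a pinned H2D copy. It is not taken from this PR: the helper name `copy_to_device` and the example values are hypothetical, and it assumes the copies in question are small host tensors built on the CPU and moved to the GPU. The idea is that a source tensor in pinned (page-locked) memory can be copied with `non_blocking=True`, so the copy is queued on the current stream and the CPU is free to keep launching kernels.

```python
import torch


def copy_to_device(values, device, dtype=torch.int32):
    """Stage `values` in host memory and copy them to `device`.

    Hypothetical helper for illustration only; not code from the PR.
    """
    target = torch.device(device)
    # Pinned memory only matters for CUDA copies, and requesting it
    # without a CUDA runtime would fail, so fall back to pageable memory.
    use_pinned = target.type == "cuda" and torch.cuda.is_available()
    # pin_memory=True allocates the host tensor in page-locked memory.
    host = torch.tensor(values, dtype=dtype, pin_memory=use_pinned)
    # With a pinned source, non_blocking=True queues the copy on the
    # current stream instead of making the CPU wait for it to finish.
    return host.to(target, non_blocking=use_pinned)


device = "cuda" if torch.cuda.is_available() else "cpu"
seq_lens = copy_to_device([3, 5, 7], device)
print(seq_lens.tolist())
```

Work that reads the result on the same stream is ordered after the copy, so no explicit synchronization is needed in that case. The host buffer should not be modified until the copy has completed.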

jinyangyuan-nvidia avatar Mar 29 '25 09:03 jinyangyuan-nvidia

/bot run

jinyangyuan-nvidia avatar Mar 29 '25 09:03 jinyangyuan-nvidia

PR_Github #688 [ run ] triggered by Bot

tensorrt-cicd avatar Mar 29 '25 09:03 tensorrt-cicd

PR_Github #688 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #577 completed with status: 'SUCCESS'

tensorrt-cicd avatar Mar 29 '25 19:03 tensorrt-cicd

/bot run --add-multi-gpu-test

jinyangyuan-nvidia avatar Apr 03 '25 08:04 jinyangyuan-nvidia

PR_Github #1096 [ run ] triggered by Bot

tensorrt-cicd avatar Apr 03 '25 08:04 tensorrt-cicd

PR_Github #1096 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #836 completed with status: 'FAILURE'

tensorrt-cicd avatar Apr 03 '25 17:04 tensorrt-cicd

/bot run --add-multi-gpu-test

jinyangyuan-nvidia avatar Apr 04 '25 07:04 jinyangyuan-nvidia

PR_Github #1165 [ run ] triggered by Bot

tensorrt-cicd avatar Apr 04 '25 07:04 tensorrt-cicd

PR_Github #1165 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #874 completed with status: 'SUCCESS'

tensorrt-cicd avatar Apr 04 '25 11:04 tensorrt-cicd

/bot reuse-pipeline

jinyangyuan-nvidia avatar Apr 04 '25 13:04 jinyangyuan-nvidia

PR_Github #1176 [ reuse-pipeline ] triggered by Bot

tensorrt-cicd avatar Apr 04 '25 13:04 tensorrt-cicd

PR_Github #1176 [ reuse-pipeline ] completed with state SUCCESS
Reusing PR_Github #1165 for commit 2835f2b

tensorrt-cicd avatar Apr 04 '25 14:04 tensorrt-cicd

/bot reuse-pipeline

jinyangyuan-nvidia avatar Apr 04 '25 14:04 jinyangyuan-nvidia

PR_Github #1178 [ reuse-pipeline ] triggered by Bot

tensorrt-cicd avatar Apr 04 '25 14:04 tensorrt-cicd

PR_Github #1178 [ reuse-pipeline ] completed with state SUCCESS
Reusing PR_Github #1165 for commit 2efb7da

tensorrt-cicd avatar Apr 04 '25 14:04 tensorrt-cicd