TensorRT-LLM

fix: Reverse graph size order

jiahanc opened this pull request 9 months ago • 3 comments

During experimentation, we observed that the graph size oscillates frequently during CUDA graph capture, making the total memory footprint of the graphs larger than expected. Reversing the order of graph batch sizes during capture (largest first) allows the graphs for smaller batch sizes to reuse memory already allocated for the larger batch sizes.
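The idea behind the fix can be sketched as follows. This is a minimal, hypothetical illustration, not the actual TensorRT-LLM code: the function names (`plan_capture_order`, `capture_graph`) are assumptions made for the example, and the real change reverses the iteration order over configured CUDA graph batch sizes so capture proceeds largest-first.

```python
# Hypothetical sketch of the capture-order fix: sort batch sizes in
# descending order before CUDA graph capture, so graphs captured later
# (smaller batch sizes) can fit inside the memory pool already sized by
# the earlier (largest) capture, instead of the pool growing and shrinking
# as sizes oscillate.

def plan_capture_order(cuda_graph_batch_sizes):
    """Return batch sizes largest-first so later, smaller captures reuse
    memory reserved by earlier, larger captures."""
    return sorted(cuda_graph_batch_sizes, reverse=True)


def capture_all(batch_sizes, capture_graph):
    """Capture one graph per batch size, largest first.

    `capture_graph` stands in for whatever routine records a CUDA graph
    for a given batch size (illustrative only).
    """
    captured = {}
    for bs in plan_capture_order(batch_sizes):
        # The shared memory pool only ever grows; capturing largest-first
        # means it reaches its peak once, and every subsequent capture
        # reuses that allocation.
        captured[bs] = capture_graph(bs)
    return captured
```

With an unsorted list such as `[1, 8, 2, 4]`, the planner yields `[8, 4, 2, 1]`, so the peak allocation happens on the first capture and is reused thereafter.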

jiahanc avatar Mar 27 '25 01:03 jiahanc

/bot run

jiahanc avatar Mar 27 '25 01:03 jiahanc

PR_Github #626 [ run ] triggered by Bot

tensorrt-cicd avatar Mar 27 '25 01:03 tensorrt-cicd

PR_Github #626 [ run ] completed with state SUCCESS /LLM/main/L0_MergeRequest_PR pipeline #527 completed with status: 'FAILURE'

tensorrt-cicd avatar Mar 27 '25 03:03 tensorrt-cicd

/bot run

jiahanc avatar Mar 31 '25 20:03 jiahanc

PR_Github #799 [ run ] triggered by Bot

tensorrt-cicd avatar Mar 31 '25 20:03 tensorrt-cicd

PR_Github #799 [ run ] completed with state SUCCESS /LLM/main/L0_MergeRequest_PR pipeline #645 completed with status: 'FAILURE'

tensorrt-cicd avatar Mar 31 '25 21:03 tensorrt-cicd

/bot run

jiahanc avatar Apr 01 '25 00:04 jiahanc

PR_Github #807 [ run ] triggered by Bot

tensorrt-cicd avatar Apr 01 '25 00:04 tensorrt-cicd

PR_Github #807 [ run ] completed with state SUCCESS /LLM/main/L0_MergeRequest_PR pipeline #653 completed with status: 'SUCCESS'

tensorrt-cicd avatar Apr 01 '25 02:04 tensorrt-cicd

/bot reuse-pipeline

kaiyux avatar Apr 01 '25 03:04 kaiyux

PR_Github #832 [ reuse-pipeline ] triggered by Bot

tensorrt-cicd avatar Apr 01 '25 03:04 tensorrt-cicd

PR_Github #832 [ reuse-pipeline ] completed with state SUCCESS Reusing PR_Github #807 for commit f08a5ad

tensorrt-cicd avatar Apr 01 '25 03:04 tensorrt-cicd