
feat: Add BW measurement

Open BatshevaBlack opened this issue 9 months ago • 10 comments

Request BW: This metric is measured on the generation side for each request. It is calculated as: Request BW = (Total KVCacheSize of Request) / (Total Time for All Generations to Complete)
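To make the formula concrete, here is a minimal sketch of the arithmetic. The function name `request_bandwidth`, its parameter names, and the units (bytes and seconds) are assumptions for illustration only; they are not taken from the PR's code.

```python
def request_bandwidth(kv_cache_size_bytes: int, total_time_seconds: float) -> float:
    """Return the request bandwidth in bytes per second.

    Hypothetical helper: the name, parameters, and units are assumed here,
    not taken from the TensorRT-LLM code base.
    """
    if total_time_seconds <= 0:
        # A zero or negative duration would make the ratio meaningless.
        raise ValueError("total_time_seconds must be positive")
    # Request BW = total KV cache size of the request / total time
    # for all generations to complete.
    return kv_cache_size_bytes / total_time_seconds


# Example: a request whose KV cache totals 2 GiB, completed in 0.5 s.
bw = request_bandwidth(2 * 1024**3, 0.5)
print(f"{bw / 1e9:.2f} GB/s")
```

Whether the real metric is reported in bytes per second or another unit is not stated above, so any unit conversion would need to follow whatever the implementation uses.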

BatshevaBlack avatar Mar 25 '25 16:03 BatshevaBlack

@Shunkangz @Shixiaowei02 can you please review?

BatshevaBlack avatar Mar 25 '25 16:03 BatshevaBlack

/bot run

Shunkangz avatar Mar 26 '25 03:03 Shunkangz

PR_Github #505 [ run ] triggered by Bot

niukuo avatar Mar 26 '25 03:03 niukuo

PR_Github #505 [ run ] completed with state FAILURE /LLM/main/L0_MergeRequest_PR pipeline #434 completed with status: 'FAILURE'

niukuo avatar Mar 26 '25 03:03 niukuo

/bot run

Shunkangz avatar Mar 26 '25 14:03 Shunkangz

PR_Github #595 [ run ] triggered by Bot

niukuo avatar Mar 26 '25 14:03 niukuo

PR_Github #595 [ run ] completed with state FAILURE /LLM/main/L0_MergeRequest_PR pipeline #504 completed with status: 'FAILURE'

niukuo avatar Mar 26 '25 14:03 niukuo

/bot run

Shunkangz avatar Mar 27 '25 12:03 Shunkangz

PR_Github #655 [ run ] triggered by Bot

tensorrt-cicd avatar Mar 27 '25 12:03 tensorrt-cicd

PR_Github #655 [ run ] completed with state SUCCESS /LLM/main/L0_MergeRequest_PR pipeline #552 completed with status: 'SUCCESS'

tensorrt-cicd avatar Mar 27 '25 19:03 tensorrt-cicd

Thank you for your contribution! @BatshevaBlack

Shixiaowei02 avatar Mar 28 '25 02:03 Shixiaowei02

/bot reuse-pipeline

Shixiaowei02 avatar Mar 28 '25 02:03 Shixiaowei02

PR_Github #665 [ reuse-pipeline ] triggered by Bot

tensorrt-cicd avatar Mar 28 '25 02:03 tensorrt-cicd

PR_Github #665 [ reuse-pipeline ] completed with state SUCCESS Reusing PR_Github #655 for commit 64e092b

tensorrt-cicd avatar Mar 28 '25 02:03 tensorrt-cicd