feat: improve pymodel bert perf

Open JackTan25 opened this issue 2 months ago • 10 comments

  1. Support the copy kernel for the new prefill CUDA graph framework.
  2. Improve the CPU performance of the CUDA graph framework.
  3. Test and check:
     3.1 In the long-text scenario, pymodel BERT outperforms the cpp engine.
     3.2 In the short-text scenario, CUDA graph pymodel BERT improves performance by up to 20% but still fails to beat the cpp engine, because pymodel's preparation work costs 0.2~0.3 ms more than the cpp engine's. We will improve this in the next PR.

JackTan25 avatar Oct 21 '25 08:10 JackTan25

internal source has been updated, please review the changes!

github-actions[bot] avatar Nov 19 '25 09:11 github-actions[bot]

internal source has been updated, please review the changes!

github-actions[bot] avatar Nov 19 '25 11:11 github-actions[bot]

internal source has been updated, please review the changes!

github-actions[bot] avatar Nov 23 '25 15:11 github-actions[bot]

internal source has been updated, please review the changes!

github-actions[bot] avatar Nov 24 '25 02:11 github-actions[bot]

internal source has been updated, please review the changes!

github-actions[bot] avatar Nov 24 '25 08:11 github-actions[bot]

internal source has been updated, please review the changes!

github-actions[bot] avatar Nov 25 '25 03:11 github-actions[bot]

internal source has been updated, please review the changes!

github-actions[bot] avatar Nov 25 '25 05:11 github-actions[bot]

internal source has been updated, please review the changes!

github-actions[bot] avatar Nov 25 '25 06:11 github-actions[bot]

internal source has been updated, please review the changes!

github-actions[bot] avatar Nov 25 '25 07:11 github-actions[bot]

internal source has been updated, please review the changes!

github-actions[bot] avatar Nov 25 '25 08:11 github-actions[bot]

internal source has been updated, please review the changes!

github-actions[bot] avatar Nov 25 '25 09:11 github-actions[bot]

internal source has been updated, please review the changes!

github-actions[bot] avatar Nov 25 '25 10:11 github-actions[bot]

internal source has been updated, please review the changes!

github-actions[bot] avatar Nov 25 '25 11:11 github-actions[bot]

internal source has been updated, please review the changes!

github-actions[bot] avatar Nov 25 '25 12:11 github-actions[bot]

internal source has been updated, please review the changes!

github-actions[bot] avatar Nov 25 '25 13:11 github-actions[bot]

internal source has been updated, please review the changes!

github-actions[bot] avatar Nov 25 '25 14:11 github-actions[bot]

internal source has been updated, please review the changes!

github-actions[bot] avatar Nov 25 '25 15:11 github-actions[bot]