[Iluvatar] Support V1_KVCACHE_SCHEDULER and paddleocr-vl rope mode

Open wuyujiji opened this issue 4 weeks ago • 4 comments

Motivation

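To adapt the paddleocr-vl model, this PR adds support on Iluvatar hardware for V1_KVCACHE_SCHEDULER and for the paddleocr-vl rope mode. It was also verified that, with V1_KVCACHE_SCHEDULER enabled, the previously adapted ERNIE text-only models and the ERNIE VL model series still produce normal accuracy.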

Modifications

Pass

Usage or Command

Pass

Accuracy Tests

Pass

Checklist

  • [x] Add at least a tag in the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • [x] Format your code, run pre-commit before commit.
  • [x] Add unit tests. Please write the reason in this PR if no unit tests.
  • [x] Provide accuracy results.
  • [x] If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

wuyujiji, Dec 15 '25 07:12

Thanks for your contribution!

paddle-bot[bot], Dec 15 '25 07:12

Codecov Report

:x: Patch coverage is 10.52632% with 34 lines in your changes missing coverage. Please review.
:warning: Please upload report for BASE (develop@404cf0e).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...executor/layers/attention/iluvatar_attn_backend.py | 8.33% | 22 Missing :warning: |
| ...del_executor/models/ernie4_5_vl/ernie4_5_vl_moe.py | 20.00% | 2 Missing and 2 partials :warning: |
| ...model_executor/models/paddleocr_vl/paddleocr_vl.py | 20.00% | 2 Missing and 2 partials :warning: |
| fastdeploy/engine/sched/resource_manager_v1.py | 0.00% | 2 Missing :warning: |
| fastdeploy/engine/args_utils.py | 0.00% | 0 Missing and 1 partial :warning: |
| fastdeploy/worker/worker_process.py | 0.00% | 0 Missing and 1 partial :warning: |

Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #5555   +/-   ##
==========================================
  Coverage           ?   63.80%           
==========================================
  Files              ?      329           
  Lines              ?    41743           
  Branches           ?     6386           
==========================================
  Hits               ?    26636           
  Misses             ?    13081           
  Partials           ?     2026           
| Flag | Coverage Δ |
|---|---|
| GPU | 63.80% <10.52%> (?) |


codecov-commenter, Dec 15 '25 10:12

Does Iluvatar support multi-batch for multimodal requests? In v1 this is currently enabled everywhere, so it may need some attention.

kevincheng2, Dec 15 '25 13:12

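Does Iluvatar support multi-batch for multimodal requests? In v1 this is currently enabled everywhere, so it may need some attention.

@kevincheng2 It should be supported. Is there a multi-batch script? I can run a test with it.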

wuyujiji, Dec 16 '25 01:12