[Iluvatar] Support V1_KVCACHE_SCHEDULER and paddleocr-vl rope mode
Motivation
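To adapt the paddleocr-vl model, this PR adds support on Iluvatar hardware for V1_KVCACHE_SCHEDULER and for the paddleocr-vl rope mode. In addition, we verified that with V1_KVCACHE_SCHEDULER enabled, the previously adapted ERNIE text-only models and the ERNIE VL model series still produce normal accuracy.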
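As a rough illustration of switching the scheduler on before launching a service, the sketch below sets and reads an environment flag. The variable name `ENABLE_V1_KVCACHE_SCHEDULER` and the helper `v1_scheduler_enabled` are assumptions made for this example and are not taken from this PR; check the name against the FastDeploy version in use.

```python
import os

# Assumption: the scheduler is toggled through an environment variable with
# this name. Verify the exact name against your FastDeploy version.
FLAG = "ENABLE_V1_KVCACHE_SCHEDULER"


def v1_scheduler_enabled(env=None, default="0"):
    """Hypothetical helper: True when the flag holds a non-zero integer string."""
    env = os.environ if env is None else env
    try:
        return int(env.get(FLAG, default)) != 0
    except ValueError:
        # Treat non-numeric values as "disabled" rather than raising.
        return False


# Set the flag in the current process before starting the service.
os.environ[FLAG] = "1"
print(v1_scheduler_enabled())
```

Passing an explicit mapping as `env` keeps the helper easy to check without touching the real process environment.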
Modifications
Pass
Usage or Command
Pass
Accuracy Tests
Pass
Checklist
- [x] Add at least a tag in the PR title.
  - Tag list: [[FDConfig],[APIServer],[Engine],[Scheduler],[PD Disaggregation],[Executor],[Graph Optimization],[Speculative Decoding],[RL],[Models],[Quantization],[Loader],[OP],[KVCache],[DataProcessor],[BugFix],[Docs],[CI],[Optimization],[Feature],[Benchmark],[Others],[XPU],[HPU],[GCU],[DCU],[Iluvatar],[Metax]]
  - You can add new tags based on the PR content, but the semantics must be clear.
- [x] Format your code, run `pre-commit` before commit.
- [x] Add unit tests. Please write the reason in this PR if no unit tests.
- [x] Provide accuracy results.
- [x] If the current PR is submitting to the `release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.
Thanks for your contribution!
Codecov Report
:x: Patch coverage is 10.52632% with 34 lines in your changes missing coverage. Please review.
:warning: Please upload report for BASE (develop@404cf0e). Learn more about missing BASE report.
Additional details and impacted files
@@ Coverage Diff @@
## develop #5555 +/- ##
==========================================
Coverage ? 63.80%
==========================================
Files ? 329
Lines ? 41743
Branches ? 6386
==========================================
Hits ? 26636
Misses ? 13081
Partials ? 2026
| Flag | Coverage Δ | |
|---|---|---|
| GPU | 63.80% <10.52%> (?) |
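Does Iluvatar support multi-batch for multimodal requests? In v1 this is currently enabled everywhere, so it may be worth keeping an eye on.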
@kevincheng2 It should be supported. Is there a multi-batch script? I can test it.