[Enhancement] Add label-formatted statistics support to QueryDumpDeserializer
Why I'm doing:
PR #65899 introduced label-formatted statistics output ([MIN: 1.0, MAX: 100.0, ...]) controlled by enable_labeled_column_statistic_output, but QueryDumpDeserializer only parsed the old format ([1.0, 100.0, ...]), breaking deserialization of dumps generated with the new format.
What I'm doing:
Enhanced ColumnStatistic.buildFrom(String) to auto-detect and parse both formats:
Old format:
"[1.0, 100.0, 0.0, 8.0, 50.0] ESTIMATE"
// [minValue, maxValue, nullsFraction, averageRowSize, distinctValuesCount]
New labeled format:
"[MIN: 1.0, MAX: 100.0, NULLS: 0.0, ROS: 8.0, NDV: 50.0] ESTIMATE"
Key changes:
- Auto-detects format by checking for
MIN:label - Added
parseLabeledValue()helper for extracting labeled values - Preserves all existing validation (range checks, NDV normalization)
- Refactored tests to use parameterized tests with JUnit 5's
@ParameterizedTestand@MethodSource, reducing test code from 307 to 247 lines (60 lines saved, ~20% reduction) while maintaining comprehensive coverage - Added comprehensive unit tests covering both formats, edge cases, and end-to-end deserialization
- Added integration test in
QueryDumperTestto verify labeled statistics work when session variable is enabled
What type of PR is this:
- [ ] BugFix
- [ ] Feature
- [x] Enhancement
- [ ] Refactor
- [ ] UT
- [ ] Doc
- [ ] Tool
Does this PR entail a change in behavior?
- [ ] Yes, this PR will result in a change in behavior.
- [x] No, this PR will not result in a change in behavior.
If yes, please specify the type of change:
- [ ] Interface/UI changes: syntax, type conversion, expression evaluation, display information
- [ ] Parameter changes: default values, similar parameters but with different default values
- [ ] Policy changes: use new policy to replace old one, functionality automatically enabled
- [ ] Feature removed
- [ ] Miscellaneous: upgrade & downgrade compatibility, etc.
Checklist:
- [x] I have added test cases for my bug fix or my new feature
- [ ] This pr needs user documentation (for new or modified features or behaviors)
- [ ] I have added documentation for my new feature or new function
- [ ] This is a backport pr
Bugfix cherry-pick branch check:
- [x] I have checked the version labels which the pr will be auto-backported to the target branch
- [ ] 4.0
- [ ] 3.5
- [ ] 3.4
- [ ] 3.3
Original prompt
This section details on the original issue you should resolve
<issue_title>Enahnce QueryDumpDeserializer to support label-formatted statistics</issue_title> <issue_description>## Enhancement
#https://github.com/StarRocks/starrocks/pull/65899 introduce a new statistics format, but current
QueryDumpDeserializerdoesn't support it yet, which will break the compatibility.So the
QueryDumpDeserializeralso needs to be upgraded to support it.</issue_description><agent_instructions>1. implement the code 2. add ut</agent_instructions>
Comments on the Issue (you are @copilot in this section)
- Fixes StarRocks/starrocks#66655
๐ก You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.
[!NOTE] Add auto-detection/parsing of labeled column statistics in ColumnStatistic.buildFrom with extensive tests and a QueryDumper labeled-format test.
- Statistics/Parsing:
- Update
ColumnStatistic.buildFrom(String)to auto-detect and parse labeled format (MIN:/MAX:/NULLS:/ROS:/NDV:) in addition to the old numeric format; retain range/NDV validations.- Add
parseLabeledValue(...)helper for labeled extraction.- Tests:
- Add
QueryDumpDeserializerTestwith parameterized cases covering old/new formats, UNKNOWN/no type, invalid ranges, negative/zero NDV, scientific notation, and mixed-format deserialization.- Extend
QueryDumperTestto verify labeled statistics output when the session variable is enabled.Written by Cursor Bugbot for commit ff4d0ede253a3acaa80ba0405428591e9ae7c5c5. This will update automatically on new commits. Configure here.
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.
:white_check_mark: murphyatwork
:x: Copilot
You have signed the CLA already but the status is still pending? Let us recheck it.
๐งช CI Insights
Here's what we observed from your CI run for ff4d0ede.
๐ข All jobs passed!
But CI Insights is watching ๐
@copilot refactor the test to parameterized test to reduce redundant code
@copilot refactor the test to parameterized test to reduce redundant code
Refactored the tests to use @ParameterizedTest with @MethodSource. This reduced redundant code by 60 lines (from 307 to 247 lines) while maintaining the same test coverage. The refactoring groups similar test cases together:
testParseBasicStatistics- tests both old and new formats with standard valuestestParseUnknownTypeStatistics- tests UNKNOWN type handling with 4 variantstestParseStatisticsWithNullFraction- tests null fraction handlingtestParseStatisticsWithInvalidRange- tests invalid range handlingtestParseStatisticsWithNegativeDistinctValues- tests negative NDV handlingtestParseStatisticsWithScientificNotation- tests scientific notationtestDeserializeQueryDumpWithStatistics- tests end-to-end deserialization
Commit: 0aaf3d9e
@cursor review
@cursor review
@Mergifyio rebase
rebase
โ Base branch update has failed
Git reported the following error:
Rebasing (1/5)
Rebasing (2/5)
Rebasing (3/5)
Auto-merging fe/fe-core/src/test/java/com/starrocks/sql/optimizer/QueryDumperTest.java
CONFLICT (content): Merge conflict in fe/fe-core/src/test/java/com/starrocks/sql/optimizer/QueryDumperTest.java
error: could not apply befcaca6dc... Add comprehensive unit and integration tests for label-formatted statistics
hint: Resolve all conflicts manually, mark them as resolved with
hint: "git add/rm <conflicted_files>", then run "git rebase --continue".
hint: You can instead skip this commit: run "git rebase --skip".
hint: To abort and get back to the state before "git rebase", run "git rebase --abort".
Could not apply befcaca6dc... Add comprehensive unit and integration tests for label-formatted statistics
@copilot rebase latest main branch
@copilot rebase latest main branch
I've rebased the branch on the latest main (commit 7369a80e). The branch now has 4 commits cleanly applied on top of the current main:
1148cafd- Enhance QueryDumpDeserializer to support label-formatted statisticsa19b4b43- Add comprehensive unit and integration tests for label-formatted statistics473319d9- Refactor tests to use parameterized tests to reduce redundant code63229ada- fix fmt
The rebase resolved one merge conflict in QueryDumperTest.java where I kept both the new testDumpBeCoreStat() test from main and my testDumpQueryWithLabeledStatistics() test.
Note: The branch is ready to be force-pushed. Since I don't have direct git push permissions, you'll need to manually force-push this branch or configure the repository to allow me to force-push.
@cursor review
@mergifyio rebase
rebase
โ Branch has been successfully rebased
Quality Gate passed
Issues
11 New issues
0 Accepted issues
Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code
[FE Incremental Coverage Report]
:x: fail : 1195 / 1494 (79.99%)
file detail
| path | covered_line | new_line | coverage | not_covered_line_detail | |
|---|---|---|---|---|---|
| :large_blue_circle: | com/starrocks/sql/optimizer/operator/OperatorVisitor.java | 0 | 2 | 00.00% | [447, 451] |
| :large_blue_circle: | com/starrocks/sql/common/DebugOperatorTracer.java | 0 | 24 | 00.00% | [575, 576, 578, 579, 580, 581, 582, 583, 584, 585, 586, 587, 592, 593, 594, 595, 596, 597, 598, 599, 600, 601, 602, 603] |
| :large_blue_circle: | com/starrocks/meta/SqlBlackList.java | 0 | 1 | 00.00% | [67] |
| :large_blue_circle: | com/starrocks/StarRocksFEServer.java | 0 | 1 | 00.00% | [126] |
| :large_blue_circle: | com/starrocks/rpc/LakeServiceWithMetrics.java | 0 | 2 | 00.00% | [193, 194] |
| :large_blue_circle: | com/starrocks/qe/scheduler/assignment/LocalFragmentAssignmentStrategy.java | 2 | 7 | 28.57% | [87, 88, 89, 90, 91] |
| :large_blue_circle: | com/starrocks/planner/NoopSink.java | 2 | 6 | 33.33% | [30, 35, 40, 45] |
| :large_blue_circle: | com/starrocks/sql/optimizer/operator/physical/PhysicalFetchOperator.java | 15 | 27 | 55.56% | [54, 58, 78, 88, 89, 91, 92, 94, 95, 96, 97, 98] |
| :large_blue_circle: | com/starrocks/planner/LookUpNode.java | 9 | 16 | 56.25% | [55, 56, 57, 58, 59, 60, 61] |
| :large_blue_circle: | com/starrocks/connector/hive/HiveUtils.java | 48 | 84 | 57.14% | [42, 59, 60, 62, 86, 87, 88, 89, 90, 111, 122, 124, 126, 127, 132, 133, 134, 137, 139, 140, 145, 146, 149, 150, 151, 161, 162, 163, 165, 166, 167, 168, 187, 188, 189, 190] |
| :large_blue_circle: | com/starrocks/planner/FetchNode.java | 30 | 51 | 58.82% | [59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 72, 88, 89, 90, 91, 92, 93, 106, 107, 116, 121] |
| :large_blue_circle: | com/starrocks/system/SystemInfoService.java | 5 | 8 | 62.50% | [1540, 1541, 1542] |
| :large_blue_circle: | com/starrocks/sql/optimizer/operator/physical/PhysicalLookUpOperator.java | 18 | 29 | 62.07% | [91, 101, 102, 104, 105, 107, 108, 109, 110, 111, 112] |
| :large_blue_circle: | com/starrocks/connector/RemoteFileOperations.java | 2 | 3 | 66.67% | [320] |
| :large_blue_circle: | com/starrocks/sql/optimizer/LateMaterializationRewriter.java | 356 | 503 | 70.78% | [162, 241, 244, 247, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 371, 390, 391, 401, 424, 425, 426, 427, 428, 430, 440, 441, 474, 475, 502, 503, 505, 506, 507, 508, 510, 511, 543, 544, 545, 547, 550, 552, 553, 554, 555, 556, 558, 559, 564, 565, 566, 568, 570, 571, 572, 573, 574, 576, 578, 583, 584, 585, 586, 591, 592, 593, 594, 596, 597, 598, 600, 601, 606, 607, 608, 610, 612, 614, 615, 616, 617, 618, 620, 621, 622, 627, 628, 629, 630, 631, 632, 633, 635, 636, 687, 688, 689, 690, 691, 692, 694, 696, 699, 700, 702, 704, 706, 707, 708, 709, 711, 713, 715, 716, 717, 718, 719, 720, 721, 723, 814, 815, 816, 817, 818, 819, 821, 822, 867, 868, 882] |
| :large_blue_circle: | com/starrocks/sql/optimizer/rule/transformation/PruneHDFSScanColumnRule.java | 3 | 4 | 75.00% | [175] |
| :large_blue_circle: | com/starrocks/sql/optimizer/operator/physical/PhysicalJoinOperator.java | 5 | 6 | 83.33% | [81] |
| :large_blue_circle: | com/starrocks/connector/partitiontraits/HivePartitionTraits.java | 7 | 8 | 87.50% | [73] |
| :large_blue_circle: | com/starrocks/sql/optimizer/OptimizerTraceUtil.java | 19 | 21 | 90.48% | [115, 129] |
| :large_blue_circle: | com/starrocks/qe/VariableMgr.java | 11 | 12 | 91.67% | [243] |
| :large_blue_circle: | com/starrocks/qe/StmtExecutor.java | 46 | 50 | 92.00% | [1038, 1051, 3214, 3215] |
| :large_blue_circle: | com/starrocks/sql/plan/PlanFragmentBuilder.java | 86 | 90 | 95.56% | [2549, 4145, 4146, 4455] |
| :large_blue_circle: | com/starrocks/sql/optimizer/rewrite/scalar/ImplicitCastRule.java | 22 | 23 | 95.65% | [122] |
| :large_blue_circle: | com/starrocks/system/BackendResourceStat.java | 145 | 151 | 96.03% | [101, 102, 103, 104, 105, 358] |
| :large_blue_circle: | com/starrocks/sql/analyzer/ExpressionAnalyzer.java | 42 | 43 | 97.67% | [315] |
| :large_blue_circle: | com/starrocks/catalog/Table.java | 1 | 1 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/lake/TabletRepairHelper.java | 58 | 58 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/optimizer/MvRewritePreprocessor.java | 1 | 1 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/qe/SessionVariable.java | 9 | 9 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/optimizer/operator/scalar/ScalarOperator.java | 1 | 1 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/optimizer/QueryOptimizer.java | 3 | 3 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/catalog/IcebergTable.java | 2 | 2 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/qe/scheduler/slot/QueryQueueOptions.java | 8 | 8 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/DeletePlanner.java | 4 | 4 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/optimizer/OptExpression.java | 6 | 6 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/common/Config.java | 2 | 2 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/optimizer/rule/tree/JsonPathRewriteRule.java | 5 | 5 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/optimizer/rule/transformation/ListPartitionPruner.java | 15 | 15 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/UpdatePlanner.java | 5 | 5 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/optimizer/rewrite/scalar/FoldConstantsRule.java | 1 | 1 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/catalog/ResourceGroupMgr.java | 2 | 2 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/qe/scheduler/slot/PipelineDriverAllocator.java | 18 | 18 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/optimizer/dump/QueryDumpSerializer.java | 1 | 1 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/catalog/Column.java | 1 | 1 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/optimizer/LogicalPlanPrinter.java | 15 | 15 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/planner/PlanFragment.java | 5 | 5 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/planner/HiveTableSink.java | 1 | 1 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/optimizer/operator/OperatorType.java | 2 | 2 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/InsertPlanner.java | 5 | 5 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/optimizer/rule/tree/exprreuse/ScalarOperatorsReuse.java | 8 | 8 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/optimizer/dump/QueryDumpDeserializer.java | 22 | 22 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/qe/GlobalVariable.java | 2 | 2 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/scheduler/mv/MVRefreshPartitionSelector.java | 2 | 2 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/analyzer/ResourceGroupAnalyzer.java | 1 | 1 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/catalog/HiveTable.java | 8 | 8 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/planner/expression/ExprToThrift.java | 2 | 2 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/catalog/FunctionSet.java | 2 | 2 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/optimizer/cost/HashJoinCostModel.java | 8 | 8 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/optimizer/operator/ScanOperatorPredicates.java | 7 | 7 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/lake/vacuum/FullVacuumDaemon.java | 2 | 2 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/optimizer/rewrite/OptExternalPartitionPruner.java | 1 | 1 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/statistic/StatisticExecutor.java | 6 | 6 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/optimizer/rule/tree/lowcardinality/DecodeCollector.java | 4 | 4 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/optimizer/statistics/ColumnStatistic.java | 20 | 20 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/system/ComputeNode.java | 4 | 4 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/LoadPlanner.java | 7 | 7 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/connector/hive/PartitionUpdate.java | 1 | 1 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/optimizer/dump/QueryDumpInfo.java | 22 | 22 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/optimizer/operator/physical/PhysicalTopNOperator.java | 1 | 1 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/catalog/TableProperty.java | 7 | 7 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/common/util/PropertyAnalyzer.java | 2 | 2 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/qe/scheduler/assignment/FragmentAssignmentStrategyFactory.java | 1 | 1 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/analyzer/PolymorphicFunctionAnalyzer.java | 2 | 2 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/optimizer/OptExpressionVisitor.java | 2 | 2 | 100.00% | [] |
| :large_blue_circle: | com/starrocks/sql/optimizer/cost/feature/FeatureExtractor.java | 7 | 7 | 100.00% | [] |
[Java-Extensions Incremental Coverage Report]
:white_check_mark: pass : 0 / 0 (0%)
[BE Incremental Coverage Report]
:white_check_mark: pass : 0 / 0 (0%)
@cursor review
@Mergifyio backport branch-4.0
backport branch-4.0
โ Backports have been created
- #66843 [Enhancement] Add label-formatted statistics support to QueryDumpDeserializer (backport #66656) has been created for branch
branch-4.0