starrocks
starrocks copied to clipboard
[BugFix] Fix mv refresh bug for queries with union all
Why I'm doing:
MV with union all 's result is not correct if
CREATE TABLE `t1` (
`k1` date not null,
`k2` datetime not null,
`k3` char(20),
`k4` varchar(20),
`k5` boolean,
`k6` tinyint,
`k7` smallint,
`k8` int,
`k9` bigint,
`k10` largeint,
`k11` float,
`k12` double,
`k13` decimal(27,9) )
DUPLICATE KEY(`k1`, `k2`, `k3`, `k4`, `k5`)
PARTITION BY date_trunc('day', k1)
DISTRIBUTED BY HASH(`k1`, `k2`, `k3`) BUCKETS 3
PROPERTIES ("storage_format" = "v2" );
INSERT INTO t1 VALUES
('2020-10-22','2020-10-23 12:12:12','k3','k4',0,1,2,3,4,5,1.1,1.12,2.889),
('2020-10-23','2020-10-24 12:12:12','k3','k4',0,0,2,3,4,5,1.1,1.12,2.889),
('2020-10-24','2020-10-25 12:12:12','k3','k4',0,1,2,3,4,5,1.1,1.12,2.889);
CREATE TABLE `t2` (
`k1` date not null,
`k2` datetime not null,
`k3` char(20),
`k4` varchar(20),
`k5` boolean,
`k6` tinyint,
`k7` smallint,
`k8` int,
`k9` bigint,
`k10` largeint,
`k11` float,
`k12` double,
`k13` decimal(27,9) )
DUPLICATE KEY(`k1`, `k2`, `k3`, `k4`, `k5`)
PARTITION BY date_trunc('day', k1)
DISTRIBUTED BY HASH(`k1`, `k2`, `k3`) BUCKETS 3
PROPERTIES ("storage_format" = "v2" );
INSERT INTO t2 VALUES
('2020-10-10','2020-10-23 12:12:12','k3','k4',0,1,2,3,4,5,1.1,1.12,2.889),
('2020-10-11','2020-10-24 12:12:12','k3','k4',0,0,2,3,4,5,1.1,1.12,2.889),
('2020-10-12','2020-10-25 12:12:12','k3','k4',0,1,2,3,4,5,1.1,1.12,2.889),
('2020-10-21','2020-10-24 12:12:12','k3','k4',0,0,2,3,4,5,1.1,1.12,2.889),
('2020-10-22','2020-10-25 12:12:12','k3','k4',0,1,2,3,4,5,1.1,1.12,2.889);
CREATE MATERIALIZED VIEW IF NOT EXISTS test_mv1
PARTITION BY date_trunc('day', `k1`)
DISTRIBUTED BY HASH(`k1`)
REFRESH ASYNC
as
select * from t1
union all
select * from t2;
mysql> select k1, count(1) from t1 group by k1 order by k1;
+------------+----------+
| k1 | count(1) |
+------------+----------+
| 2020-10-22 | 1 |
| 2020-10-23 | 1 |
| 2020-10-24 | 1 |
+------------+----------+
3 rows in set (0.46 sec)
mysql> select k1, count(1) from t2 group by k1 order by k1;
+------------+----------+
| k1 | count(1) |
+------------+----------+
| 2020-10-10 | 1 |
| 2020-10-11 | 1 |
| 2020-10-12 | 1 |
| 2020-10-21 | 1 |
| 2020-10-22 | 1 |
+------------+----------+
5 rows in set (0.53 sec)
mysql> select k1, count(1) from test_mv1 group by k1 order by k1;
+------------+----------+
| k1 | count(1) |
+------------+----------+
| 2020-10-22 | 2 |
+------------+----------+
1 row in set (0.30 sec)
What I'm doing:
PR(https://github.com/StarRocks/starrocks/pull/42950/files) introduce a bug: we should not calculate partition diff by using total ref base tables partition and mv's partitions instead of merging each diffs because the diff calculate will delete all partitions for each base table's input and mv's partitions which will introduce the bug.
for (Map.Entry<Table, Column> entry : partitionTableAndColumn.entrySet()) {
Table refBaseTable = entry.getKey();
Column refBaseTablePartitionColumn = entry.getValue();
// Collect the ref base table's partition range map.
refBaseTablePartitionMap.put(refBaseTable, PartitionUtil.getPartitionKeyRange(
refBaseTable, refBaseTablePartitionColumn, partitionExpr));
// To solve multi partition columns' problem of external table, record the mv partition name to all the same
// partition names map here.
if (!refBaseTable.isNativeTableOrMaterializedView()) {
refBaseTableMVPartitionMap.put(refBaseTable,
PartitionUtil.getMVPartitionNameMapOfExternalTable(refBaseTable,
refBaseTablePartitionColumn, PartitionUtil.getPartitionNames(refBaseTable)));
}
Column partitionColumn = (materializedView.getPartitionInfo()).getPartitionColumns().get(0);
PartitionDiffer differ = PartitionDiffer.build(materializedView, partitionRange);
rangePartitionDiffList.add(PartitionUtil.getPartitionDiff(partitionExpr, partitionColumn,
refBaseTablePartitionMap.get(refBaseTable), mvRangePartitionMap, differ));
}
Fixes #issue
What type of PR is this:
- [x] BugFix
- [ ] Feature
- [ ] Enhancement
- [ ] Refactor
- [ ] UT
- [ ] Doc
- [ ] Tool
Does this PR entail a change in behavior?
- [ ] Yes, this PR will result in a change in behavior.
- [x] No, this PR will not result in a change in behavior.
If yes, please specify the type of change:
- [ ] Interface/UI changes: syntax, type conversion, expression evaluation, display information
- [ ] Parameter changes: default values, similar parameters but with different default values
- [ ] Policy changes: use new policy to replace old one, functionality automatically enabled
- [ ] Feature removed
- [ ] Miscellaneous: upgrade & downgrade compatibility, etc.
Checklist:
- [x] I have added test cases for my bug fix or my new feature
- [ ] This pr needs user documentation (for new or modified features or behaviors)
- [ ] I have added documentation for my new feature or new function
- [ ] This is a backport pr
Bugfix cherry-pick branch check:
- [x] I have checked the version labels which the pr will be auto-backported to the target branch
- [x] 3.3
- [ ] 3.2
- [ ] 3.1
- [ ] 3.0
- [ ] 2.5
@mergify rebase
rebase
✅ Branch has been successfully rebased
@mergify rebase
rebase
✅ Nothing to do for rebase action
@mergify rebase
rebase
✅ Branch has been successfully rebased
Quality Gate passed
Issues
18 New issues
0 Accepted issues
Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code
[BE Incremental Coverage Report]
:white_check_mark: pass : 0 / 0 (0%)
[FE Incremental Coverage Report]
:white_check_mark: pass : 62 / 69 (89.86%)
file detail
path | covered_line | new_line | coverage | not_covered_line_detail | |
---|---|---|---|---|---|
:large_blue_circle: | com/starrocks/scheduler/PartitionBasedMvRefreshProcessor.java | 5 | 7 | 71.43% | [1253, 1259] |
:large_blue_circle: | com/starrocks/sql/common/PartitionDiffer.java | 42 | 47 | 89.36% | [313, 314, 318, 319, 320] |
:large_blue_circle: | com/starrocks/sql/common/MvPartitionDiffResult.java | 6 | 6 | 100.00% | [] |
:large_blue_circle: | com/starrocks/sql/common/SyncPartitionUtils.java | 6 | 6 | 100.00% | [] |
:large_blue_circle: | com/starrocks/server/LocalMetastore.java | 1 | 1 | 100.00% | [] |
:large_blue_circle: | com/starrocks/catalog/MvRefreshArbiter.java | 1 | 1 | 100.00% | [] |
:large_blue_circle: | com/starrocks/sql/common/RangePartitionDiff.java | 1 | 1 | 100.00% | [] |
@Mergifyio backport branch-3.3
backport branch-3.3
✅ Backports have been created
-
#45003 [BugFix] Fix mv refresh bug for queries with union all (backport #44629) has been created for branch
branch-3.3
but encountered conflicts