matrixone
[Tech Request]: delay Ranges for partitioned table
Is there an existing issue for the same tech request?
- [X] I have checked the existing issues.
Does this tech request not affect user experience?
- [X] This tech request doesn't affect user experience.
What would you like to be added ?
`Ranges` should be called as late as the point where runtime filters are handled, if any exist. This deferral is already implemented for tables without partitions. It should also be implemented for partitioned tables.
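To make the request concrete, here is a minimal sketch of the idea, not MatrixOne's actual code. All type and function names (`Block`, `RuntimeFilter`, `Partition`, `LazyScan`, `Resolve`) are hypothetical, and the zone-map pruning is an assumed simplification. The point is only that block selection for every partition happens after runtime filters are known, rather than at compile time.

```go
package main

// Block is a hypothetical storage block with a min/max zone map on the key.
type Block struct {
	ID       int
	Min, Max int64
}

// RuntimeFilter is a hypothetical key set that only becomes known at execution time.
type RuntimeFilter struct {
	Keys []int64
}

// mayContain reports whether any key of the filter falls inside [b.Min, b.Max].
func (f RuntimeFilter) mayContain(b Block) bool {
	for _, k := range f.Keys {
		if k >= b.Min && k <= b.Max {
			return true
		}
	}
	return false
}

// Partition is a hypothetical partition that owns its own blocks.
type Partition struct {
	Name   string
	Blocks []Block
}

// Ranges returns the blocks of p that survive every filter.
// With no filters, all blocks are returned.
func (p Partition) Ranges(filters []RuntimeFilter) []Block {
	var out []Block
	for _, b := range p.Blocks {
		keep := true
		for _, f := range filters {
			if !f.mayContain(b) {
				keep = false
				break
			}
		}
		if keep {
			out = append(out, b)
		}
	}
	return out
}

// LazyScan stores only the partitions at plan time and defers the
// per-partition Ranges call until Resolve is invoked at run time.
type LazyScan struct {
	Partitions []Partition
}

// Resolve calls Ranges on each partition with the filters available now.
func (s LazyScan) Resolve(filters []RuntimeFilter) map[string][]Block {
	out := make(map[string][]Block, len(s.Partitions))
	for _, p := range s.Partitions {
		out[p.Name] = p.Ranges(filters)
	}
	return out
}
```

In this sketch, calling `Resolve(nil)` behaves like an eager `Ranges` call and returns every block, while calling it with a runtime filter prunes blocks in each partition before any data is read.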
Why is this needed ?
The DML plan has a bug in main (and the tagged branches). For prepared statements like
prepare stmt from insert into t1 values (?, ?, ?), (?, ?, ?), (?, ?, ?), (?, ?, ?)
whose parameter list is longer than 3 rows, the deduplication plan gives a BlockFilter `cpk in (null, null, null, null)`, because it generates the filter from a compile-time null vector. In theory this should break deduplication, i.e., it would never detect any duplicate keys. However, the current Reader code drops any pk filter that contains a null value. These two bugs cancel each other out. The result is still correct, which is why the problem was not discovered before.
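The way the two bugs cancel out can be illustrated with a small model. This is a sketch under stated assumptions, not the real planner or Reader: `foldedFilter`, `scan`, and `hasDuplicate` are hypothetical names, keys are plain integers, and `nil` stands in for SQL NULL.

```go
package main

// foldedFilter models a hypothetical "cpk in (...)" block filter.
// A nil entry stands for SQL NULL.
type foldedFilter struct {
	values []*int64
}

// hasNull reports whether the filter list contains a NULL.
func (f foldedFilter) hasNull() bool {
	for _, v := range f.values {
		if v == nil {
			return true
		}
	}
	return false
}

// strictMatch applies the filter as written: a key never equals NULL.
func (f foldedFilter) strictMatch(key int64) bool {
	for _, v := range f.values {
		if v != nil && *v == key {
			return true
		}
	}
	return false
}

// scan returns the stored keys that reach the deduplication check.
// dropNullFilters mimics the Reader behaviour described above: a filter
// containing NULL is discarded, so every stored key is returned.
func scan(stored []int64, f foldedFilter, dropNullFilters bool) []int64 {
	if dropNullFilters && f.hasNull() {
		return append([]int64(nil), stored...)
	}
	var out []int64
	for _, k := range stored {
		if f.strictMatch(k) {
			out = append(out, k)
		}
	}
	return out
}

// hasDuplicate reports whether any incoming key is among the scanned keys.
func hasDuplicate(scanned, incoming []int64) bool {
	seen := make(map[int64]bool, len(scanned))
	for _, k := range scanned {
		seen[k] = true
	}
	for _, k := range incoming {
		if seen[k] {
			return true
		}
	}
	return false
}
```

With a filter of four NULLs, `scan(..., false)` returns nothing and a real duplicate goes unnoticed, whereas `scan(..., true)` falls back to a full scan and the duplicate is still caught, at the cost of losing all block pruning.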
For performance reasons, we will not fix that bug directly. Instead, we should remove the prematurely folded BlockFilter and use a runtime filter. That fix depends on this task: without it, partitioned tables would become slower, whether the statement is prepared or not.
Additional information
related to #16178