[core] Support converting TopN to limit for primary-key tables in deletion-vector mode
Purpose
The primary key is NOT NULL, unique, and naturally sorted in ascending order. When deletion-vector mode is enabled, the following rules allow a TopN to be converted to a limit (a minimal sketch of this check follows the list):
- The leading sort keys must match the primary keys in order, and the sort direction must be the same across them.
- If non-primary-key columns are included, all the primary keys must match in order first.
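To make the rule concrete, here is a minimal Java sketch of the eligibility check; the class and method names are hypothetical and are not Paimon's actual API.

```java
import java.util.List;

/** Minimal sketch of the TopN-to-limit eligibility check (hypothetical names, not Paimon's actual API). */
final class TopNToLimitRule {

    /** One ORDER BY field: column name plus sort direction. */
    record SortField(String name, boolean ascending) {}

    /**
     * Returns true if ORDER BY ... LIMIT n over these fields can be answered by reading
     * only the first (or last, for DESC) n live rows of each primary-key-sorted data file.
     */
    static boolean canConvert(List<SortField> orderBy, List<String> trimmedPrimaryKeys) {
        if (orderBy.isEmpty()) {
            return false;
        }
        // All primary-key sort fields must use the same direction (all ASC or all DESC).
        boolean direction = orderBy.get(0).ascending();
        int prefix = Math.min(orderBy.size(), trimmedPrimaryKeys.size());
        for (int i = 0; i < prefix; i++) {
            SortField field = orderBy.get(i);
            if (!field.name().equals(trimmedPrimaryKeys.get(i)) || field.ascending() != direction) {
                // Either a non-primary key appears before all primary keys are matched,
                // or the keys are out of order, or the direction flips: no conversion.
                return false;
            }
        }
        // A prefix of the primary key is enough; once all trimmed primary keys are matched,
        // trailing non-primary-key fields cannot change the order (the primary key is unique).
        return true;
    }
}
```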
e.g. partition key: pt; trimmed primary keys: c1, c2; non-primary key: n1
```sql
create table T (
    c1 int not null,
    c2 int not null,
    n1 int,
    pt int not null,
    PRIMARY KEY (pt, c1, c2)
) WITH (
    'deletion-vector-enabled' = 'true'
);
```
TopN without partition keys:
| position | c1 | c2 | n1 |
|---|---|---|---|
| 0 | 10 | 100 | 2 |
| 1 | 10 | 200 | 1 |
| 2 | 20 | 100 | 2 |
| 3 | 20 | 200 | 1 |
```sql
-- returns position [0]
ORDER BY c1 LIMIT 1
ORDER BY c1, c2 LIMIT 1
-- returns position [3]
ORDER BY c1 DESC LIMIT 1
ORDER BY c1 DESC, c2 DESC LIMIT 1
-- the non-primary key `n1` can be ignored if all the primary keys match (because the primary key is unique)
-- returns position [0]
ORDER BY c1, c2, n1 ASC LIMIT 1
ORDER BY c1, c2, n1 DESC LIMIT 1
-- returns position [3]
ORDER BY c1 DESC, c2 DESC, n1 ASC LIMIT 1
ORDER BY c1 DESC, c2 DESC, n1 DESC LIMIT 1
-- the following examples will NOT convert TopN to limit:
-- no match: the sort direction is not uniform across the primary keys
ORDER BY c1, c2 DESC LIMIT 1
ORDER BY c1 DESC, c2 LIMIT 1
-- no match: if a non-primary key is included, all the trimmed primary keys must match in order first
ORDER BY c1, n1 LIMIT 1
```
TopN with partition keys:
```sql
-- no match: not supported at this time
ORDER BY pt, c1, c2 LIMIT 1
-- match: the sort uses only trimmed primary keys
WHERE pt = 1 ORDER BY c1, c2 LIMIT 1
```
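Running the sketch from above against these partition-key cases (hypothetical usage, assuming the `TopNToLimitRule` class from the earlier snippet):

```java
import java.util.List;

/** Usage of the eligibility sketch against the partition-key examples above (hypothetical). */
final class PartitionKeyExamples {
    public static void main(String[] args) {
        List<String> trimmedPk = List.of("c1", "c2");

        // ORDER BY pt, c1, c2 LIMIT 1 -> the sort starts with the partition key, so no conversion.
        System.out.println(TopNToLimitRule.canConvert(
                List.of(new TopNToLimitRule.SortField("pt", true),
                        new TopNToLimitRule.SortField("c1", true),
                        new TopNToLimitRule.SortField("c2", true)),
                trimmedPk)); // prints false

        // WHERE pt = 1 ORDER BY c1, c2 LIMIT 1 -> only trimmed primary keys remain in the sort.
        System.out.println(TopNToLimitRule.canConvert(
                List.of(new TopNToLimitRule.SortField("c1", true),
                        new TopNToLimitRule.SortField("c2", true)),
                trimmedPk)); // prints true
    }
}
```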
Tests
API and Format
Documentation
I remember that when DV is turned on, the returned rows are out of order, not sorted by primary key.
Why would it be out of order? IMO, each single DataFile is sorted by the primary keys, and the deletion vector only marks its deleted row ids.
The main idea of this PR is to convert a TopN over the primary keys into a limit when reading a single DataFile; the compute engine (e.g. Apache Spark) then performs the global TopN.
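As a toy illustration of why this is safe (a sketch, not the actual Paimon reader code): each data file is sorted by primary key and the deletion vector only removes rows, so the global top n rows are guaranteed to be among the first n live rows of each file; the engine then merges the per-file candidates and keeps the global top n.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Comparator;
import java.util.List;

/** Toy model of per-file limit pushdown plus engine-side global TopN (not the actual reader API). */
final class TopNPushdownSketch {

    /** A data file whose rows are sorted by primary key, with a deletion vector of deleted positions. */
    record SortedFile(List<long[]> rowsSortedByPk, BitSet deletionVector) {}

    /** Reads at most {@code limit} live rows from one file; correct because the file is PK-sorted. */
    static List<long[]> readWithLimit(SortedFile file, int limit) {
        List<long[]> out = new ArrayList<>();
        for (int pos = 0; pos < file.rowsSortedByPk().size() && out.size() < limit; pos++) {
            if (!file.deletionVector().get(pos)) { // skip rows masked by the deletion vector
                out.add(file.rowsSortedByPk().get(pos));
            }
        }
        return out;
    }

    /** The compute engine (e.g. Spark) still performs the global TopN over the per-file candidates. */
    static List<long[]> globalTopN(List<SortedFile> files, int n, Comparator<long[]> order) {
        List<long[]> candidates = new ArrayList<>();
        for (SortedFile file : files) {
            candidates.addAll(readWithLimit(file, n));
        }
        candidates.sort(order);
        return candidates.subList(0, Math.min(n, candidates.size()));
    }
}
```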
I see; for a single data file, that is true.
Also, can you add a benchmark for this?
```
Running benchmark: limit
  Running case: pushdown-disable-1
  Stopped after 1 iterations, 327 ms
  Running case: pushdown-enable-1
  Stopped after 1 iterations, 17 ms

OpenJDK 64-Bit Server VM 1.8.0_442-b06 on Mac OS X 15.6.1
Apple M4 Pro
limit:                                    Best/Avg Time(ms)    Row Rate(K/s)    Per Row(ns)    Relative
--------------------------------------------------------------------------------------------------------
OPERATORTEST_limit_pushdown-disable-1             328 / 328           4573.5          218.6        1.0X
OPERATORTEST_limit_pushdown-enable-1                18 / 18          84730.0           11.8       18.5X

Running benchmark: TopN
  Running case: pushdown-disable-1
  Stopped after 1 iterations, 1224 ms
  Running case: pushdown-enable-1
  Stopped after 1 iterations, 71 ms

OpenJDK 64-Bit Server VM 1.8.0_442-b06 on Mac OS X 15.6.1
Apple M4 Pro
TopN:                                     Best/Avg Time(ms)    Row Rate(K/s)    Per Row(ns)    Relative
--------------------------------------------------------------------------------------------------------
OPERATORTEST_TopN_pushdown-disable-1            1224 / 1224           1225.2          816.2        1.0X
OPERATORTEST_TopN_pushdown-enable-1                 72 / 72          20919.2           47.8       17.1X
```
@JingsongLi Thanks a lot for your feedback! Sorry, I'm busy these days; I will update next week.
Sure, please feel free to update according to your schedule.