tidb icon indicating copy to clipboard operation
tidb copied to clipboard

planner: the optimizer can't produce an optimal plan for queries with sub-queries in select list since it always decorrelate sub-queries

Open qw4990 opened this issue 4 months ago • 1 comments

Enhancement

See the case below:

create table t_small (a int, b int);
create table t_big (a int, b int, key(a));

explain select * from t_small join (select *, row_number() over (partition by t_big.a) as rn from t_big) t_big on t_small.a=t_big.a;
+----------------------------------------+----------+-----------+-------------------------+--------------------------------------------------------------------------------------------------+
| id                                     | estRows  | task      | access object           | operator info                                                                                    |
+----------------------------------------+----------+-----------+-------------------------+--------------------------------------------------------------------------------------------------+
| HashJoin_12                            | 12487.50 | root      |                         | inner join, equal:[eq(wdhis.t_small.a, wdhis.t_big.a)]                                           |
| ├─Window_17(Build)                     | 9990.00  | root      |                         | row_number()->Column#8 over(partition by wdhis.t_big.a rows between current row and current row) |
| │ └─Projection_21                      | 9990.00  | root      |                         | wdhis.t_big.a, wdhis.t_big.b                                                                     |
| │   └─IndexLookUp_20                   | 9990.00  | root      |                         |                                                                                                  |
| │     ├─IndexFullScan_18(Build)        | 9990.00  | cop[tikv] | table:t_big, index:a(a) | keep order:true, stats:pseudo                                                                    |
| │     └─TableRowIDScan_19(Probe)       | 9990.00  | cop[tikv] | table:t_big             | keep order:false, stats:pseudo                                                                   |
| └─TableReader_16(Probe)                | 9990.00  | root      |                         | data:Selection_15                                                                                |
|   └─Selection_15                       | 9990.00  | cop[tikv] |                         | not(isnull(wdhis.t_small.a))                                                                     |
|     └─TableFullScan_14                 | 10000.00 | cop[tikv] | table:t_small           | keep order:false, stats:pseudo                                                                   |
+----------------------------------------+----------+-----------+-------------------------+--------------------------------------------------------------------------------------------------+

A better plan could be (how Oracle executes it):

Apply(NestedLoopJoin)
    TableFullScan[build] (t_small)
    IndexRangeScan[probe] (t_big)

Then we can leverage the index to access t_big to avoid FullScan.

qw4990 avatar Feb 18 '24 08:02 qw4990