firebird Sub-optimal plan for nested loops: fullscan occurs if column name is not prefixed by alias when joining with USING() clause [CORE5040]

Submitted by: @pavel-zotov

Consider following script:

--create sequence g;
alter sequence g restart with 0;
recreate table t1(x int, y int);
recreate table t2(x int);

insert into t1
select mod(i,3), i
from (
    select gen_id(g,1) i
    from rdb$types a, rdb$types b, (select 1 i from rdb$types rows 20)
    union all
    select 1 from rdb$database where 1=0 -- "materialization": prevent from repeating evaluation gen_id() in the outer part of query
    rows 1000000
);

insert into t2
select mod(i, 20)
from (
    select gen_id(g,1) i
    from rdb$types
    union all
    select 1 from rdb$database where 1=0
    rows 20
);
commit;

create index t1_x_y on t1(x, y);
create index t2_x on t2(x);
commit;

This script will produce table T1 with 1'000'000 rows and T2 with 20 rows, and their index statistics will be:

SQL> set list on; select rdb$relation_name,rdb$index_name,rdb$statistics from rdb$indices where rdb$relation_name in ('T1','T2');

RDB$RELATION_NAME               T1
RDB$INDEX_NAME                  T1_X_Y
RDB$STATISTICS                  9\.999999974752427e-07

RDB$RELATION_NAME               T2
RDB$INDEX_NAME                  T2_X
RDB$STATISTICS                  0.05000000074505806

Now run these queries:

RUN-1

select count(*)
from t1 a join t2 b on a.x = b.x
where b.x = 0;

RUN-2

select count(*)
from t2 b left join t1 a using(x)
where x=0; -- yes, unqualified (neither  "a" nor "b" prefix here)

RUN-3

select count(*)
from t1 a join t2 b using(x)
where x = 0; -- yes, unqualified (neither  "a" nor "b" prefix here)

Then:

Compare plan of RUN-1 and RUN-2 & RUN-3: the 2nd and 3rd will use NATURAL reads which leads to poor performance
Compare performance statistics on 2.5 and 3.0: the former WINS

I got following data for mentioned queries:

RUN-1

WI-V2.5.5.26952:

select count(*)
from t1 a join t2 b on a.x = b.x
where b.x = 0

PLAN JOIN (B INDEX (T2_X), A INDEX (T1_X_Y))
1 records fetched
    919 ms, 13500 read(s), 667169 fetch(es)

Table                             Natural     Index
****************************************************
T1                                           333333
T2                                                1

WI-V3.0.0.32208:

select count(*)
from t1 a join t2 b on a.x = b.x
where b.x = 0

Select Expression
    -> Aggregate
        ->  Nested Loop Join (inner)
            -> Filter
                -> Table "T2" as "B" Access By ID
                    -> Bitmap
                        -> Index "T2_X" Range Scan (full match)
            -> Filter
                -> Table "T1" as "A" Access By ID
                    -> Bitmap
                        -> Index "T1_X_Y" Range Scan (partial match: 1/2)
1 records fetched
   1164 ms, 6702 read(s), 666918 fetch(es) -- elapsed time more than by 25% exceeds 2.5. Why ?

Table                             Natural     Index
****************************************************
T1                                           333333
T2                                                1

RUN-2

WI-V2.5.5.26952:

select count(*)
from t2 b left join t1 a using(x)
where x=0

PLAN JOIN (B NATURAL, A INDEX (T1_X_Y)) -- natural here is EXPECTED because of LEFT join
1 records fetched
   2779 ms, 40516 read(s), 2001616 fetch(es)

Table                             Natural     Index
****************************************************
T1                                          1000000
T2                                     20

WI-V3.0.0.32208:

select count(*)
from t2 b left join t1 a using(x)
where x=0

Select Expression
    -> Aggregate
        -> Filter
            ->  Nested Loop Join (outer)
                -> Table "T2" as "B" Full Scan
                -> Filter
                    -> Table "T1" as "A" Access By ID
                        -> Bitmap
                            -> Index "T1_X_Y" Range Scan (partial match: 1/2)
1 records fetched
   3651 ms, 20120 read(s), 2000840 fetch(es) -- elapsed time more than by 30% exceeds 2.5. Why ?

Table                             Natural     Index
****************************************************
T1                                          1000000
T2                                     20

RUN-3

WI-V2.5.5.26952:

select count(*)
from t1 a join t2 b using(x)
where x = 0

PLAN JOIN (A NATURAL, B INDEX (T2_X)) -- why NATURAL here ??
1 records fetched
   7304 ms, 13001 read(s), 6025990 fetch(es)

Table                             Natural     Index
****************************************************
T1                                1000000
T2                                          1000000

WI-V3.0.0.32208:

select count(*)
from t1 a join t2 b using(x)
where x = 0

Select Expression
    -> Aggregate
        ->  Nested Loop Join (inner)
            -> Table "T1" as "A" Full Scan -- why NATURAL here ??
            -> Filter
                -> Table "T2" as "B" Access By ID
                    -> Bitmap
                        -> Index "T2_X" Range Scan (full match)
1 records fetched
   9895 ms, 4420 read(s), 6012908 fetch(es) -- elapsed time here exceeds 2.5 more than by 35%. Why ?

Table                             Natural     Index
****************************************************
T1                                1000000
T2                                          1000000

PS. Adding prefix from alias in RUN-3:

select count(*)
from t1 a join t2 b using(x)
where b.x = 0;

turn index usage ON and plan will be: PLAN JOIN (B INDEX (T2_X), A INDEX (T1_X_Y)) -- i.e. most efficient for this query

PPS. This query:

select mod(i,3), i
from (
    select gen_id(g,1) i
    from rdb$types a, rdb$types b, (select 1 i from rdb$types rows 20)
    union all
    select 1 from rdb$database where 1=0 -- "materialization": prevent from repeating evaluation gen_id() in the outer part of query
    rows 1000000
);

DOES require rdb$types be aliased (as "a", "b") in 2.5 (otherwise get "alias RDB$TYPES conflicts with an alias in the same statement"), but not in 3.0. Is this expected behaviour?

Dec 09 '15 18:12 firebird-automations

Modified by: @dyemanov

summary: Performance degradation on nested loops when joining big table "A" and small "B" on indexed field 'X' which is STARTING part of compound index {"X", "Y"} of "A". Natural scans occur if "X" is not prefixed by alias on join with USING() clause. => Sub-optimal plan for nested loops: fullscan occurs if column name is not prefixed by alias when joining with USING() clause

Dec 10 '15 09:12 firebird-automations

Commented by: @dyemanov

I'm not sure this is a bug, more or less "as designed". An implicit join replaces X (unqualified) with COALESCE(A.X, B.X) -- this is a handy trick to workaround "ambigous column reference" error. But COALESCE(A.X, B.X) cannot be evaluated for a single stream (be it A or B), hence a fullscan. The optimizer is not aware of this trick and cannot detect that A.X is actually equvalent to B.X inside COALESCE. And I don't see how it can be improved.

Apr 03 '16 13:04 firebird-automations

firebird firebird copied to clipboard

Sub-optimal plan for nested loops: fullscan occurs if column name is not prefixed by alias when joining with USING() clause [CORE5040]

firebird
firebird copied to clipboard