gporca icon indicating copy to clipboard operation
gporca copied to clipboard

Long plan generation utilising all available memory

Open darthunix opened this issue 5 years ago • 1 comments

Building an ORCA plan for query below utilises all available memory and lasts for minutes. The problem affects only ORCA (up to 3.90.1 - current master) and doesn't affect PostgreSQL planner.

create table t (a int) distributed by (a);

set optimizer = on;

explain select a in (
    select a from t as t1 where a in (
        select a from t as t2 where a in (
            select a from t as t3 where a in (
                select a from t as t4 join t as t5 using(a) group by t4.a  --Time: 1277,613 ms
                union
                select a from t as t4 join t as t5 using(a) group by t4.a  --Time: 15593,377 ms
                union
                select a from t as t4 join t as t5 using(a) group by t4.a  --Time: 116124,015 ms
            )
        )
    )
) from t;

An interesting aspect is that every "union" makes a query last about 10 times longer. 1 select - 1,2 sec 2 selects - 15,6 sec 3 selects - 116,1 sec

The flame graph reports that the the most CPU resources are utilised somewhere in Transform->CLogicalLeftOuterApply->...->DeriveOuterReferences function stack (but the reason of this problem is not clear to me).

darthunix avatar Feb 13 '20 03:02 darthunix

If you set the guc optimizer_print_optimization_stats=on, you can see which transforms and the number of bindings/alternatives generated.

CXformProject2ComputeScalar: 15 calls, 15 total bindings, 15 alternatives generated, 0ms
CXformExpandNAryJoin: 3 calls, 3 total bindings, 3 alternatives generated, 0ms
CXformExpandNAryJoinMinCard: 3 calls, 3 total bindings, 3 alternatives generated, 0ms
CXformExpandNAryJoinDP: 3 calls, 3 total bindings, 3 alternatives generated, 0ms
CXformGet2TableScan: 10 calls, 10 total bindings, 10 alternatives generated, 0ms
CXformSelect2Filter: 11 calls, 11 total bindings, 11 alternatives generated, 0ms
CXformSimplifySelectWithSubquery: 3 calls, 3 total bindings, 9 alternatives generated, 0ms
CXformSimplifyProjectWithSubquery: 1 calls, 1 total bindings, 3 alternatives generated, 0ms
CXformSelect2Apply: 6 calls, 6 total bindings, 12 alternatives generated, 0ms
CXformProject2Apply: 2 calls, 2 total bindings, 4 alternatives generated, 0ms
CXformInnerJoin2NLJoin: 16 calls, 16 total bindings, 16 alternatives generated, 0ms
CXformInnerJoin2HashJoin: 16 calls, 16 total bindings, 16 alternatives generated, 0ms
CXformLeftOuterApply2LeftOuterJoin: 5 calls, 433071 total bindings, 136085 alternatives generated, 35651ms
CXformImplementLeftOuterCorrelatedApply: 5 calls, 5 total bindings, 5 alternatives generated, 0ms
CXformLeftOuterJoin2NLJoin: 1 calls, 1 total bindings, 1 alternatives generated, 0ms
CXformLeftSemiJoin2NLJoin: 3 calls, 3 total bindings, 3 alternatives generated, 0ms
CXformLeftSemiJoin2HashJoin: 3 calls, 3 total bindings, 3 alternatives generated, 0ms
CXformGbAgg2HashAgg: 24 calls, 24 total bindings, 24 alternatives generated, 0ms
CXformGbAgg2StreamAgg: 30 calls, 30 total bindings, 30 alternatives generated, 0ms
CXformGbAgg2ScalarAgg: 4 calls, 4 total bindings, 4 alternatives generated, 0ms
CXformUnion2UnionAll: 1 calls, 1 total bindings, 1 alternatives generated, 0ms
CXformImplementUnionAll: 1 calls, 1 total bindings, 1 alternatives generated, 0ms
CXformJoinCommutativity: 8 calls, 8 total bindings, 8 alternatives generated, 0ms
CXformLeftSemiJoin2InnerJoin: 3 calls, 3 total bindings, 3 alternatives generated, 0ms
CXformSimplifyGbAgg: 1 calls, 1 total bindings, 1 alternatives generated, 0ms
CXformPushGbBelowJoin: 2 calls, 4 total bindings, 2 alternatives generated, 0ms
CXformSplitGbAgg: 10 calls, 10 total bindings, 10 alternatives generated, 0ms
CXformImplementLeftSemiCorrelatedApplyIn: 3 calls, 3 total bindings, 3 alternatives generated, 0ms
CXformLeftSemiApplyIn2LeftSemiJoin: 3 calls, 23345 total bindings, 9075 alternatives generated, 2131ms
CXformExpandNAryJoinGreedy: 3 calls, 3 total bindings, 3 alternatives generated, 0ms

CXformLeftOuterApply2LeftOuterJoin and CXformLeftSemiApplyIn2LeftSemiJoin are getting binded and generating way too many alternatives in this case.

https://github.com/greenplum-db/gporca/commit/80b960871a7976931961025a3df8e9a8868b0d71 and https://github.com/greenplum-db/gporca/commit/7554f88f31a85d731402666450be14883674d802 looks like they attempt to fix similar issues.

Some other helpful gucs that we use when debugging issues like this are optimizer_print_xform, optimizer_print_xform_results, optimizer_print_memo_after_optimization, and optimizer_print_optimization_context (these can be very verbose).

chrishajas avatar Feb 13 '20 18:02 chrishajas