gpdb
gpdb copied to clipboard
Fixed bug in legacy optimizer convert_EXPR_to_join (#14107) :
In GPDB, expression subquery is promoted to join, e.g. select * from T where a > (select 10*avg(x) from R where T.b=R.y); in postgres, the plan is:
QUERY PLAN
-----------------------------------------------------------------
Seq Scan on t (cost=0.00..80364.30 rows=753 width=8)
Filter: ((a)::numeric > (SubPlan 1))
SubPlan 1
-> Aggregate (cost=35.53..35.54 rows=1 width=32)
-> Seq Scan on r (cost=0.00..35.50 rows=10 width=4)
Filter: (t.b = y)
while in GPDB, the subquery is promoted to join:
QUERY PLAN
---------------------------------------------------------------------------------------------
Gather Motion 3:1 (slice1; segments: 3)
-> Hash Join (cost=477.00..969.11 rows=9567 width=8)
Hash Cond: (t.b = "Expr_SUBQUERY".csq_c0)
Join Filter: ((t.a)::numeric > "Expr_SUBQUERY".csq_c1)
-> Seq Scan on t (cost=0.00..321.00 rows=28700 width=8)
-> Hash (cost=472.83..472.83 rows=333 width=36)
-> Subquery Scan on "Expr_SUBQUERY"
-> HashAggregate
Group Key: r.y
-> Seq Scan on r
Optimizer: Postgres query optimizer
But if the T.b and R.y are different type, it will report error like: ERROR: operator 37 is not a valid ordering operator (pathkeys.c:579)
Because convert_EXPR_to_join will pass the sort operator and equal operator to the upper-level optimizer. For example, mergejoin requires sort, but the sort operator with inconsistent left and right types is invalid, so an error will be reported.
Fix https://github.com/greenplum-db/gpdb/issues/14107
Thanks for your PR and the reproduce SQL in your script.
I have played with the issue for just a little time, and give the following patch (just demo play patch) to get a join plan:
diff --git a/src/backend/cdb/cdbsubselect.c b/src/backend/cdb/cdbsubselect.c
index 4b2f1008179..b0b35b9a6dc 100644
--- a/src/backend/cdb/cdbsubselect.c
+++ b/src/backend/cdb/cdbsubselect.c
@@ -424,6 +424,9 @@ SubqueryToJoinWalker(Node *node, ConvertSubqueryToJoinContext *context)
{
TargetEntry *tle;
SortGroupClause *gc;
+ Oid eqOp = InvalidOid;
+ Oid sortOp = InvalidOid;
+ bool hashable = false;
tle = makeTargetEntry(innerExpr,
list_length(context->targetList) + 1,
@@ -432,6 +435,10 @@ SubqueryToJoinWalker(Node *node, ConvertSubqueryToJoinContext *context)
tle->ressortgroupref = list_length(context->targetList) + 1;
context->targetList = lappend(context->targetList, tle);
+ get_sort_group_operators(exprType((Node *) tle->expr),
+ false, true, false,
+ &sortOp, &eqOp, NULL, &hashable);
+
gc = makeNode(SortGroupClause);
gc->tleSortGroupRef = list_length(context->groupClause) + 1;
gc->eqop = eqOp;
The idea is:
we want to rewrite the SQL to
-- select * from T where a > (select 10*avg(x) from R where T.b=R.y);
select T.* from T (semi join)
(
select R.y, 10*avg(x) s from R group by R.y
) RR
where T.a > RR.s;
So @lmzzzzz1 several advice from me to continue to study and fix the issue:
- Look at and debug into the above rewrited-SQL to see what should be in the groupclause
- read and debug code in
convert_EXPR_to_join
-->SubqueryToJoinWalker
and see what should be there - PR should be with test cases, you can refer to other commits with test cases
If you have any questions, feel free to discuss here or ping me see if we can help together fix the bug.
Thanks again!
Thanks for your PR and the reproduce SQL in your script.
I have played with the issue for just a little time, and give the following patch (just demo play patch) to get a join plan:
diff --git a/src/backend/cdb/cdbsubselect.c b/src/backend/cdb/cdbsubselect.c index 4b2f1008179..b0b35b9a6dc 100644 --- a/src/backend/cdb/cdbsubselect.c +++ b/src/backend/cdb/cdbsubselect.c @@ -424,6 +424,9 @@ SubqueryToJoinWalker(Node *node, ConvertSubqueryToJoinContext *context) { TargetEntry *tle; SortGroupClause *gc; + Oid eqOp = InvalidOid; + Oid sortOp = InvalidOid; + bool hashable = false; tle = makeTargetEntry(innerExpr, list_length(context->targetList) + 1, @@ -432,6 +435,10 @@ SubqueryToJoinWalker(Node *node, ConvertSubqueryToJoinContext *context) tle->ressortgroupref = list_length(context->targetList) + 1; context->targetList = lappend(context->targetList, tle); + get_sort_group_operators(exprType((Node *) tle->expr), + false, true, false, + &sortOp, &eqOp, NULL, &hashable); + gc = makeNode(SortGroupClause); gc->tleSortGroupRef = list_length(context->groupClause) + 1; gc->eqop = eqOp;
The idea is:
we want to rewrite the SQL to
-- select * from T where a > (select 10*avg(x) from R where T.b=R.y); select T.* from T (semi join) ( select R.y, 10*avg(x) s from R group by R.y ) RR where T.a > RR.s;
So @lmzzzzz1 several advice from me to continue to study and fix the issue:
- Look at and debug into the above rewrited-SQL to see what should be in the groupclause
- read and debug code in
convert_EXPR_to_join
-->SubqueryToJoinWalker
and see what should be there- PR should be with test cases, you can refer to other commits with test cases
If you have any questions, feel free to discuss here or ping me see if we can help together fix the bug.
Thanks again!
@kainwen Sorry for taking so long to reply, thank you for your fix/debug suggestion, I updated a version of the commit, and added a test case to ensure that the join plan is generated.
Thanks. Will spend time in reviewing this later.
Hi, thanks for your contribution!
After playing with your patch, I think a better approach to fix the issue is using the innerExpr
's type to determine the sortOp
and eqOp
.
-
When generating path key for
SortGroupClause
,get_ordering_op_properties()
requires the type of the left and right operands being consistent.See: https://github.com/greenplum-db/gpdb/blob/815a602bda0ca52e88ff048b112e99f801917114/src/backend/utils/cache/lsyscache.c#L248-L261
-
SortGroupClause->tleSortGroupRef
is referencing the target entry created frominnerExpr
. Hence,SortGroupClause->eqop
andSortGroupClause->sortop
should be determined according toinnerExpr
's type.
Below is an updated patch.
diff --git a/src/backend/cdb/cdbsubselect.c b/src/backend/cdb/cdbsubselect.c
index 4b2f100817..8bcc97f68b 100644
--- a/src/backend/cdb/cdbsubselect.c
+++ b/src/backend/cdb/cdbsubselect.c
@@ -161,8 +161,7 @@ static bool
IsCorrelatedEqualityOpExpr(OpExpr *opexp, Expr **innerExpr, Oid *eqOp, Oid *sortOp, bool *hashable)
{
Oid opfamily;
- Oid ltype;
- Oid rtype;
+ Oid innerExprType;
List *l;
Assert(opexp);
@@ -171,11 +170,15 @@ IsCorrelatedEqualityOpExpr(OpExpr *opexp, Expr **innerExpr, Oid *eqOp, Oid *sort
Assert(eqOp);
Assert(sortOp);
+ if (!IsCorrelatedOpExpr(opexp, innerExpr))
+ return false;
+
+ innerExprType = exprType((Node *)*innerExpr);
/*
* If this is an expression of the form a = b, then we want to know about
* the vars involved.
*/
- if (!op_mergejoinable(opexp->opno, exprType(linitial(opexp->args))))
+ if (!op_mergejoinable(opexp->opno, innerExprType))
return false;
/*
@@ -190,24 +193,19 @@ IsCorrelatedEqualityOpExpr(OpExpr *opexp, Expr **innerExpr, Oid *eqOp, Oid *sort
list_free(l);
/*
- * Look up the correct sort operator from the chosen opfamily.
+ * Look up the correct equility/sort operators from the chosen opfamily.
*/
- ltype = exprType(linitial(opexp->args));
- rtype = exprType(lsecond(opexp->args));
- *eqOp = get_opfamily_member(opfamily, ltype, rtype, BTEqualStrategyNumber);
+ *eqOp = get_opfamily_member(opfamily, innerExprType, innerExprType, BTEqualStrategyNumber);
if (!OidIsValid(*eqOp)) /* should not happen */
elog(ERROR, "could not find member %d(%u,%u) of opfamily %u",
- BTEqualStrategyNumber, ltype, rtype, opfamily);
+ BTEqualStrategyNumber, innerExprType, innerExprType, opfamily);
- *sortOp = get_opfamily_member(opfamily, ltype, rtype, BTLessStrategyNumber);
+ *sortOp = get_opfamily_member(opfamily, innerExprType, innerExprType, BTLessStrategyNumber);
if (!OidIsValid(*sortOp)) /* should not happen */
elog(ERROR, "could not find member %d(%u,%u) of opfamily %u",
- BTLessStrategyNumber, ltype, rtype, opfamily);
+ BTLessStrategyNumber, innerExprType, innerExprType, opfamily);
- *hashable = op_hashjoinable(*eqOp, ltype);
-
- if (!IsCorrelatedOpExpr(opexp, innerExpr))
- return false;
+ *hashable = op_hashjoinable(*eqOp, innerExprType);
return true;
}
Thoughts?
Thanks for the reply, it seems that modifying IsCorrelatedEqualityOpExpr directly is a better fix, I updated a version of the commit
Hi, @kainwen @SmartKeyerror Could you please take another look? I think we can merge this PR.
@higuoxing That looks good to me.