gpdb
gpdb copied to clipboard
Motion nodes and loci should not be assigned in utility mode
Greenplum version or build
master as of 992cc6f8e83, did not test on 6X.
Expected behavior
In a psql session connected in utility mode to a segment postmaster, queries such as select * from pg_stat_user_tables;
should not cause assertion failure or crash when being planned.
Actual behavior
PGOPTIONS='-c gp_role=utility' psql -d postgres -p 7002
postgres=# select * from pg_stat_user_tables where relname='test';
FATAL: Unexpected internal error (nodeMotion.c:622)
DETAIL: FailedAssertion("!(node->motionID > 0)", File: "nodeMotion.c", Line: 622)
Step to reproduce the behavior
PGOPTIONS='-c gp_role=utility' psql -d postgres -p 7002
postgres=# select * from pg_stat_user_tables where relname='test';
Stack trace from the core:
(lldb) bt
* thread #1, stop reason = signal SIGSTOP
* f #0:libsystem_kernel.dylib __pthread_kill
f #1:libsystem_pthread.dylib pthread_kill
f #2:libsystem_c.dylib abort
f #3:postmaster ExceptionalCondition(conditionName="!(node->motionID > 0)", errorType="FailedAssertion", fileName="nodeMotion.c", lineNumber=622) at assert.c:66
f #4:postmaster ExecInitMotion(node=0x00007fc8a01d7b48, estate=0x00007fc89e01fa50, eflags=16) at nodeMotion.c:622
f #5:postmaster ExecInitNode(node=0x00007fc8a01d7b48, estate=0x00007fc89e01fa50, eflags=16) at execProcnode.c:438
f #6:postmaster ExecInitAppend(node=0x00007fc8a01dc5b8, estate=0x00007fc89e01fa50, eflags=16) at nodeAppend.c:170
f #7:postmaster ExecInitNode(node=0x00007fc8a01dc5b8, estate=0x00007fc89e01fa50, eflags=16) at execProcnode.c:227
f #8:postmaster ExecInitHashJoin(node=0x00007fc8a01e0430, estate=0x00007fc89e01fa50, eflags=16) at nodeHashjoin.c:636
f #9:postmaster ExecInitNode(node=0x00007fc8a01e0430, estate=0x00007fc89e01fa50, eflags=16) at execProcnode.c:368
f #10:postmaster InitPlan(queryDesc=0x00007fc89e00b650, eflags=16) at execMain.c:1984
f #11:postmaster standard_ExecutorStart(queryDesc=0x00007fc89e00b650, eflags=16) at execMain.c:486
f #12:postmaster ExecutorStart(queryDesc=0x00007fc89e00b650, eflags=0) at execMain.c:210
f #13:postmaster PortalStart(portal=0x00007fc89e01d850, params=0x0000000000000000, eflags=0, snapshot=0x0000000000000000, ddesc=0x0000000000000000) at pquery.c:709
f #14:postmaster exec_simple_query(query_string="select * from pg_stat_user_tables where relname='test';") at postgres.c:1739
f #15:postmaster PostgresMain(argc=1, argv=0x00007fc89f00ac68, dbname="postgres", username="apraveen") at postgres.c:5082
f #16:postmaster BackendRun(port=0x00007fc89d404080) at postmaster.c:4793
f #17:postmaster BackendStartup(port=0x00007fc89d404080) at postmaster.c:4461
f #18:postmaster ServerLoop at postmaster.c:1951
f #19:postmaster PostmasterMain(argc=5, argv=0x00007fc89b40abe0) at postmaster.c:1521
f #20:postmaster main(argc=5, argv=0x00007fc89b40abe0) at main.c:240
This quick change seems to fix the issue but it could introduce new regressions. I don't want to check that now, let me note the change here instead.
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -1599,7 +1599,7 @@ set_append_path_locus(PlannerInfo *root, Path *pathnode, RelOptInfo *rel,
*subpaths_out = NIL;
/* If no subpath, any worker can execute this Append. Result has 0 rows. */
- if (!subpaths)
+ if (!subpaths || Gp_role != GP_ROLE_DISPATCH)
{
CdbPathLocus_MakeGeneral(&pathnode->locus);
return;
@kainwen can you check ?
I simplify the reproduce step as follows:
create view test as select C.oid AS relid, C.relname AS relname FROM pg_class C ;
PGOPTIONS='-c gp_role=utility' psql -d postgres -p 7002 SELECT allt.relid, allt.relname FROM gp_dist_random('test') allt inner join pg_class c on allt.relid = c.oid;
FATAL: Unexpected internal error (assert.c:44) DETAIL: FailedAssertion("!(node->motionID > 0)", File: "nodeMotion.c", Line: 683)
RCA: test is a view and it is the param of gp_dist_random, we call ApplyRetrieveRule to rewrite RangeTblEntry of test, change rte->rtekind to RTE_SUBQUERY, and forceDistRand of rte is true.
in set_subquery_pathlist set the locus of subquery is strewn
foreach(lc, sub_final_rel->pathlist) { Path *subpath = (Path *) lfirst(lc); Path *path; List *l; List *pathkeys; CdbPathLocus locus;
if (forceDistRand)
CdbPathLocus_MakeStrewn(&locus, getgpsegmentCount());
else
locus = cdbpathlocus_from_subquery(root, rel, subpath);
}
so we add a motion for pg_class in cdbpath_motion_for_join. however, in utility mode we should not has motion node and hit assert panic.
@zxuejing @huansong will this https://github.com/greenplum-db/gpdb/commit/578bd4236880f58272b3a291f5cb659208e3397a fix the issue?
Yes tested that https://github.com/greenplum-db/gpdb/commit/578bd4236880f58272b3a291f5cb659208e3397a has fixed the error cases here. Though the error behavior in #15238 and this one are different but I think the root cause should be the same. @zxuejing could you please check if we can close this issue?
Hi @huansong, @kainwen the RCA of #15238 and this issue is the same. I will close this issue.