gpdb icon indicating copy to clipboard operation
gpdb copied to clipboard

query crash when contain multiple shareinput in qd slice

Open yanwr1 opened this issue 1 year ago • 1 comments
trafficstars

Bug Report

query crashes due to assertion failure:

explain  with abc as (select sum(a)/sum(b) from foo ), c1 as (select sum(a)/sum(b) as col1 from bar) select  (select 1 from abc where f1.b = f2.b limit 1), COALESCE((select 2 from abc where f1.a=random() and f1.a=2), 0), (select 1 from c1 where col1=f1.b), (select 1 from c1 where f1.b = f2.b limit 1) from foo f1, bar f2 where f1.b = f2.b;
FATAL:  Unexpected internal error (cdbmutate.c:2593)
DETAIL:  FailedAssertion("!(plan->flow->flotype == FLOW_SINGLETON)", File: "cdbmutate.c", Line: 2593)

Greenplum version or build

6X branch

OS version and uname -a

Linux stable 5.4.0-177-generic #197-Ubuntu

autoconf options used ( config.status --config )

Installation information ( pg_config )

Expected behavior

Actual behavior

Step to reproduce the behavior

create table foo (a int, b int);
create table bar (a int, b int);
insert into foo select i, i from generate_series(1,10)i;
insert into bar select i, i from generate_series(1,10)i;
analyze;
explain with abc as (select sum(a)/sum(b) from foo ), c1 as (select sum(a)/sum(b) as col1 from bar) select (select 1 from abc where f1.b = f2.b limit 1), COALESCE((select 2 from abc where f1.a=random() and f1.a=2), 0), (select 1 from c1 where col1=f1.b), (select 1 from c1 where f1.b = f2.b limit 1) from foo f1, bar f2 where f1.b = f2.b;

yanwr1 avatar Apr 29 '24 01:04 yanwr1

I have tracked issue and find rca. make a little change and get query plan like this:

postgres=# explain with abc as (select sum(a)/sum(b) from foo ), c1 as (select sum(a)/sum(b) as col1 from bar) select (select 1 from abc where f1.b = f2.b limit 1), COALESCE((select 2 from abc where f1.a=random() and f1.a=2), 0), (select 1 from c1 where col1=f1.b), (select 1 from c1 where f1.b = f2.b limit 1) from foo f1, bar f2 where f1.b = f2.b;
                                                                QUERY PLAN                                                              
  
----------------------------------------------------------------------------------------------------------------------------------------
--
 Sequence  (cost=0.00..1852061508387.47 rows=4 width=16)
   ->  Shared Scan (share slice:id 0:0)  (cost=0.00..431.00 rows=1 width=1)
         ->  Materialize  (cost=0.00..431.00 rows=1 width=1)
               ->  Result  (cost=0.00..431.00 rows=1 width=8)
                     ->  Aggregate  (cost=0.00..431.00 rows=1 width=16)
                           ->  Gather Motion 3:1  (slice9; segments: 3)  (cost=0.00..431.00 rows=1 width=16)
                                 ->  Aggregate  (cost=0.00..431.00 rows=1 width=16)
                                       ->  Seq Scan on foo foo_1  (cost=0.00..431.00 rows=4 width=8)
   ->  Sequence  (cost=0.00..1852061507956.47 rows=4 width=16)
         ->  Shared Scan (share slice:id 0:1)  (cost=0.00..431.00 rows=1 width=1)
               ->  Materialize  (cost=0.00..431.00 rows=1 width=1)
                     ->  Result  (cost=0.00..431.00 rows=1 width=8)
                           ->  Aggregate  (cost=0.00..431.00 rows=1 width=16)
                                 ->  Gather Motion 3:1  (slice8; segments: 3)  (cost=0.00..431.00 rows=1 width=16)
                                       ->  Aggregate  (cost=0.00..431.00 rows=1 width=16)
                                             ->  Seq Scan on bar bar_1  (cost=0.00..431.00 rows=4 width=8)
         ->  Gather Motion 3:1  (slice7; segments: 3)  (cost=0.00..1852061507525.47 rows=11 width=16)
               ->  Result  (cost=0.00..1852061507525.47 rows=4 width=16)
                     ->  Hash Left Join  (cost=0.00..1808652953.85 rows=337 width=20)
                           Hash Cond: ((foo.b)::bigint = share1_ref3.col1)
                           ->  Hash Join  (cost=0.00..862.00 rows=4 width=12)
                                 Hash Cond: (foo.b = bar.b)
                                 ->  Redistribute Motion 3:3  (slice1; segments: 3)  (cost=0.00..431.00 rows=4 width=8)
                                       Hash Key: foo.b
                                       ->  Seq Scan on foo  (cost=0.00..431.00 rows=4 width=8)
                                 ->  Hash  (cost=431.00..431.00 rows=4 width=4)
                                       ->  Redistribute Motion 3:3  (slice2; segments: 3)  (cost=0.00..431.00 rows=4 width=4)
                                             Hash Key: bar.b
                                             ->  Seq Scan on bar  (cost=0.00..431.00 rows=4 width=4)
                           ->  Hash  (cost=431.00..431.00 rows=1 width=12)
                                 ->  Result  (cost=0.00..431.00 rows=1 width=12)
                                       ->  Broadcast Motion 1:3  (slice3)  (cost=0.00..431.00 rows=3 width=8)
                                             ->  Shared Scan (share slice:id 3:1)  (cost=0.00..431.00 rows=1 width=8)
                           SubPlan 1  (slice7; segments: 3)
                             ->  Result  (cost=0.00..431.00 rows=1 width=4)
                                   ->  Limit  (cost=0.00..431.00 rows=1 width=1)
                                         ->  Result  (cost=0.00..431.00 rows=1 width=1)
                                               One-Time Filter: (foo.b = bar.b)
                                               ->  Materialize  (cost=0.00..431.00 rows=1 width=1)
                                                     ->  Broadcast Motion 1:3  (slice4)  (cost=0.00..431.00 rows=3 width=1)
                                                           ->  Result  (cost=0.00..431.00 rows=1 width=1)
                                                                 ->  Shared Scan (share slice:id 4:0)  (cost=0.00..431.00 rows=1 width=1
)
                           SubPlan 2  (slice7; segments: 3)
                             ->  Result  (cost=0.00..431.03 rows=1 width=4)
                                   ->  Result  (cost=0.00..431.03 rows=1 width=1)
                                         One-Time Filter: (foo.a = 2)
                                         Filter: ((foo.a)::double precision = random())
                                         ->  Materialize  (cost=0.00..431.00 rows=1 width=1)
                                               ->  Broadcast Motion 1:3  (slice5)  (cost=0.00..431.00 rows=3 width=1)
                                                     ->  Result  (cost=0.00..431.00 rows=1 width=1)
                                                           ->  Shared Scan (share slice:id 5:0)  (cost=0.00..431.00 rows=1 width=1)
                     SubPlan 3  (slice7; segments: 3)
                       ->  Result  (cost=0.00..431.07 rows=1 width=4)
                             ->  Limit  (cost=0.00..431.07 rows=1 width=1)
                                   ->  Result  (cost=0.00..431.07 rows=1 width=1)
                                         One-Time Filter: (foo.b = bar.b)
                                         ->  Materialize  (cost=0.00..431.00 rows=1 width=1)
                                               ->  Broadcast Motion 1:3  (slice6)  (cost=0.00..431.00 rows=3 width=1)
                                                     ->  Result  (cost=0.00..431.00 rows=1 width=1)
                                                           ->  Shared Scan (share slice:id 6:1)  (cost=0.00..431.00 rows=1 width=1)
 Optimizer: Pivotal Optimizer (GPORCA)
(61 rows)

There are two shared inout node(shared_id=0 and share_id=1) execute on the qd slice(slice_id=0), but have different flow type:

(gdb) p *sisc
$88 = {scan = {plan = {type = T_ShareInputScan, plan_node_id = 35, startup_cost = 0, total_cost = 431.00000199999999, plan_rows = 1, 
      plan_width = 1, targetlist = 0x564712d4e8c0, qual = 0x0, lefttree = 0x0, righttree = 0x0, initPlan = 0x0, extParam = 0x0, 
      allParam = 0x0, flow = 0x564712d4e9d0, dispatch = DISPATCH_UNDETERMINED, directDispatch = {isDirectDispatch = 0 '\000', 
        contentIds = 0x0}, nMotionNodes = 0, nInitPlans = 0, sliceTable = 0x0, operatorMemKB = 0, motionNode = 0x0}, scanrelid = 0}, 
  share_type = SHARE_MATERIAL, share_id = 1, driver_slice = 6, discard_output = 0 '\000'}
(gdb) p *plan->flow
$90 = {type = T_Flow, flotype = FLOW_SINGLETON, req_move = MOVEMENT_NONE, locustype = CdbLocusType_Null, segindex = -1, 
  numsegments = 0, hashExprs = 0x0, hashOpfamilies = 0x0, segidColIdx = 0, flow_before_req_move = 0x0}
(gdb) p *(ShareInputScan*)plan
$102 = {scan = {plan = {type = T_ShareInputScan, plan_node_id = 43, startup_cost = 0, total_cost = 431.00020699999999, plan_rows = 1, 
      plan_width = 1, targetlist = 0x5647119f6558, qual = 0x0, lefttree = 0x5647119f6690, righttree = 0x0, initPlan = 0x0, 
      extParam = 0x0, allParam = 0x0, flow = 0x5647119f8060, dispatch = DISPATCH_UNDETERMINED, directDispatch = {
        isDirectDispatch = 0 '\000', contentIds = 0x0}, nMotionNodes = 1, nInitPlans = 0, sliceTable = 0x0, operatorMemKB = 0, 
      motionNode = 0x0}, scanrelid = 0}, share_type = SHARE_MATERIAL_XSLICE, share_id = 0, driver_slice = -1, 
  discard_output = 0 '\000'}
(gdb) p *plan->flow
$100 = {type = T_Flow, flotype = FLOW_UNDEFINED, req_move = MOVEMENT_NONE, locustype = CdbLocusType_Null, segindex = 0, 
  numsegments = 0, hashExprs = 0x0, hashOpfamilies = 0x0, segidColIdx = 0, flow_before_req_move = 0x0}

RCA: when call shareinput_mutator_xslice_1, add share_id=1 to ctxt->qdShares(due to fow_type=FLOW_SINGLETON), and when call shareinput_mutator_xslice_4, assert all the nodes under qdShares's executing slice all have flow=NULL or flow->flotype == FLOW_SINGLETON, shared_scan(with share_id=0) break the assert(due to fow_type=FLOW_FLOW_UNDEFINED).

why the two share input node have different flow?

Firstly, when call TranslateDXLCTEConsumerToSharedScan, set flow to null, and add node to hash map m_cte_consumer_info, and when called TranslateDXLCTEProducerToSharedScan retrieve the flow of the shared input scan of the cte consumers: if all cte consumers' flow are null will set to producer's flow to default flow type FLOW_UNDEFINED (like share_id=0); else will set all the CTE consumers and producer to the same flow type.

But for shared_input node with share_id=1, when create motion node with motion_id=3, have set below share_input node's flow type to FLOW_SINGLETON, so another shareinput node's flow type is FLOW_SINGLETON.

CTranslatorDXLToPlStmt::TranslateDXLMotion
// create flow for child node to distinguish between singleton flows and all-segment flows
	Flow *flow = MakeNode(Flow);

	const IntPtrArray *input_segids_array = motion_dxlop->GetInputSegIdsArray();


	// only one sender
	if (1 == input_segids_array->Size())
	{
		flow->segindex = *((*input_segids_array)[0]);

		// only one segment in total
		if (1 == gpdb::GetGPSegmentCount())
		{
			if (flow->segindex == MASTER_CONTENT_ID)
				// sender is on master, must be singleton flow
				flow->flotype = FLOW_SINGLETON;
			else
				// sender is on segment, can not tell it's singleton or
				// all-segment flow, just treat it as all-segment flow so
				// it can be promoted to writer gang later if needed.
				flow->flotype = FLOW_UNDEFINED;
		}
		else
		{
			// multiple segments, must be singleton flow
			flow->flotype = FLOW_SINGLETON;
		}
	}
	else
	{
		flow->flotype = FLOW_UNDEFINED;
	}

	child_plan->flow = flow;

yanwr1 avatar Apr 29 '24 02:04 yanwr1