gpdb icon indicating copy to clipboard operation
gpdb copied to clipboard

Add GUC to stop penalizing correlated NLJ operators

Open DevChattopadhyay opened this issue 11 months ago • 6 comments

Currently in ORCA's cost model the nested correlated NLJ operators are penalized the same way as nested NLJ's. But it should not be the case. The nested Correlated NLJs in the ORCA physical plan aren't actually nested after they're converted to subplans. So we should not penalize them like nested NLJ's.

For example:

Setup:

create table foo (a int,b int, c int);
create table bar (p int);

Query:

explain (costs off) select case when (select count(*)
                  from foo
                  where a between 1 and 20) > 48409437
            then (select avg(b)
                  from foo
                  where a between 1 and 20)
            else (select avg(c)
                  from foo
                  where a between 1 and 20) end bucket1 
from bar
where p = 1
;

Nested Loop Plan:

 Gather Motion 3:1  (slice1; segments: 3)
   ->  Nested Loop Left Join
         Join Filter: true
         ->  Nested Loop Left Join
               Join Filter: true
               ->  Nested Loop Left Join
                     Join Filter: true
                     ->  Seq Scan on bar
                           Filter: (p = 1)
                     ->  Materialize
                           ->  Broadcast Motion 1:3  (slice6)
                                 ->  Aggregate
                                       ->  Gather Motion 3:1  (slice7; segments: 3)
                                             ->  Seq Scan on foo foo_2
                                                   Filter: ((a >= 1) AND (a <= 20))
               ->  Materialize
                     ->  Broadcast Motion 1:3  (slice4)
                           ->  Aggregate
                                 ->  Gather Motion 3:1  (slice5; segments: 3)
                                       ->  Seq Scan on foo foo_1
                                             Filter: ((a >= 1) AND (a <= 20))
         ->  Materialize
               ->  Broadcast Motion 1:3  (slice2)
                     ->  Aggregate
                           ->  Gather Motion 3:1  (slice3; segments: 3)
                                 ->  Seq Scan on foo
                                       Filter: ((a >= 1) AND (a <= 20))
 Optimizer: GPORCA
(28 rows)

Plan containing subplans:

 Gather Motion 3:1  (slice1; segments: 3)
   ->  Seq Scan on bar
         Filter: (p = 1)
         SubPlan 1
           ->  Materialize
                 ->  Broadcast Motion 1:3  (slice2)
                       ->  Aggregate
                             ->  Gather Motion 3:1  (slice3; segments: 3)
                                   ->  Seq Scan on foo
                                         Filter: ((a >= 1) AND (a <= 20))
         SubPlan 2
           ->  Materialize
                 ->  Broadcast Motion 1:3  (slice4)
                       ->  Aggregate
                             ->  Gather Motion 3:1  (slice5; segments: 3)
                                   ->  Seq Scan on foo foo_1
                                         Filter: ((a >= 1) AND (a <= 20))
         SubPlan 3
           ->  Materialize
                 ->  Broadcast Motion 1:3  (slice6)
                       ->  Aggregate
                             ->  Gather Motion 3:1  (slice7; segments: 3)
                                   ->  Seq Scan on foo foo_2
                                         Filter: ((a >= 1) AND (a <= 20))
 Optimizer: GPORCA
(25 rows)

So from the example it is observed that nested Correlated NLJs in the ORCA physical plan aren't actually nested after they're converted to subplans. Due to this penalization it is observed that ORCA sometimes generate a plan where the subplans are executed on coordinator instead of segments.

Fix: Created a GUC optimizer_enable_penalize_correlated_nljoin whose default value is set to ON. With the default value ORCA will continue to penalize correlated NLJ's similar to NLJ's. If the GUC is set to off then ORCA will not penalize nested correlated NLJ's based on the following condition (considering that the inner child of a correlated NLJ is converted to a subplan) If the outer child of a correlated NLJ is a correlated NLJ and its inner child is not a correlated NLJ then we will not penalize the correlated NLJ operator. If the inner child is a Correlated NLJ operator (subplans within subplans i.e nested subplans), we will continue to penalize it.

Note to reviewers: I have broken down the PR in logical commits. Will recommend to go through each commit separately. Also I have compiled all the plan changes observed by turning off the GUC in JIRA[1]. Please do take a look. [1] : https://jira.eng.vmware.com/browse/GPQP-433

Here are some reminders before you submit the pull request

  • [ ] Add tests for the change
  • [ ] Document changes
  • [ ] Communicate in the mailing list if needed
  • [ ] Pass make installcheck
  • [ ] Review a PR in return to support the community

DevChattopadhyay avatar Mar 22 '24 09:03 DevChattopadhyay

What is the different between this and optimizer_nestloop_factor

vraghavan78 avatar Mar 22 '24 17:03 vraghavan78

What is the different between this and optimizer_nestloop_factor

	{
		{"optimizer_nestloop_factor", PGC_USERSET, QUERY_TUNING_OTHER,
			gettext_noop("Set the nestloop join cost factor in the optimizer"),
			NULL,
			GUC_NOT_IN_SAMPLE
		},
		&optimizer_nestloop_factor,
		1024.0, 1.0, DBL_MAX,
		NULL, NULL, NULL
	},
	// amplify NLJ cost based on NLJ factor and stats estimation risk
	// we don't want to penalize index join compared to nested loop join, so we make sure
	// that every time index join is penalized, we penalize nested loop join by at least the
	// same amount
	CDouble dPenalization = dNLJFactor;
	const CDouble dRisk(pci->Pcstats()->StatsEstimationRisk());
	if (dRisk > dPenalization)
	{
		const CDouble dIndexJoinAllowedRiskThreshold =
			pcmgpdb->GetCostModelParams()
				->PcpLookup(
					CCostModelParamsGPDB::EcpIndexJoinAllowedRiskThreshold)
				->Get();
		if (dIndexJoinAllowedRiskThreshold < dRisk)
		{
			dPenalization = dRisk;
		}
	}

	return CCost(costTotal * dPenalization);
  • Currently in main the GUC optimizer_nestloop_factor is a factor by which the total cost of PhysicalNLJoins or PhysicalCorrelatedNLJ operators are multiplied to penalize it. The default value of this GUC is 1024 and it can range from 1 to 1024. The penalization also depends on the Stats estimation risk for all the NLJ operators.

  • The GUC (optimizer_enable_penalize_correlated_nljoin) created in this PR is turned ON by default. In this case the NLJ as well as the correlated NLJ operators will be penalized based on the value of optimizer_nestloop_factor.

  • But when the GUC optimizer_enable_penalize_correlated_nljoin is turned OFF the correlated NLJ operators will not be penalized using the NLJ factor but the other NLJ operators will be penalized using the NLJ factor.

  • So the difference is that optimizer_nestloop_factor is basically a factor to determine how much to penalize a NLJ operator and optimizer_enable_penalize_correlated_nljoin decides whether we need to penalize correlated NLJ operators with NLJ factor or not.

DevChattopadhyay avatar Mar 25 '24 11:03 DevChattopadhyay

Why are we keeping penalization on by default? Risk?

When we don't penalize, do we see any worse plans? Can we add some of that reasoning to the PR

chrishajas avatar Mar 25 '24 17:03 chrishajas

Why are we keeping penalization on by default? Risk?

When we don't penalize, do we see any worse plans? Can we add some of that reasoning to the PR

Yes @chrishajas we observed some worse plans with the GUC turned OFF while running workloads. So there is some risk associated. Thats why the GUC is turned ON by default. I have added the reasoning in the commit message. Also i have compiled all the plan changes in the JIRA.

DevChattopadhyay avatar Mar 26 '24 12:03 DevChattopadhyay

Why are we keeping penalization on by default? Risk? When we don't penalize, do we see any worse plans? Can we add some of that reasoning to the PR

Yes @chrishajas we observed some worse plans with the GUC turned OFF while running workloads. So there is some risk associated. Thats why the GUC is turned ON by default. I have added the reasoning in the commit message. Also i have compiled all the plan changes in the JIRA.

Do we know why those plans are worse? Is it a separate issue related to the cost model, or is this change less robust than we thought?

chrishajas avatar Mar 26 '24 17:03 chrishajas

Why are we keeping penalization on by default? Risk? When we don't penalize, do we see any worse plans? Can we add some of that reasoning to the PR

Yes @chrishajas we observed some worse plans with the GUC turned OFF while running workloads. So there is some risk associated. Thats why the GUC is turned ON by default. I have added the reasoning in the commit message. Also i have compiled all the plan changes in the JIRA.

Do we know why those plans are worse? Is it a separate issue related to the cost model, or is this change less robust than we thought?

@chrishajas As discussed offline the regressions observed are due to costing. I have mentioned some snippets of ORCA physical plan tree below having regressions. Let me know if it looks ok then i will add it to the commit message.

Downsides of this approach:

  • On turning the GUC [optimizer_enable_penalize_correlated_nljoin] to OFF observed some plan changes in our workloads. Some changes were expected to happen like plan changes from nested NLJ to subplans as we stopped penalizing the nested correlated NLJ operators since they are actually not nested. They just seem to be nested in our ORCA physical plan tree. For example: We can see in the query mentioned above in the commit message that the plan changed from nested NLJ to subplans.

  • But also observed that in some cases that on turning OFF the GUC got some regressions. These regressions are due to costing. Since we stopped the penalization of nested correlated NLJ, in some cases there cost became better than the expected good plan. Here are some examples:

Example 1:

Plan tree 1:

...
+--CPhysicalLeftAntiSemiHashJoinNotIn
      |--CPhysicalInnerHashJoin
           |--CPhysicalInnerNLJoin  
            ...

Plan tree 2:

...
+--CPhysicalCorrelatedInnerNLJoin
      |--CPhysicalCorrelatedInnerNLJoin
            |--CPhysicalCorrelatedInnerNLJoin
                ...

Observed a plan change from tree 1 to tree 2. The estimated row count for a correlated NLJ is always 1000. So the inner most correlated NLJ in tree2 is costed lower than the innermost NLJ in tree 1. Due to this the costing of tree 2 becomes better than tree 1. But the Plan tree 1 was a better alternative containing Hash Joins but due to costing observed the plan shifted to tree 2.

Example 2:

Plan tree 1:

...
+--CPhysicalLeftOuterNLJoin [A]
      |--CPhysicalLeftAntiSemiHashJoinNotIn   rows:61   width:27  rebinds:1   cost:3883911.828828   origin: [Grp:28, GrpExpr:26]
            |--CPhysicalMotionGather(coordinator) 
                    |--CPhysicalLeftSemiNLJoin
                        ...

Plan tree 2:

...
+--CPhysicalCorrelatedLeftOuterNLJoin [B]
        |--CPhysicalCorrelatedInnerNLJoin
            |--CPhysicalCorrelatedInnerNLJoin
                ...

In Plan tree 1 the operator A is a NLJ so it will be penalized by a nlj factor of 1024 but in plan tree 2 since the child of operator B is a correlated join it will not be penalized. This will create a cost difference between both the plans. Observed a plan change from tree 1 to tree 2.

DevChattopadhyay avatar Mar 28 '24 11:03 DevChattopadhyay