gpdb icon indicating copy to clipboard operation
gpdb copied to clipboard

Support intermediate aggregate in planner

Open charliettxx opened this issue 2 years ago • 3 comments

Support intermediate aggregate in planner on GPDB7.

AS we know, count(distinct xxx) cannot use hashagg. Usually we handle count(distinct) by single groupagg in MPP which maybe slow when sort involved. But we could use two hashagg instread of groupagg in count(distinct) case.

select count(distinct a), sum(b) from t group by c;

-->  Final HashAgg
       output count(a), sum(b)
       group by a
    --> Partial HashAgg
           output a, c, sum(b)
           group by a, c
       --> Distribute Motion
           distribute by a, c
            --> Partial HashAgg
                 output a, c, sum(b)
                 group by a, c

In this plan we specify sum(b) as intermediate Aggref. Plan with Intermediate aggref have as same multi-hashagg strategy as DQAaggs. So what we mainly do in this pr is adding intermediate agg targetlist in multiaggs and modify set_plan_refs in Agg plan node.

AGGSPLIT_DQAWITHAGG is aggsplit of final agg plan node, and there are two aggsplit_simple and aggsplit_combine Aggref existing in targetlist at same time, so need to reference aggsplit_simple Aggref to Vars in child plan node and reference aggsplit_combine Aggref to partial Aggref.

--> Final HashAgg
       sum(a), avg(b)
    --> Partial HashAgg
        a, partial avg(b)

in this case, sum(a) should reference to column a and avg(b) should reference (partial avg(b))

If someone who doesn't like DQA agg path could just turn-off guc gp_enable_agg_distinct, and optimizer will choose old path instead.

Here are some reminders before you submit the pull request

  • [ ] Add tests for the change
  • [ ] Document changes
  • [ ] Communicate in the mailing list if needed
  • [ ] Pass make installcheck
  • [ ] Review a PR in return to support the community

charliettxx avatar Aug 19 '22 03:08 charliettxx

origin issue: https://github.com/greenplum-db/gpdb/issues/10462#ref-pullrequest-1279149517

charliettxx avatar Aug 19 '22 03:08 charliettxx

If someone who doesn't like DQA agg path could just turn-off guc gp_enable_agg_distinct, and optimizer will choose two stage agg path instead.

Would you give some examples for the good and bad cases?

interma avatar Aug 29 '22 07:08 interma

If someone who doesn't like DQA agg path could just turn-off guc gp_enable_agg_distinct, and optimizer will choose two stage agg path instead.

Would you give some examples for the good and bad cases?

Thanks for comment. For my experience, like query --> select count(distinct a), sum(b) from t group by c; if you turn-off gp_enable_agg_distinct, the only plan you can get is

-->GroupAgg
     group by c
    --> Sort
           sort by c
          --> Redistribute Motion
                 distribute by c

If we have too many duplicate tuple on column c, this's heavy work for motion. So the best way is we did a further deduplicate before group by Motion and this is advantage of DQAagg. And In OLAP system, DQAgg is almost better than others I think. It's a hard work for us choose a better plan based on cost between them. But user could use guc gp_enable_agg_distinct to get best plan they want.

charliettxx avatar Aug 30 '22 09:08 charliettxx

The description is hard to be understood, because of not only the complexity of the problem, but also the description style...I'd suggest to reword in some ways:

select count(distinct a), sum(b) from t group by c;

-->  Final HashAgg
       output count(a), sum(b)
       group by a
    --> Partial HashAgg
           output a, c, sum(b)
           group by a, c
       --> Distribute Motion
           distribute by a, c
            --> Partial HashAgg
                 output a, c, sum(b)
                 group by a, c

I'd suggest to explictly mark it as a "new" version of intermediate agg, and attach a "old" version of groupagg. The version of groupagg in planner like this:

testdb=# explain verbose select count(distinct a), sum(b) from t2 group by c;
                                         QUERY PLAN                                         
--------------------------------------------------------------------------------------------
 Finalize GroupAggregate  (cost=2197.59..2539.76 rows=1000 width=20)
   Output: count(DISTINCT a), sum(b), c
   Group Key: t2.c
   ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=2197.59..2507.26 rows=3000 width=20)
         Output: c, (PARTIAL count(DISTINCT a)), (PARTIAL sum(b))
         Merge Key: c
         ->  Partial GroupAggregate  (cost=2197.59..2467.26 rows=1000 width=20)
               Output: c, PARTIAL count(DISTINCT a), PARTIAL sum(b)
               Group Key: t2.c
               ->  Sort  (cost=2197.59..2262.51 rows=25967 width=12)
                     Output: c, a, b
                     Sort Key: t2.c
                     ->  Seq Scan on public.t2  (cost=0.00..293.67 rows=25967 width=12)
                           Output: c, a, b
 Optimizer: Postgres query optimizer
 Settings: optimizer = 'off'

The plan of old version you provided to @interma is likely a version from ORCA, but we are working on planner, right?

Plan with Intermediate aggref have as same multi-hashagg strategy as DQAaggs.

My suggestion: Plan with Intermediate aggref has multi-hashagg strategy as same as DQAaggs.

...and there are two aggsplit_simple and aggsplit_combine Aggref existing in targetlist at same time, so need to reference aggsplit_simple Aggref to Vars in child plan node and reference aggsplit_combine Aggref to partial Aggref.

My suggestion: and there are two Aggrefs, aggsplit_simple and aggsplit_combine, existing in targetlist at same time. So we need to refer aggsplit_simple Aggref to Vars in child plan node and refer aggsplit_combine Aggref to partial Aggref.

Finally, it seems the basic idea of intermediate aggref is converting a problem of "Single DQA with Aggregate(s)" to "Multiple DQAs" by transforming the single agg (sum(b)) to a temporary DQA aggref. Is that understanding correct?

Thanks for review! I've update summary and fix some bug of this PR above. The basic idea of intermediate aggref is: If input of this aggref has the same combining type as output of this aggref, and we call this agggref as intermediate aggref.

->HashAgg (to aggregate)
    output: sum(a), c, count(b)
    -> HashAgg (to eliminate duplicates)
         output: a, c, **count(b)**
        -> Streaming HashAgg (to eliminate duplicates)
             output: a, c, count(b)
              -> input

count(b) in middle HashAgg is an intermediate aggref, for the input of this aggref is from count(b) in Streaming HashAgg, it will also output combing type to the top Agg.

charliettxx avatar Nov 07 '22 09:11 charliettxx