druid icon indicating copy to clipboard operation
druid copied to clipboard

MSQ: Write summary row for GROUP BY ().

Open gianm opened this issue 1 year ago • 8 comments

This allows us to remove "msqIncompatible()" from a couple of tests that involve GROUP BY () summary rows.

To handle the case where the SQL layer drops a dimension like GROUP BY 'constant', this patch also adds a "hasDroppedDimensions" context flag to the groupBy query.

gianm avatar Apr 24 '24 04:04 gianm

I think this PR also enables the following test in MSQ testCountStarWithLongColumnFiltersOnFloatLiterals

These are already supported by MSQ, however marked as msqIncompatible. Since they seemed relevant to your changes, we could enable them in the same PR. testGroupByNothingWithImpossibleTimeFilter testGroupByNothingWithLiterallyFalseFilter testGroupingWithNotNullPlusNonNullInFilter testGroupingWithNullPlusNonNullInFilter

LakshSingla avatar Jun 12 '24 13:06 LakshSingla

Grouping#hasGroupingDimensionsDropped accomplishes something similar like CTX_HAS_DROPPED_DIMENSIONS, however it only lives in the SQL layer. I think we can drop hasGroupingDimensionsDropped and use the newly added context parameter to avoid redundancy.

In this patch I am using Grouping#hasGroupingDimensionsDropped to initialize CTX_HAS_DROPPED_DIMENSIONS, so I think I need to keep it. At the point the Grouping object is created, we don't have a query context yet for that specific query, so we couldn't put things in the query context. (All we have is the global query context, but we don't want CTX_HAS_DROPPED_DIMENSIONS set in the global context. It should only be set in the context for the specific query object that actually has dropped dimensions.)

gianm avatar Jun 26 '24 17:06 gianm

I think this PR also enables the following test in MSQ testCountStarWithLongColumnFiltersOnFloatLiterals

These are already supported by MSQ, however marked as msqIncompatible. Since they seemed relevant to your changes, we could enable them in the same PR. testGroupByNothingWithImpossibleTimeFilter testGroupByNothingWithLiterallyFalseFilter testGroupingWithNotNullPlusNonNullInFilter testGroupingWithNullPlusNonNullInFilter

Making this change revealed a bug in this patch: COUNT in SQL initializes to null, not 0 as expected, because the null-agg-frame is written with the combining aggregators. The combining aggregator of count is longSum, which initializes to null. To fix this we'll either need to:

  • add a parameter to longSum like initToZero that can be set true when it's created as the combining aggregator of a count
  • or, move the null-frame-witing from post-shuffle to pre-shuffle. (pre-shuffle we use the regular aggs, not combining aggs.)

I'm leaning towards the first one since it's technically the most correct. It sounds better to me if the combining agg of count initializes to 0, not null as it does today.

gianm avatar Jun 26 '24 17:06 gianm

The first one seems better than the second one to me as well.

LakshSingla avatar Jun 27 '24 04:06 LakshSingla

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If you think that's incorrect or this pull request should instead be reviewed, please simply write any comment. Even if closed, you can still revive the PR at any time or discuss it on the [email protected] list. Thank you for your contributions.

github-actions[bot] avatar Aug 27 '24 00:08 github-actions[bot]

This pull request/issue has been closed due to lack of activity. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.

github-actions[bot] avatar Sep 24 '24 00:09 github-actions[bot]

Not stale

LakshSingla avatar Sep 24 '24 04:09 LakshSingla

COUNT in SQL initializes to null, not 0 as expected, because the null-agg-frame is written with the combining aggregators. The combining aggregator of count is longSum, which initializes to null.

Oops, in looking into this more, I misunderstood the problem. It was actually not this. The null-agg-frame is written with the regular aggregators, not the combining ones. The real problem is an off-by-one in updating the resultRow, which is fixed in the latest push.

In the case where the query has granularity: ALL but also has intervals: [], the result row includes __time (as a result of changes in #10968). I pushed a change to account for this: check for __time in the result row signature, and if it's there:

  • Set it to 0L (it's going to be ignored anyway)
  • Copy the aggregation results into the result row starting from getResultRowAggregatorStart(), not position 0. This was the off-by-one that caused the test failure.

gianm avatar Oct 10 '24 19:10 gianm

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If you think that's incorrect or this pull request should instead be reviewed, please simply write any comment. Even if closed, you can still revive the PR at any time or discuss it on the [email protected] list. Thank you for your contributions.

github-actions[bot] avatar Jan 08 '25 00:01 github-actions[bot]