Window function on msq

Open somu-imply opened this issue 2 years ago • 2 comments

This PR aims to introduce Window functions on MSQ by doing the following:

  1. Introduces a window query kit, along with its factory and a processor, for handling window queries.
  2. If a window operator has a PARTITION BY clause, pushes the partitioning down as the shuffle spec of the previous stage.
  3. If an empty OVER() clause is present, runs all operators on a single RAC (RowsAndColumns).
  4. If no OVER() clause is empty, breaks each window into its own stage.
  5. Adds the associated machinery to handle window functions in MSQ.
  6. Introduces a separate hidden engine feature, WINDOW_LEAF_OPERATOR, which is set only for the MSQ engine. When this feature is present, the planner plans without the leaf operators by creating a window query over an inner scan query. For the native engine the feature is not set, and the planner generates the leafOperators.
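
As a rough illustration of items 2 to 4, the staging rule can be sketched as below. This is a toy model, not the Druid code: `Window`, `Stage`, and `plan_stages` are hypothetical names invented for this sketch, and the reading that a single empty OVER() collapses all operators into one stage is an interpretation of item 3.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Window:
    """Hypothetical window operator; partition_by is empty for an empty OVER()."""
    name: str
    partition_by: Tuple[str, ...] = ()


@dataclass
class Stage:
    """Hypothetical stage: the operators it runs and how its output is shuffled."""
    operators: List[str]
    shuffle_on: Tuple[str, ...] = ()  # empty tuple means no partitioned shuffle


def plan_stages(windows: List[Window]) -> List[Stage]:
    """Toy planner: an inner scan stage followed by one or more window stages."""
    stages = [Stage(operators=["scan"])]

    # Assumed reading of item 3: any empty OVER() puts every operator
    # into one stage that sees all rows together.
    if any(not w.partition_by for w in windows):
        stages.append(Stage(operators=[w.name for w in windows]))
        return stages

    # Items 2 and 4: one stage per window, with the window's PARTITION BY
    # columns pushed down as the shuffle of the stage that feeds it.
    for w in windows:
        stages[-1].shuffle_on = w.partition_by
        stages.append(Stage(operators=[w.name]))
    return stages
```

For example, two windows partitioned on `channel` and `page` yield three stages in this model: the scan shuffled on `channel`, the first window stage shuffled on `page`, and the final window stage with no partitioned shuffle.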

This PR has:

  • [x] been self-reviewed.
    • [ ] using the concurrency checklist (Remove this item if the PR doesn't have any relation to concurrency.)
  • [x] added documentation for new or modified features or behaviors.
  • [ ] a release note entry in the PR description.
  • [x] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • [ ] added or updated version, license, or notice information in licenses.yaml
  • [x] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • [x] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • [ ] added integration tests.
  • [x] been tested in a test Druid cluster.

somu-imply, Dec 01 '23 20:12

Updated the code to handle leaf operators in window queries. On MSQ, window functions can now be run without a GROUP BY.

somu-imply, Jan 23 '24 22:01

@cryptoe I have addressed your comments. I'd appreciate it if you could take another look.

somu-imply, Mar 15 '24 04:03

Added guardrails behind a context parameter, plus tests on Wikipedia datasets for REPLACE and SELECT queries with scans and GROUP BYs. Boosting is not supported in window queries.

somu-imply, Mar 21 '24 20:03

Updated the release notes to also take into account the follow-up PR: https://github.com/apache/druid/pull/16229

cryptoe, May 22 '24 05:05