Window functions on MSQ
This PR aims to introduce Window functions on MSQ by doing the following:
- Introduce a Window querykit for handling window queries along with its factory and a processor for window queries
- If a window operator has a PARTITION BY clause, push the partitioning down as the shuffle spec of the previous stage
- If an empty OVER() clause is present, run all operators on a single RAC (RowsAndColumns)
- If no empty OVER() clause is present, break each window down into its own stage
- Associated machinery to handle window functions in MSQ
- Introduce a separate hidden engine feature, WINDOW_LEAF_OPERATOR, which is set only for the MSQ engine. When this feature is present, the planner plans without the leaf operators by creating a window query over an inner scan query. For the native engine the feature is not set, and the planner generates the leaf operators
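
The stage breakdown described in the list above can be sketched roughly as follows. This is an illustrative model only, not the code in this PR: `WindowStagePlanSketch`, `Window`, `Stage`, and `plan` are hypothetical names, and the sketch assumes that a single empty OVER() anywhere in the query is enough to collapse all operators into one stage.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; none of these types exist in Druid.
class WindowStagePlanSketch {
    /** Stand-in for one window: an operator label plus its PARTITION BY columns. */
    static final class Window {
        final String operator;
        final List<String> partitionBy;

        Window(String operator, List<String> partitionBy) {
            this.operator = operator;
            this.partitionBy = partitionBy;
        }
    }

    /** Stand-in for one planned stage and the columns the previous stage shuffles on. */
    static final class Stage {
        final List<String> operators;
        final List<String> shuffleOn;

        Stage(List<String> operators, List<String> shuffleOn) {
            this.operators = operators;
            this.shuffleOn = shuffleOn;
        }
    }

    static List<Stage> plan(List<Window> windows) {
        // Assumption: an empty PARTITION BY list models an empty OVER() clause.
        boolean hasEmptyOver = false;
        for (Window w : windows) {
            if (w.partitionBy.isEmpty()) {
                hasEmptyOver = true;
            }
        }

        List<Stage> stages = new ArrayList<>();
        if (hasEmptyOver) {
            // All operators run together in one stage, with no partitioned shuffle.
            List<String> allOperators = new ArrayList<>();
            for (Window w : windows) {
                allOperators.add(w.operator);
            }
            stages.add(new Stage(allOperators, Collections.<String>emptyList()));
        } else {
            // One stage per window; its PARTITION BY columns become the
            // shuffle columns of the stage that feeds it.
            for (Window w : windows) {
                stages.add(new Stage(Collections.singletonList(w.operator), w.partitionBy));
            }
        }
        return stages;
    }

    public static void main(String[] args) {
        List<Window> windows = Arrays.asList(
            new Window("ROW_NUMBER", Arrays.asList("countryName")),
            new Window("SUM", Arrays.asList("cityName")));
        for (Stage s : plan(windows)) {
            System.out.println(s.operators + " shuffled on " + s.shuffleOn);
        }
    }
}
```

With two partitioned windows the sketch yields two stages, each shuffled on its own partition columns; adding a window with no partition columns collapses the plan into a single stage.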
This PR has:
- [x] been self-reviewed.
- [ ] using the concurrency checklist (Remove this item if the PR doesn't have any relation to concurrency.)
- [x] added documentation for new or modified features or behaviors.
- [ ] a release note entry in the PR description.
- [x] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
- [ ] added or updated version, license, or notice information in licenses.yaml
- [x] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
- [x] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
- [ ] added integration tests.
- [x] been tested in a test Druid cluster.
Updated the code to handle leaf operators in window queries. On MSQ, window functions can now be run without a GROUP BY.
@cryptoe I have addressed your comments. I'd appreciate it if you could take another look.
Added guardrails behind a context param, plus tests on Wikipedia datasets covering REPLACE and SELECT queries with scans and GROUP BYs. Boosting is not supported in windows.
Updated the release notes to also account for the follow-up PR: https://github.com/apache/druid/pull/16229