Support non time order in MSQ compaction
Description
https://github.com/apache/druid/pull/16849 added support for sorting segments by non-time columns. This PR extends that support to MSQ compaction. Specifically, if `forceSegmentSortByTime` is set in the data schema -- either in the user-supplied compaction config or in the inferred schema -- the following steps are taken:
- Skip adding `__time` explicitly as the first column to the dimension schema since it already comes as part of the schema
- Ensure column mappings propagate `__time` in the order specified by the schema
- Set `forceSegmentSortByTime` in the MSQ context.
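
The three steps above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `NonTimeSortSketch`, `orderedColumns`, and `buildContext` are hypothetical names, no Druid classes are used, and the sketch assumes that the non-time-order path corresponds to the flag being explicitly `false` (an assumption, not something stated above).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: none of these names are Druid classes or methods.
final class NonTimeSortSketch {
    static final String TIME_COLUMN = "__time";
    static final String FORCE_TIME_SORT_KEY = "forceSegmentSortByTime";

    // Steps 1 and 2: decide the column order that the column mappings should follow.
    static List<String> orderedColumns(List<String> schemaDimensions, boolean forceSegmentSortByTime) {
        List<String> ordered = new ArrayList<>();
        if (forceSegmentSortByTime) {
            // Time-sorted segments: __time is added explicitly as the first column.
            ordered.add(TIME_COLUMN);
            for (String dimension : schemaDimensions) {
                if (!TIME_COLUMN.equals(dimension)) {
                    ordered.add(dimension);
                }
            }
        } else {
            // Non-time order (assumed to mean flag == false): the schema already
            // lists __time, so keep it at the position the schema specifies.
            ordered.addAll(schemaDimensions);
        }
        return ordered;
    }

    // Step 3: carry the flag into the query context (modeled here as a plain map).
    static Map<String, Object> buildContext(boolean forceSegmentSortByTime) {
        Map<String, Object> context = new LinkedHashMap<>();
        context.put(FORCE_TIME_SORT_KEY, forceSegmentSortByTime);
        return context;
    }

    public static void main(String[] args) {
        List<String> schemaDimensions = Arrays.asList("country", "__time", "city");
        System.out.println(orderedColumns(schemaDimensions, false)); // [country, __time, city]
        System.out.println(orderedColumns(schemaDimensions, true));  // [__time, country, city]
        System.out.println(buildContext(false));                     // {forceSegmentSortByTime=false}
    }
}
```

The sketch keeps the schema's own ordering untouched in the non-time case, so `__time` is neither duplicated nor moved to the front.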
Also, the PR adds previously missing unit tests that verify the MSQ spec generated with nested and auto-type columns.
This PR has:
- [x] been self-reviewed.
- [ ] using the concurrency checklist (Remove this item if the PR doesn't have any relation to concurrency.)
- [ ] added documentation for new or modified features or behaviors.
- [ ] a release note entry in the PR description.
- [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
- [ ] added or updated version, license, or notice information in licenses.yaml
- [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
- [x] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
- [ ] added integration tests.
- [x] been tested in a test Druid cluster.