druid
druid copied to clipboard
add missing information to topn and groupby cache keys
Description
Fixes an issue with the cache key information used to encode ordering for top n and group by queries, which just refer to the name. However, since many agg and post-agg implementations do not add the output name to the cache key, constructing the ordering cache key information just by name is insufficient to ensure that the cached data is actually sorted in the same manner.
This PR fixes this by appending extra cache information for the inputs of the ordering information, basically duplicating the key information of the dimspec, aggregator factory, or postagg as appropriate. Alternatively, we could have added output name to all agg and postagg cache keys, but it seems like that could result in slightly lower cache hit rate so this way seems preferable to me at least.
This PR has:
- [x] been self-reviewed.
- [x] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
- [x] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
- [x] been tested in a test Druid cluster.
This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If you think that's incorrect or this pull request should instead be reviewed, please simply write any comment. Even if closed, you can still revive the PR at any time or discuss it on the [email protected] list. Thank you for your contributions.
This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If you think that's incorrect or this pull request should instead be reviewed, please simply write any comment. Even if closed, you can still revive the PR at any time or discuss it on the [email protected] list. Thank you for your contributions.
This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If you think that's incorrect or this pull request should instead be reviewed, please simply write any comment. Even if closed, you can still revive the PR at any time or discuss it on the [email protected] list. Thank you for your contributions.