TTL-based Result Level Caching For Queries Touching Realtime Segments
Description
Want to support result-set caching of queries that hit realtime data nodes. Want a way to partition the result set of a query (from either realtime or historical data nodes) into cacheable granular intervals that can either be pulled from cache and stitched into the query result, or issued as a query to the data nodes.
A TTL query context parameter would dictate how "recent" of an interval we'd want to serve from cache versus query from data nodes. Something like cacheTTL: "PT1M" would tell the brokers to serve from cache all results older than PT1M, and to issue queries to data nodes for data newer than PT1M.
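A minimal sketch of the split this implies, assuming cacheTTL is an ISO-8601 duration carried in the query context and that the broker computes a single cutoff per query. The class and method names here are hypothetical, not existing Druid API:

```java
import java.time.Duration;
import java.time.Instant;

// Everything before the cutoff may be served from the result cache; everything
// at or after it is issued to the data nodes. Illustrative only.
public final class CacheTtlCutoff
{
  /**
   * Returns the instant separating cache-served results from data-node queries,
   * clamped into the query interval [queryStart, queryEnd].
   */
  public static Instant cutoff(Instant queryStart, Instant queryEnd, String cacheTtl, Instant now)
  {
    Instant cutoff = now.minus(Duration.parse(cacheTtl));
    if (cutoff.isBefore(queryStart)) {
      return queryStart;   // whole interval is "recent": nothing from cache
    }
    if (cutoff.isAfter(queryEnd)) {
      return queryEnd;     // whole interval is old enough: everything from cache
    }
    return cutoff;
  }

  public static void main(String[] args)
  {
    Instant now = Instant.parse("2024-01-01T12:00:00Z");
    Instant start = Instant.parse("2024-01-01T00:00:00Z");
    Instant split = cutoff(start, now, "PT1M", now);
    System.out.println("serve from cache: " + start + " .. " + split);
    System.out.println("query data nodes: " + split + " .. " + now);
  }
}
```

The broker would then also only populate the cache for result intervals that fall before the cutoff, so freshly queried "recent" data never pollutes the cache.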
Motivation
This would allow for result-set caching of queries against realtime segments, significantly boosting performance in exchange for a user-configurable amount of staleness. For longer-running stream ingestion jobs (e.g. where a realtime segment contains the last 1h of data), the "staleness" imposed by this feature would likely be negligible, assuming few or no late records.
Want a way to partition the result set of a query (from either realtime or historical data nodes) into cacheable granular intervals that can either be pulled from cache and stitched into the query result, or issued as a query to the data nodes.
This is how the per-segment cache already works: the granular intervals are "each segment". I think it would generally make sense to combine that with a cache TTL for realtime segments (as opposed to the current strategy, which is not to cache them at all).
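A minimal sketch of that combination, assuming a per-segment cacheability check on the broker: historical segments stay cacheable as they are today, while a realtime segment's results become cacheable only once its interval end is older than the configured TTL. The SegmentInfo type and accessors below are hypothetical, not Druid's actual classes:

```java
import java.time.Duration;
import java.time.Instant;

public final class RealtimeSegmentCachePolicy
{
  // Hypothetical stand-in for whatever segment metadata the broker consults.
  public record SegmentInfo(Instant intervalEnd, boolean isRealtime) {}

  public static boolean isCacheable(SegmentInfo segment, Duration cacheTtl, Instant now)
  {
    if (!segment.isRealtime()) {
      return true;  // historical segments: cached per segment, unchanged
    }
    // realtime segments: cache only results whose interval is older than the TTL
    return segment.intervalEnd().isBefore(now.minus(cacheTtl));
  }

  public static void main(String[] args)
  {
    Instant now = Instant.now();
    Duration ttl = Duration.parse("PT1M");
    SegmentInfo recent = new SegmentInfo(now.minusSeconds(30), true);
    SegmentInfo older = new SegmentInfo(now.minusSeconds(300), true);
    System.out.println("recent realtime cacheable? " + isCacheable(recent, ttl, now)); // false
    System.out.println("older realtime cacheable?  " + isCacheable(older, ttl, now));  // true
  }
}
```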