quickwit
quickwit copied to clipboard
Remove timerange root search
Extract and remove time range from query ast.
To allow for the optimization stage to run if the resulting ast after extracting the time range is a simple query.
@trinity-1686a can you review?
before i review further, i have a major concern to express. for mostly historic reasons, start/end_timestamps are only precise up to the second, but the ast can express sub-second precision, up to the nanosecond. the previous code purposefully left the range query inside the ast for that reason (it had some handling of rounding already, but that's only to make sure the extracted timestamps are never more restrictive than the query is meant to be, leaving the exact precision to the ast)
as far as i can tell, it seems in the state this PR is, a query with sub-second precision may return results it shouldn't (for instance ts:[2025-05-13T12:00:00.123 TO *] could return doc with a ts of 2025-05-13T12:00:00.000), and not return a result it should (ts:{2025-05-13T12:00:00.123 TO *] would round up because the bound is excluding, not matching docs with a ts of 2025-05-13T12:00:00.999)
@trinity-1686a My memory is failing me. I remember you already added some optim around extracting timestamp from the AST. What is the current status of timestamp checking before and after this PR? (e.g. what happens if we have a time range elastic search query, do we have time pruning and where does the range filtering happen. (post filtering? rangequery))
The secs/subsecs problem is a problem but we can probably work around it.
before i review further, i have a major concern to express. for mostly historic reasons, start/end_timestamps are only precise up to the second, but the ast can express sub-second precision, up to the nanosecond. the previous code purposefully left the range query inside the ast for that reason (it had some handling of rounding already, but that's only to make sure the extracted timestamps are never more restrictive than the query is meant to be, leaving the exact precision to the ast)
as far as i can tell, it seems in the state this PR is, a query with sub-second precision may return results it shouldn't (for instance
ts:[2025-05-13T12:00:00.123 TO *]could return doc with a ts of2025-05-13T12:00:00.000), and not return a result it should (ts:{2025-05-13T12:00:00.123 TO *]would round up because the bound is excluding, not matching docs with a ts of2025-05-13T12:00:00.999)
I understand now your concern, it also makes my realize that there's no reason to try to extract twice (once in the root and once in the leaf).
Maybe we should also fix the root to not convert to seconds on extraction, and work with Bounds<DateTime>?
before:
- in root.rs, we have a QueryAstVisitor that is used to extract approximate (second precision) start/end timestamp. It does not mutate the AST. This is called rather early in query evaluation, and allow for time pruning.
- in leaf.rs, we have a QueryAstTransformer, it removes all
musttimestamp ranges, and summarize them (along start/end_timestamps) into a single range. It check if that range is more restrictive than the split bounds, and drop the range entirely if it isn't. I don't expect people to put multiple timestamp ranges in their queries, but the use for this code is twofold: we don't naively add a range based on query parameters (which may have been inferred from the ast), so we save us from duplicating the timestamp ranges (saves some compute); and we remove bounds entirely, before warmup, when we know they are useless, saving both compute and io. (after that step, we do some query simplification while transforming into a tantivy QueryAst, which may or may not be able to do a better job because we removed some clauses, but this isn't specific to timestamps) - The resulting range query are eventually turned into tantivy FastFieldRangeQuery. There is no longer any collector post-filtering for timestamps
after: (please correct me @tontinton if i say anything wrong)
- in root.rs, a QueryAstTransformer is used to extract timestamps ranges to the query params, and remove them from the ast (as is, causing a loss of precision). this allows for time pruning.
- in leaf.rs, the same QueryAstTransformer is used, though it shoudn't find much in all likelihood it won't find anything. This could actually be simplified further by adding the bounds only if they are more restrictive than the split bounds (assuming start/end_timestamps are capable of full ns precision), without looking into the ast.
- the range are eventually turned into FastFieldRangeQuery, as before
pros: would remove a bunch of nearly-duplicate code, and allow us to do only a single pass on the ast instead of two as currently done @tontinton could you clarify this? it's not obvious to me what optimization you mean.
To allow for the optimization stage to run if the resulting ast after extracting the time range is a simple query.
cons: as is, this doesn't always return the correct result.
path forward: to make this PR returns the correct result, we'd need start/end_timestamp to be ns precision. To that effect, every usage of start/end_timestamp would need to be checked. To not have to search manually (which would be error-prone both to write and to review), we could retire start/end_timestamp from SearchRequest, introduce start/end_timestamp_ns in their place, and add start/end_timestamp() helpers to get the second-precision value as before. The compiler should be able to find for us every use of the second-precision bounds
Maybe we should also fix the root to not convert to seconds on extraction, and work with Bounds<DateTime>?
i think Bounds<DateTime> would be a tad annoying because SearchRequest must be protobuf-serializable, but an Option<i64> that happens to contain ns is likely good enough
i think
Bounds<DateTime>would be a tad annoying because SearchRequest must be protobuf-serializable, but anOption<i64>that happens to contain ns is likely good enough
Cool I'll work on it once I have some time, hopefully this week / next week.
@trinity-1686a @tontinton What would be the ideal solution according to you?
My take:
- the start_timestamp end_timestamp would be removed from the protobuf SearchRequest object.
- the quickwit API (not elastic) would integrate the bound from the rest api into the query ast
- we would work on the ast as an optimization to bubble up/merge/remove time range from the tree whenever possible in the root
- we would extract start/end timestamp on the ast for split pruning (in the root)
- in leaf, individually for each split, we would run an extra optimization pass to apply split specific optimization. e.g: We would replace timerange node by an always match query when the split boundary are within the range, and run a simplification pass to remove those always match nodes. Maybe there is another optimization for splits that simply do not have this timestamp field.
i would like to understand what this is about, because it could change what is the ideal solution
To allow for the optimization stage to run if the resulting ast after extracting the time range is a simple query.
i think a good solution should deduplicate the code, having both a QueryAstVisitor and a QueryAstTransformer doing almost the same thing is definitely not ideal. i don't think visiting the ast multiple time is really expensive, so we could get rid of start/end_timestamp and have the query ast be the only source of trust, but that might interfere with the optimisation @tontinton has in mind
and run a simplification pass to remove those always match nodes.
this is something we already do on all request while turning them to tantivy Query. Also, that's a lot more error prone than it seems, there is a subtle bug with the .filter(|result| result != &Ok(QueryAst::MatchAll)) added in this PR, when a bool query has a single .must that gets removed, and it also has some .should.
tldr: depending on what that optimisation from the 1st message in the PR is, i think removing start/end_timestamp is probably the best solution (which implies the other points in the list)
@tontinton would that make sense?
@fulmicoton-dd @trinity-1686a
The optimizations I talk about are stuff like is_metadata_count_request_with_ast checking that the query is a MatchAll query. I thought of this PR after making this PR.
this is something we already do on all request while turning them to tantivy Query. Also, that's a lot more error prone than it seems, there is a subtle bug with the .filter(|result| result != &Ok(QueryAst::MatchAll)) added in this PR, when a bool query has a single .must that gets removed, and it also has some .should.
True, this should be fixed.
@tontinton would that make sense?
To be honest, I'm not 100% following, you're saying keep the start and end timestamp in the AST, and let leafs optimize out when they are contained in the time range to convert to a MatchAll, which will allow stuff like fn optimize( in leaf.rs to be optimized even with a time range, something like what I did here.
But what about the optimization PR I've linked to get the count from metastore (first link)? That code is in the root still, would that code also convert the query to MatchAll?
@trinity-1686a I leave it to you to review and judge when and what should be merge, and merge the PRs: #5760 #5759 #5758