tantivy

QueryAST

Open fulmicoton opened this issue 2 years ago • 4 comments

Currently tantivy's queries are based on the following traits.

- trait Query
- trait Weight
- trait Scorer

They do not expose the structure of the query, but they make it easy to extend queries.

It might be interesting, however, to introduce a QueryAST:

enum QueryAST {
    TermQuery(Box<TermQuery>),
    Range(Box<RangeQuery>),
    Boolean(BooleanQueryAST), // or something else
    ...
    Other(Box<dyn Query>),
}

And the equivalent for weight...

Such an AST could help with debugging and optimization, and could be a natural target for different query DSLs.
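To illustrate the debugging and optimization angle, here is a minimal sketch in plain Rust. It assumes a toy `QueryAst` with only term, AND, and OR nodes. The type, the `term` helper, and the `simplify` pass are hypothetical and not part of tantivy's API. Because the structure is visible, the tree can be printed with `Debug` and rewritten by a simple pass, neither of which is possible with an opaque `Box<dyn Query>`.

```rust
/// Toy stand-in for the proposed AST (hypothetical, not tantivy's API).
#[derive(Debug, Clone, PartialEq)]
enum QueryAst {
    Term { field: String, value: String },
    And(Vec<QueryAst>),
    Or(Vec<QueryAst>),
}

/// Convenience constructor for a term leaf (hypothetical helper).
fn term(field: &str, value: &str) -> QueryAst {
    QueryAst::Term {
        field: field.to_string(),
        value: value.to_string(),
    }
}

/// Flattens nested nodes of the same kind, e.g. And(And(a, b), c) becomes
/// And(a, b, c), and unwraps boolean nodes that have a single child.
fn simplify(ast: QueryAst) -> QueryAst {
    match ast {
        QueryAst::And(children) => {
            let mut flat = Vec::new();
            for child in children {
                match simplify(child) {
                    QueryAst::And(inner) => flat.extend(inner),
                    other => flat.push(other),
                }
            }
            if flat.len() == 1 {
                flat.remove(0)
            } else {
                QueryAst::And(flat)
            }
        }
        QueryAst::Or(children) => {
            let mut flat = Vec::new();
            for child in children {
                match simplify(child) {
                    QueryAst::Or(inner) => flat.extend(inner),
                    other => flat.push(other),
                }
            }
            if flat.len() == 1 {
                flat.remove(0)
            } else {
                QueryAst::Or(flat)
            }
        }
        leaf => leaf,
    }
}
```

Calling `simplify` on `And(And(title:a, title:b), Or(body:c))` yields a single flat `And` with three term children, and `println!("{:?}", ast)` shows the whole tree at any stage.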

fulmicoton avatar Jan 16 '23 13:01 fulmicoton

@guilload @evanxg852000 I'd like to have your thoughts on this.

fulmicoton avatar Jan 16 '23 13:01 fulmicoton

Just passing by, but it would be cool. Almost every place where I worked with search engines had some kind of query DSL. In Summa, I also have a proto-based query tree: https://github.com/izihawa/summa/blob/master/summa-proto/proto/query.proto#L14

ppodolsky avatar Jan 16 '23 13:01 ppodolsky

https://github.com/quickwit-oss/quickwit/issues/1655

fulmicoton avatar Jan 17 '23 13:01 fulmicoton

This would make some optimizations easier, e.g. for

(Field1:Term1 OR Field1:Term2) AND (Field2:Term1 OR Field2:Term2), it would be better to use a simple union algorithm that supports fast skips instead of the current one.

For that we would need to know that there is an intersection above the union, since the intersection is what triggers the skips. With the current generic API this is possible, but passing that information down is awkward.
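A rough sketch of how an AST could carry that information, again in plain Rust with hypothetical types (`QueryAst`, `Plan`, `UnionStrategy`, and `plan` are illustrations, not tantivy's API). The planner walks the tree and tells each union whether its direct parent is an intersection with more than one clause. For simplicity the hint is not propagated to unions nested deeper.

```rust
/// Toy query AST (hypothetical, not tantivy's API).
#[derive(Debug, Clone, PartialEq)]
enum QueryAst {
    Term(String, String),
    And(Vec<QueryAst>),
    Or(Vec<QueryAst>),
}

/// Which union algorithm the planner would pick (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq)]
enum UnionStrategy {
    /// Whatever the engine uses today.
    Default,
    /// A simple union that supports fast skips.
    SkipFriendly,
}

/// Output of the planning pass: the same tree, annotated with a strategy.
#[derive(Debug, PartialEq)]
enum Plan {
    Term(String, String),
    Intersection(Vec<Plan>),
    Union(Vec<Plan>, UnionStrategy),
}

/// Convenience constructor for a term leaf (hypothetical helper).
fn t(field: &str, value: &str) -> QueryAst {
    QueryAst::Term(field.to_string(), value.to_string())
}

/// Builds a plan, passing down whether the direct parent is an
/// intersection that will drive skips on its children.
fn plan(ast: &QueryAst, under_intersection: bool) -> Plan {
    match ast {
        QueryAst::Term(field, value) => Plan::Term(field.clone(), value.clone()),
        QueryAst::And(children) => {
            // An intersection only skips when it has several clauses.
            let skips = children.len() > 1;
            Plan::Intersection(children.iter().map(|c| plan(c, skips)).collect())
        }
        QueryAst::Or(children) => {
            let strategy = if under_intersection {
                UnionStrategy::SkipFriendly
            } else {
                UnionStrategy::Default
            };
            // Simplification: the hint is not propagated below the union.
            Plan::Union(children.iter().map(|c| plan(c, false)).collect(), strategy)
        }
    }
}
```

For the query above, `plan(&query, false)` marks both unions as `SkipFriendly`, while a top-level OR keeps `Default`.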

PSeitz avatar Jul 16 '24 02:07 PSeitz