tantivy
Enrich delete query
Currently, deletes are defined by a term query.
For quickwit, we will need to execute two kinds of delete queries:
- A general purpose delete query defined by terms, boolean operators, etc.
- A delete query defined by a field and a list of values (typically IDs, potentially hundreds of thousands of them).
I don't really know how to implement that in tantivy so I leave this to you @PSeitz @fulmicoton :)
Incidentally, doing it like regular deletes is possible, because tantivy is rather naive in the way it handles deletes.
Deletes are computed after serializing segments. This is suboptimal, but that's the state of tantivy today, so we can maybe abuse it as long as we mark the operation as #[doc(hidden)].
The easiest way to implement this feature is therefore probably to replace the Term in DeleteOperation by a Query.
We can then add a fn delete_query(&mut self, query: Box<dyn Query>) to the IndexWriter and mark it as #[doc(hidden)].
The tricky part might be that we do not have access to a Searcher to run .weight(..) on.
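To make the proposal concrete, here is a rough, self-contained sketch of the idea of carrying a query instead of a term in the delete operation. It is a toy model, not tantivy code: `Doc`, the `Query` trait with a `matches` method, `TermQuery`, `IdSetQuery`, and `ToyWriter` are all hypothetical stand-ins (the real `Query` trait goes through `Weight`/`Scorer`, which is exactly where the Searcher problem comes from). It only illustrates that both kinds of deletes from the issue fit behind one `delete_query` entry point, and that deletes applied after the fact only need to compare opstamps.

```rust
use std::collections::HashSet;

/// Toy document: an id plus a text body. Not tantivy's document type.
#[derive(Debug, Clone)]
struct Doc {
    id: u64,
    body: String,
}

/// Hypothetical stand-in for a query: decides whether a document matches.
trait Query {
    fn matches(&self, doc: &Doc) -> bool;
}

/// Matches documents whose body contains the given word.
struct TermQuery {
    word: String,
}

impl Query for TermQuery {
    fn matches(&self, doc: &Doc) -> bool {
        doc.body.split_whitespace().any(|w| w == self.word)
    }
}

/// Matches documents whose id belongs to a (potentially very large) set.
struct IdSetQuery {
    ids: HashSet<u64>,
}

impl Query for IdSetQuery {
    fn matches(&self, doc: &Doc) -> bool {
        self.ids.contains(&doc.id)
    }
}

/// Delete operation carrying a query instead of a single term.
struct DeleteOperation {
    opstamp: u64,
    target: Box<dyn Query>,
}

/// Toy writer: each document is stored with the opstamp at which it was added.
#[derive(Default)]
struct ToyWriter {
    next_opstamp: u64,
    docs: Vec<(u64, Doc)>,
    deletes: Vec<DeleteOperation>,
}

impl ToyWriter {
    fn stamp(&mut self) -> u64 {
        let stamp = self.next_opstamp;
        self.next_opstamp += 1;
        stamp
    }

    fn add_document(&mut self, doc: Doc) -> u64 {
        let opstamp = self.stamp();
        self.docs.push((opstamp, doc));
        opstamp
    }

    /// Queue a delete; nothing is removed until `commit`.
    fn delete_query(&mut self, query: Box<dyn Query>) -> u64 {
        let opstamp = self.stamp();
        self.deletes.push(DeleteOperation { opstamp, target: query });
        opstamp
    }

    /// Apply queued deletes: a delete only affects documents added before it.
    /// Returns the ids of the surviving documents.
    fn commit(&mut self) -> Vec<u64> {
        let deletes = std::mem::take(&mut self.deletes);
        self.docs.retain(|(doc_stamp, doc)| {
            !deletes
                .iter()
                .any(|del| del.opstamp > *doc_stamp && del.target.matches(doc))
        });
        self.docs.iter().map(|(_, doc)| doc.id).collect()
    }
}

/// Small usage example exercising both kinds of delete queries.
fn demo() -> Vec<u64> {
    let mut writer = ToyWriter::default();
    for (id, body) in [(1, "red apple"), (2, "green pear"), (3, "red plum"), (4, "blue fig")] {
        writer.add_document(Doc { id, body: body.to_string() });
    }
    writer.delete_query(Box::new(TermQuery { word: "red".to_string() }));
    writer.delete_query(Box::new(IdSetQuery { ids: HashSet::from([4]) }));
    // Added after the deletes, so the "red" delete does not touch it.
    writer.add_document(Doc { id: 5, body: "red cherry".to_string() });
    writer.commit()
}
```

In this toy, `demo()` returns `[2, 5]`: documents 1 and 3 match the word delete, document 4 is in the id set, and document 5 survives because it was added after both deletes. In tantivy itself, evaluating the query would presumably have to happen per segment, which is where the missing Searcher becomes the hard part.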
Fixed by #1535 and #1539.