
Enrich delete query

Open fmassot opened this issue 3 years ago • 1 comments

Currently, deletes are defined by a term query.

For quickwit, we will need to execute two kinds of delete queries:

  • A general purpose delete query defined by terms, boolean operators, etc.
  • A delete query defined by a field and a list of values (typically IDs, potentially hundreds of thousands of them).
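To make the second case concrete, here is a minimal sketch of a delete driven by a large list of IDs. It is a toy model, not tantivy code: `Doc`, `IdSetFilter` and `delete_by_ids` are hypothetical names invented for illustration. The only point is that a hash set keeps each membership check cheap even with hundreds of thousands of values.

```rust
use std::collections::HashSet;

/// Toy document with a single ID field (hypothetical, not tantivy's document type).
#[derive(Debug, Clone)]
pub struct Doc {
    pub id: String,
    pub body: String,
}

/// Hypothetical matcher for "id field is one of these values".
pub struct IdSetFilter {
    ids: HashSet<String>,
}

impl IdSetFilter {
    /// Collects the values once so that each later lookup is a hash probe.
    pub fn new<I, S>(ids: I) -> Self
    where
        I: IntoIterator<Item = S>,
        S: Into<String>,
    {
        IdSetFilter {
            ids: ids.into_iter().map(Into::into).collect(),
        }
    }

    pub fn matches(&self, doc: &Doc) -> bool {
        self.ids.contains(&doc.id)
    }
}

/// Removes every matching document and returns how many were deleted.
pub fn delete_by_ids(docs: &mut Vec<Doc>, filter: &IdSetFilter) -> usize {
    let before = docs.len();
    docs.retain(|doc| !filter.matches(doc));
    before - docs.len()
}
```

A real implementation would presumably work on the index's term dictionary rather than scanning documents; the sketch only shows the shape of the request.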

I don't really know how to implement that in tantivy so I leave this to you @PSeitz @fulmicoton :)

fmassot avatar Aug 29 '22 16:08 fmassot

Incidentally, doing it like regular deletes is possible, because tantivy is rather naive in the way it handles deletes. Deletes are computed after serializing segments. This is suboptimal, but that's the state of tantivy today, so we can maybe abuse it as long as we mark the operation as #[doc(hidden)].

The easiest way to implement this feature is therefore probably to replace the Term in DeleteOperation by a Query. We can then add a fn delete_query(&mut self, query: Box<dyn Query>) to the IndexWriter and mark it as #[doc(hidden)].
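A rough sketch of that idea, as a self-contained toy model rather than tantivy's real types: `ToyWriter`, `StoredDoc`, `ContainsWord` and `And` are hypothetical, and the `Query` trait here is a simplified stand-in for tantivy's. The sketch assumes each operation gets an increasing opstamp and that a delete only affects documents added before it.

```rust
pub type Opstamp = u64;

/// Toy stored document (hypothetical).
pub struct StoredDoc {
    pub opstamp: Opstamp,
    pub text: String,
}

/// Simplified stand-in for a query: it only answers "does this doc match?".
pub trait Query {
    fn matches(&self, doc: &StoredDoc) -> bool;
}

/// Matches documents containing the given whitespace-separated word.
pub struct ContainsWord(pub String);

impl Query for ContainsWord {
    fn matches(&self, doc: &StoredDoc) -> bool {
        doc.text.split_whitespace().any(|w| w == self.0.as_str())
    }
}

/// Boolean AND of two queries.
pub struct And(pub Box<dyn Query>, pub Box<dyn Query>);

impl Query for And {
    fn matches(&self, doc: &StoredDoc) -> bool {
        self.0.matches(doc) && self.1.matches(doc)
    }
}

/// The delete operation carries a query where it used to carry a term.
pub struct DeleteOperation {
    pub opstamp: Opstamp,
    pub query: Box<dyn Query>,
}

#[derive(Default)]
pub struct ToyWriter {
    next_opstamp: Opstamp,
    docs: Vec<StoredDoc>,
    deletes: Vec<DeleteOperation>,
}

impl ToyWriter {
    fn stamp(&mut self) -> Opstamp {
        let opstamp = self.next_opstamp;
        self.next_opstamp += 1;
        opstamp
    }

    pub fn add_document(&mut self, text: &str) -> Opstamp {
        let opstamp = self.stamp();
        self.docs.push(StoredDoc { opstamp, text: text.to_string() });
        opstamp
    }

    /// Queues a delete; nothing is removed until `commit`.
    pub fn delete_query(&mut self, query: Box<dyn Query>) -> Opstamp {
        let opstamp = self.stamp();
        self.deletes.push(DeleteOperation { opstamp, query });
        opstamp
    }

    /// Applies queued deletes to documents added before each delete,
    /// then returns the text of the surviving documents.
    pub fn commit(&mut self) -> Vec<String> {
        let deletes = std::mem::take(&mut self.deletes);
        self.docs.retain(|doc| {
            !deletes
                .iter()
                .any(|del| doc.opstamp < del.opstamp && del.query.matches(doc))
        });
        self.docs.iter().map(|doc| doc.text.clone()).collect()
    }
}
```

Resolving deletes inside `commit` loosely mirrors the "deletes are computed after serializing segments" behaviour described above; the toy matches documents directly and so sidesteps the Searcher question entirely.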

The tricky part might be that we do not have access to a Searcher to run .weight(..) on.

fulmicoton avatar Aug 30 '22 07:08 fulmicoton

fixed by #1535 and #1539

trinity-1686a avatar Sep 28 '22 08:09 trinity-1686a