BlackLab

Use two-phase iterators

Open jan-niestadt opened this issue 3 years ago • 2 comments

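Two-phase iterators are a mechanism in Lucene that can speed up SpanQueries. A cheap first phase (the approximation) skips documents that cannot possibly contain any matches; only the remaining candidates go through the more expensive second phase, which reads position data to confirm a match. This might speed up more complex queries in particular.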
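To make the idea concrete, here is a minimal, self-contained sketch of two-phase iteration. It deliberately does not use the Lucene API: `TwoPhaseSketch`, `AdjacentTerms`, `nextApproximation()` and the in-memory "postings" maps are all hypothetical names invented for illustration. In Lucene itself the corresponding pieces are `TwoPhaseIterator.approximation()` and `TwoPhaseIterator.matches()`. The toy query looks for documents where one term is immediately followed by another.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

public class TwoPhaseSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Toy stand-in (not Lucene's class) for a two-phase iterator over "first term directly followed by second term". */
    static final class AdjacentTerms {
        // Hypothetical in-memory postings: doc id -> sorted token positions of the term.
        private final SortedMap<Integer, int[]> first;
        private final SortedMap<Integer, int[]> second;
        private final Iterator<Integer> firstDocs;
        int doc = -1;
        int positionChecks = 0; // counts how often the expensive phase runs

        AdjacentTerms(SortedMap<Integer, int[]> first, SortedMap<Integer, int[]> second) {
            this.first = first;
            this.second = second;
            this.firstDocs = first.keySet().iterator();
        }

        /** Phase 1: move to the next document that contains both terms, looking at doc ids only. */
        int nextApproximation() {
            while (firstDocs.hasNext()) {
                int candidate = firstDocs.next();
                if (second.containsKey(candidate)) {
                    doc = candidate;
                    return doc;
                }
            }
            doc = NO_MORE_DOCS;
            return doc;
        }

        /** Phase 2: read positions for the current candidate and check that the terms are adjacent. */
        boolean matches() {
            positionChecks++;
            int[] secondPositions = second.get(doc);
            for (int position : first.get(doc)) {
                if (Arrays.binarySearch(secondPositions, position + 1) >= 0) {
                    return true;
                }
            }
            return false;
        }
    }

    /** Collects the documents that survive both phases. */
    static List<Integer> search(AdjacentTerms it) {
        List<Integer> hits = new ArrayList<>();
        while (it.nextApproximation() != NO_MORE_DOCS) {
            if (it.matches()) {
                hits.add(it.doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        SortedMap<Integer, int[]> quick = new TreeMap<>();
        quick.put(0, new int[] {1});
        quick.put(1, new int[] {4});
        quick.put(3, new int[] {0, 7});
        SortedMap<Integer, int[]> fox = new TreeMap<>();
        fox.put(1, new int[] {5});
        fox.put(2, new int[] {2});
        fox.put(3, new int[] {3});

        AdjacentTerms it = new AdjacentTerms(quick, fox);
        System.out.println(search(it) + " after " + it.positionChecks + " position checks");
    }
}
```

In this toy index there are four documents (0 to 3), but positions are only inspected for documents 1 and 3, the two that contain both terms. Document 1 is the only confirmed hit. Whether a similar saving shows up in BlackLab would depend on how expensive the second phase is for a given query.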

jan-niestadt, Dec 15 '21 13:12

Lucene docs: https://lucene.apache.org/__root/docs.lucene.apache.org/core/8_0_0/core/org/apache/lucene/search/spans/Spans.html#asTwoPhaseIterator--

It will probably be necessary to read the Lucene code to get a decent understanding of how it uses two-phase iterators before implementing our own.

jan-niestadt, Apr 14 '22 07:04

We might also want to refer to Mtas, which already uses two-phase iterators.

jan-niestadt, Jul 04 '22 10:07

We should probably also look at Weight.isCacheable(). See BLSpanWeight.

jan-niestadt, Apr 18 '23 12:04