Adrien Grand
I'd rather do this by relying on a stable sort than by tracking an additional variable. It will require doing everything by sorting instead of using a `select()` first to...
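To illustrate the stable-sort idea in isolation, here is a minimal Java sketch. The `Item` record and `sortedNames` helper are hypothetical and are not taken from the change under review. It only shows that `List.sort` is stable, so elements with equal keys keep their input order without a separate tie-breaking variable.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class StableSortSketch {
    // Hypothetical stand-in for whatever is being ordered; not a Lucene class.
    record Item(String name, int key) {}

    // List.sort is documented to be stable: elements that compare as equal
    // keep the relative order they had in the input.
    static List<String> sortedNames(List<Item> items) {
        List<Item> copy = new ArrayList<>(items);
        copy.sort(Comparator.comparingInt(Item::key));
        List<String> names = new ArrayList<>();
        for (Item item : copy) {
            names.add(item.name());
        }
        return names;
    }

    public static void main(String[] args) {
        List<Item> items = List.of(
            new Item("a", 2), new Item("b", 1), new Item("c", 2), new Item("d", 1));
        // Ties on key are resolved by input order: prints [b, d, a, c]
        System.out.println(sortedNames(items));
    }
}
```

The trade-off is the one the comment mentions: a full sort is more work than a selection step, in exchange for not having to carry an explicit ordering field.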
I wonder if this is something that could be implemented in the merge scheduler rather than in the merge policy. Thinking out loud: the merge policy's responsibility is to compute...
Intuitively, I had thought of the "throttle at start" approach, where we would also give `MS` the ability to filter out some merges from `MP` (so that they don't get...
> That's the only thing that prevents MergePolicy from e.g. simply picking that merge again.

I wonder if we actually need to prevent it from picking the same merge again....
I wonder if you should override `intoBitSet` to delegate to the wrapped iterator. This would copy bits in batches instead of one-by-one. This is something that not only happens when...
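The delegation idea can be sketched with a simplified, self-contained model. Everything below is hypothetical: `DocIterator`, `RangeIterator` and the wrapper classes are illustration-only types with a reduced `intoBitSet(upTo, bits)` signature, and they use `java.util.BitSet` rather than Lucene's own classes. The point is only that a wrapper which does not override the bulk method falls back to a per-document loop, while one that delegates lets the wrapped iterator copy a whole range at once.

```java
import java.util.BitSet;

public class IntoBitSetSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    // Hypothetical, simplified iterator; NOT Lucene's DocIdSetIterator.
    interface DocIterator {
        int docID();    // current doc, -1 before iteration starts
        int nextDoc();  // advance; returns the next doc or NO_MORE_DOCS

        // Default bulk copy: sets the current doc and following docs < upTo, one by one.
        // The iterator must already be positioned on a doc.
        default void intoBitSet(int upTo, BitSet bits) {
            for (int doc = docID(); doc < upTo; doc = nextDoc()) {
                bits.set(doc);
            }
        }
    }

    // Matches every doc in [min, max) and can copy a whole range in one call.
    static final class RangeIterator implements DocIterator {
        private final int min;
        private final int max;
        private int doc = -1;
        int bulkCalls = 0; // counts uses of the batch path, for demonstration

        RangeIterator(int min, int max) { this.min = min; this.max = max; }

        public int docID() { return doc; }

        public int nextDoc() {
            if (doc == -1) doc = min;
            else if (doc != NO_MORE_DOCS) doc++;
            if (doc >= max) doc = NO_MORE_DOCS;
            return doc;
        }

        @Override
        public void intoBitSet(int upTo, BitSet bits) {
            bulkCalls++;
            if (doc >= upTo) return;
            int end = Math.min(upTo, max);
            bits.set(doc, end); // sets [doc, end) in a single batch
            doc = end >= max ? NO_MORE_DOCS : end;
        }
    }

    // Wrapper without an intoBitSet override: inherits the one-by-one default.
    static class ForwardingIterator implements DocIterator {
        protected final DocIterator in;
        ForwardingIterator(DocIterator in) { this.in = in; }
        public int docID() { return in.docID(); }
        public int nextDoc() { return in.nextDoc(); }
    }

    // Wrapper that delegates the bulk method to the wrapped iterator.
    static final class DelegatingIterator extends ForwardingIterator {
        DelegatingIterator(DocIterator in) { super(in); }
        @Override
        public void intoBitSet(int upTo, BitSet bits) {
            in.intoBitSet(upTo, bits);
        }
    }

    // Positions the iterator on its first doc, then copies all docs < upTo.
    static BitSet collect(DocIterator it, int upTo) {
        BitSet bits = new BitSet();
        it.nextDoc();
        it.intoBitSet(upTo, bits);
        return bits;
    }
}
```

Both wrappers produce the same bits; only `DelegatingIterator` reaches the batch path of the wrapped iterator.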
(Put differently, I agree that we're currently overestimating the performance of pre-filtering by enabling a rare optimization that is extremely effective, but in my opinion your patch is making us...
This makes me wonder if we could somehow benchmark pre-filtering against a `TermQuery` as a filter to make it more realistic.
It's still linear unfortunately. But performance of loading filters based on postings lists into bit sets should be ~3x better since Lucene 10.2 (cf. annotations HS, HX and HY at...
Reading this issue made me wonder if we could make points indexes better by indexing them by doc ID, i.e. treating doc IDs like any other dimension. Then there should...
> Maybe there is a hybrid approach? For example, when concurrent segment search is being initialized, it can try calling clone() for Scorer/BulkScorer, but if it throws CloneNotSupportedException, we fall...