Adrien Grand

Results: 310 comments by Adrien Grand

> but I guess as an execution strategy it kind of made sense to me -- is it really necessary to clone Scorers? Could we create new ones for each...

> I guess one alternative is to maintain multiple IndexSearchers with different characteristics

Since IndexSearcher is very cheap to create, you could create a new `IndexSearcher` for every search? This...

I'd really like to keep intra-segment parallelism simple and stick to splitting the doc ID space, which is the most natural approach for queries that produce good iterators like term...

> An outright madvise call should be about as expensive as the isLoaded check when things are already in the page cache

The PR where `consecutivePrefetchHitCount` was introduced had a...

Well, you may be right as well that the cost of `MS::isLoaded` is of a similar order of magnitude as `madvise`. What the current logic does is that if you...

> Seems we just trade an isLoaded for an madvise on systems with enough memory?

This is correct. I made this suggestion because it was similar to your initial proposal:...

`FlatVectorsFormat` is an internal abstraction layer for vector formats that helps configure the way vectors are stored (e.g. quantized or not) independently from how they're indexed. I'm not a fan...

Thanks, I had missed the quantization requirement and that you were ok with configuring a codec on the `IndexWriter`.

Thank you, this looks good. If you have cycles to run benchmarks, that would be appreciated. The instructions are here: https://github.com/mikemccand/luceneutil/blob/main/README.md#running-the-knn-benchmark

The only missing thing is an entry in lucene/CHANGES.txt but we can deal with it later.