Michael McCandless
I don't understand why we would need to move away from segments, to tinker more with async IO? Whatever we could do with a single segment, we could also do...
> Each term initially gets a page for its posting and related data. If the term is popular, progressively assign it chunks of multiple pages. Pages/chunks are chained using forward...
This might be a needle-moving optimization for apps that reuse a single `TermsEnum` and seek randomly to terms, right? Because all up and down the stack of `SegmentTermsEnumFrame`s we can...
Thanks @vsop-479 I will try to re-engage here soon!
> This doesn't look like a problem with regular KNN vector queries, only appears with parent-join query benchmarks. Hmm it's odd for the 500K docs case that recall is so...
Whoa, exciting! I will try to review soon! Thanks @gf2121.
I've also opened https://github.com/mikemccand/luceneutil/issues/267 to understand why our nightly benchmarks didn't notice this. @uschindler maybe you have an idea!
> improve things somewhat (~6x, fwiw) Uhm, ~6x seems a lot more than just a "somewhat" to me! It's spooky that such a workaround (forcing only one thread to `.close()`...
> Could Lucene ever have this directly in one of its modules? We currently use the `FlatVectorsScorer` to plug in the "native code optimized" alternative, when scoring Scalar Quantized vectors. But...
> It makes some changes to the build: specifically the Java code statically picks the best MethodHandle (SVE, Neon, Generic), and it's able to compile Generic on any architecture/compiler (e.g....