Michael McCandless

216 comments by Michael McCandless

SPANN is another option? https://www.researchgate.net/publication/356282356_SPANN_Highly-efficient_Billion-scale_Approximate_Nearest_Neighbor_Search

(listening to @jbellis talk at Community over Code).

Or perhaps we "just" make a Lucene Codec component (KnnVectorsFormat) that wraps jvector? (https://github.com/jbellis/jvector)

> I've got my framework set up for testing larger than memory indexes and have some somewhat interesting first results.

Thank you for setting this up @kevindrosendahl -- these are...

I don't think you need to wrap `ReaderContext` classes -- you can create your new `TimeoutLeafReader` class, subclassing `FilterLeafReader`, and overriding the methods (likely with additional wrapping on their returned...

Hmm I'm confused: why would you need to get to the `TimeoutLeafReader`? Don't you create this timeout reader, passing the timeout to it (which will apply to all queries) and...

Hi @Deepika0510 -- what is the problem when callers access the leaves? Since you would subclass `FilterLeafReader` (which subclasses `LeafReader`) it should be fine for existing code? Like that line...
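The wrapping idea in the three comments above (a filter-style reader that delegates to the wrapped reader and also wraps what its methods return) can be illustrated without Lucene on the classpath. The following is only a sketch of the delegation pattern: `DocSource`, `TimeoutDocSource`, `TimeExceededException`, and `countDocs` are hypothetical stand-ins invented for this example, not Lucene types, and a real `TimeoutLeafReader` would extend `FilterLeafReader` and override whichever methods the truncated comments go on to name.

```java
import java.util.Iterator;
import java.util.function.LongSupplier;

public class TimeoutWrapperSketch {

    /** Hypothetical stand-in for a reader that exposes doc IDs; not a Lucene type. */
    interface DocSource {
        Iterator<Integer> docs();
    }

    /** Thrown once the deadline has passed (hypothetical name). */
    static class TimeExceededException extends RuntimeException {
        TimeExceededException(String message) {
            super(message);
        }
    }

    /** Delegates to the wrapped source and checks the deadline on every iteration step. */
    static class TimeoutDocSource implements DocSource {
        private final DocSource in;
        private final LongSupplier clockNanos;
        private final long deadlineNanos;

        TimeoutDocSource(DocSource in, LongSupplier clockNanos, long timeoutNanos) {
            this.in = in;
            this.clockNanos = clockNanos;
            // The timeout is fixed when the wrapper is created and applies to all later use.
            this.deadlineNanos = clockNanos.getAsLong() + timeoutNanos;
        }

        private void checkTimeout() {
            if (clockNanos.getAsLong() - deadlineNanos > 0) {
                throw new TimeExceededException("timeout exceeded");
            }
        }

        @Override
        public Iterator<Integer> docs() {
            final Iterator<Integer> delegate = in.docs();
            // Wrap the returned iterator as well, so iteration itself is time-limited.
            return new Iterator<Integer>() {
                @Override
                public boolean hasNext() {
                    checkTimeout();
                    return delegate.hasNext();
                }

                @Override
                public Integer next() {
                    checkTimeout();
                    return delegate.next();
                }
            };
        }
    }

    /** Existing caller code: it only sees the DocSource interface, wrapped or not. */
    static int countDocs(DocSource source) {
        int count = 0;
        for (Iterator<Integer> it = source.docs(); it.hasNext(); it.next()) {
            count++;
        }
        return count;
    }
}
```

Because the wrapper implements the same interface as the thing it wraps, `countDocs` needs no changes, which is the point made above about existing code that accesses the leaves. The injected clock is an assumption made here so the behavior can be exercised deterministically.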

Whoa, very cool @jpountz! This reminds me of [this longstanding issue/paper](https://github.com/apache/lucene/issues/4036) which also inlined skip data directly in the postings, but maybe was still multi-level?

I think this is a reasonable hook to add, but could you maybe add javadocs to the new protected method? Maybe something like: ``` /** Applications can subclass and override...

I wonder whether `Arrays.sort` might be a good choice instead of making our own powerful sorting classes? [OpenJDK is (gradually?) taking advantage of fast SIMD sorting](https://github.com/apache/lucene/issues/12399) so at some point...
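For reference, a minimal sketch of what leaning on the JDK's `Arrays.sort` looks like for primitive arrays. The class and method names (`ArraysSortSketch`, `sortedCopy`, `sortRange`) are hypothetical helpers for illustration only; whether this fits the code under discussion depends on context the truncated comment does not show.

```java
import java.util.Arrays;

public class ArraysSortSketch {

    /** Returns a sorted copy and leaves the input array untouched. */
    static int[] sortedCopy(int[] values) {
        int[] copy = Arrays.copyOf(values, values.length);
        Arrays.sort(copy); // ascending numeric order
        return copy;
    }

    /** Sorts only the slice values[from, to) in place; 'to' is exclusive. */
    static void sortRange(int[] values, int from, int to) {
        Arrays.sort(values, from, to);
    }
}
```

The range overload is the relevant one when only part of a larger buffer needs ordering, since it avoids copying the slice out first.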