Robert Muir
I don't agree. I think the problems are flaws in HNSW itself and can't be worked around. It's too slow already at 768, and in fact the current limit overpromises...
Hi, a couple of suggestions: 1. Somehow, we need to avoid Vector API code inside the MemorySegment code. Just because MemorySegment is available does not mean the Vector API is usable, one...
> I'm surprised by how slow this is with AVX off given that this can be implemented with SSE2 :(. Yes, it is surprising: we found the same situation with...
We should also be careful about introducing complex CharFilters. I consider the current CharFilter API broken after debugging #11976; see https://github.com/apache/lucene/issues/11976#issuecomment-1328150137
Closing as the PR has been merged and is in the 9.5.0 section of CHANGES.txt
Yes, this would be nice when discussing issues such as https://github.com/apache/lucene/issues/12203. Otherwise, I think merges are currently too opaque when discussing index performance: but we "know" certain parts are way...
I think it is enough to just use a bigger vector size that better represents the performance issues? Maybe it looks like the current graph for users only using 100...
thanks a lot for posting these indexer benchmarks @msokolov
Everything in search is an approximation: BM25, etc. There's absolutely no reason to give KNN some kind of free pass to leniency park. Leniency isn't going to help anything...
We could track recall/precision/MAP for BM25 scoring too, but we don't. We are strict, and it gives some confidence that scoring is working correctly: it hasn't changed unless we intended it...