Adrien Grand
Thank you. We may want to look into making this better in follow-ups, but this looks good enough to me, so I merged. :+1:
I'm taking a look now.
The seed did not reproduce for me, but I think I understand the problem. The code assumes that if `a + b > c` then `a - ε + b...
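For anyone following along, here is a minimal sketch of the general hazard, not the actual `ScorerUtil` code. It only shows that an inequality rearranged as if the operands were real numbers can disagree with the inequality evaluated on the rounded floating-point sum. The function names are hypothetical, and Python floats are 64-bit doubles, whereas Lucene scores are 32-bit Java floats; the same rounding effect applies at either width.

```python
def looks_competitive(a: float, b: float, c: float) -> bool:
    # Hypothetical helper: rearranged check, valid for real numbers.
    # Over the reals, a > c - b is equivalent to a + b > c.
    return a > c - b


def is_competitive(a: float, b: float, c: float) -> bool:
    # Hypothetical helper: the check on the actual (rounded) sum.
    return a + b > c


# 2**53 + 1 is not representable as a double, so the sum rounds back to 2**53.
a = 1.0
b = 2.0 ** 53
c = 2.0 ** 53

print(looks_competitive(a, b, c))  # rearranged form: c - b is exactly 0.0
print(is_competitive(a, b, c))     # rounded sum: a is absorbed by b
```

The two checks disagree on this input, which is why a threshold derived by rearranging the inequality generally needs to be verified by redoing the addition.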
I worked on improving the `ScorerUtil` test so that it would catch this problem. It helped me find another problem. I pushed directly. I think we're good now.
Nightly benchmarks confirmed the speedup: https://benchmarks.mikemccandless.com/OrStopWords.html
I remember (but I don't remember where) seeing someone doing multi-tenant vector search by using a flat vector index and enabling index sorting on the tenant ID. Then vector search...
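A rough sketch of that idea, outside of Lucene and under stated assumptions: sorting by tenant ID makes each tenant's rows contiguous, so a brute-force (flat) search only has to scan one range. The function names, the dot-product scoring, and the in-memory lists are all hypothetical illustrations, not Lucene APIs, and the setup I'm remembering may have worked differently.

```python
import bisect
import heapq


def tenant_range(tenant_ids, tenant):
    """Return [start, end) of the rows for `tenant` in a list sorted by tenant ID."""
    start = bisect.bisect_left(tenant_ids, tenant)
    end = bisect.bisect_right(tenant_ids, tenant)
    return start, end


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def flat_search(tenant_ids, vectors, tenant, query, k):
    """Brute-force top-k by dot product, scanning only the tenant's contiguous rows."""
    start, end = tenant_range(tenant_ids, tenant)
    scored = ((dot(vectors[i], query), i) for i in range(start, end))
    return [i for _, i in heapq.nlargest(k, scored)]


# Rows are already sorted by tenant ID, as index sorting would arrange them.
tenant_ids = ["a", "a", "b", "b", "b", "c"]
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.0], [0.0, 2.0], [3.0, 3.0]]

print(flat_search(tenant_ids, vectors, "b", [1.0, 0.0], 2))
```

The cost of a query is then proportional to the size of the tenant rather than the size of the whole index, with no per-document filter check.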
Thanks @mikemccand! I'll wait a few days before merging to give others a chance to take a look.
To be fair, this chart suggests quite a dramatic degradation over time, but these big drops are mostly due to the benchmark becoming harder by increasing the number of dimensions...
I'll need @benwtrent or @msokolov to provide an educated answer to this question, but the issue mentions that the previous value also had worse recall.
@akhilesh-k Nobody else is on it that I know of. I expect conflicts with #14963, but I don't think that should stop you from giving it a try.