Benjamin Trent comments

Results: 373 comments by Benjamin Trent

Fix pre-filter performance testing to truly indicate cost

Maybe the thing to do is to know the percentage provided in the indexer and just randomly set a "true/false" field.
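A minimal sketch of that idea, assuming the indexer is handed the target selectivity as a percentage. The function `assign_filter_flags`, its parameters, and the fixed seed are hypothetical names for illustration, not luceneutil code:

```python
import random


def assign_filter_flags(num_docs, percent_true, seed=42):
    """Return one boolean per doc; each is True with probability percent_true / 100."""
    if not 0 <= percent_true <= 100:
        raise ValueError("percent_true must be between 0 and 100")
    rng = random.Random(seed)  # fixed seed (an assumption) so benchmark runs are repeatable
    threshold = percent_true / 100.0
    # rng.random() is in [0, 1), so 0% yields all False and 100% yields all True.
    return [rng.random() < threshold for _ in range(num_docs)]


flags = assign_filter_flags(100_000, 10)
observed = 100.0 * sum(flags) / len(flags)
print(f"requested 10%, observed {observed:.2f}%")
```

The observed fraction only approximates the requested one, so a filter on the flag would match roughly, not exactly, that share of docs.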

Fix pre-filter performance testing to truly indicate cost

LOL, doing a big termset query increases latency to `70.758` from `2.490` (I upped my test data to be 1M float32 vectors). I will try a simple term "true/false" that...

Fix pre-filter performance testing to truly indicate cost

OK, doing a `true/false` dense filter and creating the bit set is much cheaper (though not very extensible to Lucene util :/): `2.796` vs `2.116`, over 1M docs. JFR...

Fix pre-filter performance testing to truly indicate cost

I wonder if eager evaluation of very dense filters scales logarithmically like HNSW search does. I would expect not? It seems like even if we grab chunks of docs at...
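For intuition on the scaling question, here is a toy sketch (not Lucene code; `build_bitset` and the every-k-th-doc matching rule are assumptions for illustration) that eagerly materializes a filter into a bit set. The loop visits every matching doc, so at a fixed selectivity the work grows linearly with the doc count:

```python
def build_bitset(num_docs, keep_every):
    """Eagerly set one bit per matching doc; return (bitset, docs_visited)."""
    bits = bytearray((num_docs + 7) // 8)
    visited = 0
    # Stand-in for iterating a filter's matches: every keep_every-th doc matches.
    for doc in range(0, num_docs, keep_every):
        bits[doc >> 3] |= 1 << (doc & 7)
        visited += 1
    return bits, visited


for n in (10_000, 100_000, 1_000_000):
    _, visited = build_bitset(n, keep_every=2)  # roughly 50% dense filter
    print(n, visited)
```

This counts visited docs only; it says nothing about measured latency or about how Lucene actually iterates matches.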

Fix pre-filter performance testing to truly indicate cost

> I could imagine filters on ranges as well (e.g. filtering recent data). I'd expect these two (term and range queries) to cover a vast majority of use-cases?

For sure,...

Unmute #111529

@ChrisHegarty looks like only two repeatable failures. Opened separate issues for those.

Unmute #111529

@elasticmachine update branch

Add NDCG and full precision reranking to knn benchmarks

Cool, I like the "baby steps" approach.

Replace need for KnnVectorValues.copy() with a dictionary interface

> but copy a smaller wrapper around any shared scratch vector data

The scratch arrays also need new instances. You are doing this as well, correct?
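To illustrate the concern, here is a toy sketch, assuming a reader that shares immutable backing data but returns vectors through a reusable scratch buffer. `VectorReader` and its methods are hypothetical and are not Lucene's `KnnVectorValues` API:

```python
class VectorReader:
    """Toy reader: shares the immutable backing data, owns a private scratch buffer."""

    def __init__(self, data, dim):
        self.data = data            # shared, read-only backing storage
        self.dim = dim
        self.scratch = [0.0] * dim  # per-instance buffer, never shared

    def vector(self, ordinal):
        """Load the vector at `ordinal` into this instance's scratch and return it."""
        start = ordinal * self.dim
        self.scratch[:] = self.data[start:start + self.dim]
        return self.scratch

    def copy(self):
        """Cheap copy: same backing data, but a fresh scratch buffer."""
        return VectorReader(self.data, self.dim)


a = VectorReader((1.0, 2.0, 3.0, 4.0), dim=2)
b = a.copy()
va = a.vector(0)
vb = b.vector(1)
print(va, vb)
```

If both readers returned the same scratch list, the second `vector` call would overwrite the first result, which is why each copy gets its own buffer in this sketch.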