luceneutil
Add benchmark covering BINARY doc values query-time performance
When we (Amazon product search) upgraded to Lucene 8.5.1, which includes newly added block compression for BINARY doc values, we saw a sizable (~30%) reduction in our red-line QPS (throughput). This might be unique to how we are using BINARY doc values. Other applications would likely not be so heavily affected.
However, it looks like Lucene's nightly benchmarks failed to even detect this performance impact.
Benchmarks were run while the change was being developed, but for some reason they did not seem to show any impact at all, which is surprising.
Let's add something to luceneutil to better test BINARY doc values query-time performance.
I didn't know Lucene offered sorting on a BinaryDocValuesField; I thought it only supported value retrieval. It's not obvious to me from the diff how the sorting happens. Could you point me to the relevant code here and/or in Lucene, please?