lucene
lucene copied to clipboard
Make dynamic range facets value collection and sorting faster
Description
DynamicRangeUtil collects values from each segment and then sorts all the values in the main thread. I wonder if we could get a speed-up from collecting values in each segment, sorting them (maybe doing insertion sort), and then merging the sorted values from all segments, effectively moving more of the work to the executor and doing less in the main thread.
I can take a look at this one next week @stefanvodita, if you don't mind
Please do! I'm happy to help with reviews or if you have questions about dynamic ranges.
Cool, I've found some performance improvements (~10-15%), which can be reproduced through a new jmh benchmark I've added. I'll open a PR the next few days and tag you :)
Another idea that @HoustonPutman had was to collect all the results, then use Quick Sort with the right pivots to determine the quantiles we care about without sorting the entire dataset. Curious what improvements you found @timgrein!
@timgrein , I've posted a PR for my idea that @stefanvodita mentioned. If you have the JMH benchmark, I'd love to test it out on mine as well.
Another option could be to draw inspiration from the Learned Sort algorithm (refer to https://blog.acolyer.org/2020/10/19/the-case-for-a-learned-sorting-algorithm/ and https://learnedsystems.mit.edu/defeating-dups-learned-sort/), which demonstrates particularly fast sorting capabilities also for low-cardinality fields.
Learned Sort looks amazing -- @josefschiefer27 maybe open a dedicated spinoff issue to see if there are other places where it could help Lucene? Lucene does a lot of sorting ... e.g. sorting terms in the per-segment terms dictionary on flush.
I think a more general issue for learned sort already exists: #12463
@timgrein , I've posted a PR for my idea that @stefanvodita mentioned. If you have the JMH benchmark, I'd love to test it out on mine as well.
@HoustonPutman Sounds good, how would you prefer to test it? I can create a PR with the jmh benchmark and we can check, whether it's valuable in the sense of capturing your improvement correctly and then merge it indepedently.
I still have 1-2 ideas in mind I can contribute afterwards. These were more Java specific things and not on the algorithmic side. Speaking of that your change looks like an impressive improvement, very cool that this is also applicable to other parts of the codebase! :)
@timgrein - publishing the benchmark as an independent PR sounds like a great idea!
Just checking in @timgrein, did you share the benchmark you were referencing?