Optimize prefix sum computation in Lucene99HnswVectorsReader; fixes #15024
Replaced the two-step prefix sum loop in Lucene99HnswVectorsReader with a single-loop variant that avoids redundant memory access and improves performance.
Previous approach:
- Read first value separately.
- Then used previous buffer element + readVInt().
New approach:
- Accumulates sum in a single pass and assigns directly.
This change follows the suggestion from issue #14979 and has the same functional behavior with slightly better efficiency.
This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR.
Thank you, this looks good. If you have cycles to run benchmarks, this would be appreciated, you can check out this: https://github.com/mikemccand/luceneutil/blob/main/README.md#running-the-knn-benchmark.
Thank you, this looks good. If you have cycles to run benchmarks, this would be appreciated, you can check out this: https://github.com/mikemccand/luceneutil/blob/main/README.md#running-the-knn-benchmark.
Thank you, Glad it looks good. Before proceeding, I just wanted to check, is there anything you'd like me to adjust or improve before it's ready to be merged?
Also, I’ll take a look at the benchmark tool you linked and try to run some comparisons if time allows. Appreciate the reference!
The only missing thing is an entry in lucene/CHANGES.txt but we can deal with it later.
Perfect. I will be waiting for updates on that matter. Thank you!
@yossev FYI you'll need to resolve conflicts since another change touched the same lines of code. I'm curious if you had any success with benchmarking?
@yossev FYI you'll need to resolve conflicts since another change touched the same lines of code. I'm curious if you had any success with benchmarking?
Thanks for the heads-up! I noticed that the recent changes in Lucene99HnswVectorsReader.java reverted part of the optimization I made according to #14979
I’ll resolve the merge conflicts by keeping that fix (unless there’s a reason it should be reverted), and will adjust it if needed to fit the new changes.
Regarding benchmarking ,I’ll run it after the conflicts are resolved so results reflect the latest code.
I was checking to see what had been reverted, but I don't see it? #14979 didn't change Lucene99HnswVectorReader.java, right? Maybe you're referring to the conflicts since #14932 moved the VInt-decoding block into a conditional?
I was checking to see what had been reverted, but I don't see it? #14979 didn't change Lucene99HnswVectorReader.java, right? Maybe you're referring to the conflicts since #14932 moved the VInt-decoding block into a conditional?
I was just giving credit to #14979 to the optimization i made in this case, not saying it changed anything.
This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR.
Hello @jpountz,
Please check the latest commit, i resolved the conflicts locally, and added back the optimization that was mentioned #14979.
i will be waiting for further feedback.
@yossev see the output of the build failure, the build is unhappy with formatting, you need to run ./gradlew tidy. Have you had any success with benchmarking?
This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!
This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!
I would be happy to merge and backport this once merge conflicts and formatting is tidied @yossev
This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!
Hello @yossev friendly ping :). Once conflicts & formatting are fixed, I can merge and backport.
We should also have a CHANGES entry under optimizations for 10.4.0
This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!