lucene Optimize `PointRangeQuery` for intra-segment concurrency with segment-level `DocIdSet` caching

Description

This PR optimizes PointRangeQuery to efficiently support intra-segment concurrent search by implementing segment-level DocIdSet caching. When a large segment is split into multiple partitions for parallel processing, all partitions now share a single BKD tree traversal result instead of each partition performing a redundant traversal. The solution came out of the discussion in https://github.com/apache/lucene/pull/15383. Related issue for PointRangeQuery with intra-segment search: https://github.com/apache/lucene/issues/13745.

Problem

With intra-segment concurrency enabled, a single segment can be split into multiple partitions, each processed by a different thread. In the current implementation, each partition independently traverses the BKD tree and builds its own DocIdSet. This results in redundant BKD traversals and hurts query latency (see https://github.com/apache/lucene/pull/13542#issuecomment-2332114836).

Solution

Implement a segment-level cache that ensures the BKD tree is traversed only once per segment, with the resulting DocIdSet shared across all partitions:

SegmentDocIdSetSupplier: A new helper class that lazily builds and caches the DocIdSet for an entire segment.
Segment-level cache: A ConcurrentHashMap<LeafReaderContext, SegmentDocIdSetSupplier> in the Weight that ensures all partitions of the same segment share the same supplier.
PartitionScorerSupplier: A new ScorerSupplier implementation that references the shared cache and filters results to the partition's doc ID range.
PartitionFilteredDocIdSetIterator: A lightweight iterator wrapper that filters the shared full-segment DocIdSet to only return docs within the partition's range.
Still pending: getting the cost() methods right, adding tests, and some code cleanup. Local testing details are in https://github.com/apache/lucene/pull/15446#issuecomment-3568992048.
The behavior is unchanged when intra-segment search is disabled; that case is handled in the existing scorerSupplier(LeafReaderContext context) method.
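
A minimal sketch of the caching idea described above, in plain Java without any Lucene dependency. The names `SegmentCacheSketch`, `LazySegmentDocs`, `supplierFor` and `countTraversals` are hypothetical stand-ins, not the PR's classes; a `BitSet` stands in for the DocIdSet, a `Supplier` stands in for the BKD traversal, and an arbitrary `Object` key stands in for the `LeafReaderContext`. It assumes a `computeIfAbsent` lookup plus double-checked locking, which is what the debug logs in the first comment of this thread suggest.

```java
import java.util.BitSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Illustrative stand-ins only; these are not the PR's Lucene classes. */
final class SegmentCacheSketch {

  /** Builds the full-segment result at most once; later callers reuse it. */
  static final class LazySegmentDocs {
    private final Supplier<BitSet> traversal; // stands in for the BKD traversal
    private volatile BitSet cached;

    LazySegmentDocs(Supplier<BitSet> traversal) {
      this.traversal = traversal;
    }

    BitSet getOrBuild() {
      BitSet result = cached;
      if (result == null) {
        synchronized (this) { // threads that lose the race wait here, then reuse
          result = cached;
          if (result == null) {
            result = traversal.get();
            cached = result;
          }
        }
      }
      return result;
    }
  }

  // One entry per segment; the key stands in for LeafReaderContext.
  private final ConcurrentHashMap<Object, LazySegmentDocs> cache = new ConcurrentHashMap<>();

  LazySegmentDocs supplierFor(Object segmentKey, Supplier<BitSet> traversal) {
    return cache.computeIfAbsent(segmentKey, k -> new LazySegmentDocs(traversal));
  }

  /** Simulates one thread per partition on a single segment; returns how many traversals ran. */
  static int countTraversals(int partitions) {
    SegmentCacheSketch weight = new SegmentCacheSketch();
    Object segment = new Object();
    AtomicInteger traversals = new AtomicInteger();
    Supplier<BitSet> traversal = () -> {
      traversals.incrementAndGet();
      BitSet docs = new BitSet();
      docs.set(0, 1_000); // pretend docs [0, 1000) match the range
      return docs;
    };
    Thread[] threads = new Thread[partitions];
    for (int i = 0; i < partitions; i++) {
      threads[i] = new Thread(() -> weight.supplierFor(segment, traversal).getOrBuild());
      threads[i].start();
    }
    try {
      for (Thread t : threads) {
        t.join();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return traversals.get();
  }

  public static void main(String[] args) {
    System.out.println("traversals for 5 partitions: " + countTraversals(5));
  }
}
```

With five simulated partitions the traversal counter ends at 1, because `computeIfAbsent` hands every thread the same supplier and the synchronized block lets only the first caller build. The trade-off is that the other threads block while the build runs.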

Performance Impact: Seen good improvement with `IntNRQ`

Tested with intra-segment search enabled on both candidate and baseline.

                            Task    QPS baseline      StdDev    QPS my_modified_version      StdDev                Pct diff p-value
                         Respell       21.82     (11.8%)       19.72      (7.4%)   -9.6% ( -25% -   10%) 0.123
       BrowseDayOfYearSSDVFacets        3.14     (18.6%)        3.01     (12.2%)   -4.0% ( -29% -   32%) 0.688
            HighTermTitleBDVSort       10.51      (3.7%)       10.11      (4.5%)   -3.7% ( -11% -    4%) 0.152
             MedIntervalsOrdered       36.53      (7.0%)       35.68      (6.3%)   -2.3% ( -14% -   11%) 0.576
                    OrNotHighLow      435.41      (3.5%)      425.81      (1.3%)   -2.2% (  -6% -    2%) 0.192
         AndHighMedDayTaxoFacets       63.83      (1.8%)       62.45      (4.0%)   -2.2% (  -7% -    3%) 0.271
               HighTermMonthSort      207.85      (4.6%)      203.35      (4.0%)   -2.2% ( -10% -    6%) 0.431
                   OrNotHighHigh       36.19     (16.2%)       35.57     (11.9%)   -1.7% ( -25% -   31%) 0.849
           BrowseMonthSSDVFacets        3.12      (9.4%)        3.07     (13.6%)   -1.5% ( -22% -   23%) 0.839
                      HighPhrase       22.87      (2.6%)       22.58      (2.6%)   -1.3% (  -6% -    3%) 0.437
           HighTermDayOfYearSort       25.58      (2.4%)       25.26      (3.1%)   -1.3% (  -6% -    4%) 0.477
                     LowSpanNear       17.70      (3.0%)       17.53      (3.1%)   -1.0% (  -6% -    5%) 0.617
                HighSloppyPhrase        9.39      (2.8%)        9.32      (2.8%)   -0.7% (  -6% -    4%) 0.674
                    HighSpanNear       14.31      (3.7%)       14.21      (0.7%)   -0.7% (  -4% -    3%) 0.692
                    OrHighNotLow      148.21     (12.2%)      147.26     (11.5%)   -0.6% ( -21% -   26%) 0.932
                      TermDTSort       26.59      (4.0%)       26.43      (2.2%)   -0.6% (  -6% -    5%) 0.760
                   OrHighNotHigh       50.27     (11.8%)       50.05     (12.1%)   -0.4% ( -21% -   26%) 0.955
                       OrHighMed      111.27      (8.8%)      110.89      (8.6%)   -0.3% ( -16% -   18%) 0.950
                      AndHighMed      210.73      (2.8%)      210.06      (1.9%)   -0.3% (  -4% -    4%) 0.834
                        Wildcard       25.46      (1.5%)       25.40      (2.3%)   -0.3% (  -3% -    3%) 0.835
                    OrNotHighMed       60.00     (14.7%)       59.88     (12.5%)   -0.2% ( -23% -   31%) 0.981
            BrowseDateSSDVFacets        0.53     (16.0%)        0.53     (18.6%)   -0.1% ( -29% -   41%) 0.994
                        HighTerm      242.89      (7.4%)      243.56     (10.1%)    0.3% ( -16% -   19%) 0.961
                           range     2796.48      (7.4%)     2805.50      (3.1%)    0.3% (  -9% -   11%) 0.928
       BrowseDayOfYearTaxoFacets        2.09      (7.3%)        2.09     (11.5%)    0.4% ( -17% -   20%) 0.952
                      OrHighHigh       35.14      (9.7%)       35.32     (12.6%)    0.5% ( -19% -   25%) 0.943
                 MedSloppyPhrase       11.84      (1.4%)       11.91      (3.9%)    0.6% (  -4% -    6%) 0.746
                         Prefix3       33.61      (3.1%)       33.84      (2.8%)    0.7% (  -5% -    6%) 0.717
             LowIntervalsOrdered       99.26      (2.7%)       99.96      (3.8%)    0.7% (  -5% -    7%) 0.737
            MedTermDayTaxoFacets       16.22      (6.0%)       16.35      (8.0%)    0.8% ( -12% -   15%) 0.859
            HighIntervalsOrdered        2.98     (12.3%)        3.01      (8.0%)    0.8% ( -17% -   24%) 0.897
                          IntSet      140.75      (4.4%)      142.36      (5.8%)    1.1% (  -8% -   11%) 0.726
        AndHighHighDayTaxoFacets       12.74      (5.4%)       12.90      (3.5%)    1.3% (  -7% -   10%) 0.647
               HighTermTitleSort       14.03      (1.3%)       14.22      (2.6%)    1.3% (  -2% -    5%) 0.295
     BrowseRandomLabelTaxoFacets        1.72      (5.8%)        1.75      (4.9%)    1.4% (  -8% -   12%) 0.688
          OrHighMedDayTaxoFacets        1.15      (3.8%)        1.17      (5.0%)    1.5% (  -6% -   10%) 0.580
                       MedPhrase       58.94      (4.5%)       59.93      (4.7%)    1.7% (  -7% -   11%) 0.561
                          Fuzzy1       34.72      (7.2%)       35.33      (5.7%)    1.8% ( -10% -   15%) 0.670
                      AndHighLow      531.18      (1.7%)      541.24      (6.5%)    1.9% (  -6% -   10%) 0.527
           BrowseMonthTaxoFacets        2.17      (8.9%)        2.21     (12.3%)    2.0% ( -17% -   25%) 0.772
                       LowPhrase       12.05      (3.8%)       12.34      (3.4%)    2.4% (  -4% -    9%) 0.294
                 LowSloppyPhrase       16.99      (2.2%)       17.40      (3.2%)    2.4% (  -2% -    7%) 0.162
                         LowTerm      459.98     (16.0%)      472.15     (17.4%)    2.6% ( -26% -   42%) 0.803
     BrowseRandomLabelSSDVFacets        2.12      (8.3%)        2.18     (13.3%)    2.9% ( -17% -   26%) 0.678
                     MedSpanNear        4.41      (5.3%)        4.54      (7.2%)    3.1% (  -8% -   16%) 0.434
                     AndHighHigh       48.99      (9.9%)       50.53     (11.9%)    3.1% ( -17% -   27%) 0.650
                       OrHighLow      342.27      (6.5%)      353.27      (3.7%)    3.2% (  -6% -   14%) 0.334
                    OrHighNotMed      117.56     (12.8%)      122.42     (11.0%)    4.1% ( -17% -   31%) 0.582
                         MedTerm      273.79     (15.1%)      285.16     (13.9%)    4.2% ( -21% -   39%) 0.652
                          Fuzzy2       38.07      (9.0%)       39.69     (11.0%)    4.2% ( -14% -   26%) 0.502
                        PKLookup      139.01     (11.7%)      146.39      (7.1%)    5.3% ( -12% -   27%) 0.386
            BrowseDateTaxoFacets        2.02      (5.9%)        2.16     (11.7%)    7.1% (  -9% -   26%) 0.228
                          IntNRQ       12.30      (3.8%)       30.18      (8.2%)  145.3% ( 128% -  163%) 0.000

Related Issues

https://github.com/apache/lucene/issues/13745
~https://github.com/apache/lucene/issues/14485

Nov 24 '25 05:11 prudhvigodithi

Before adding tests, I verified this behavior using https://github.com/msfroh/lucene-university (I will check in the code here as well). Note the following in the logs below:

The segment is divided into 5 partitions, each belonging to a different slice.
The scorer supplier is called by all 5 partitions for the same segment (ctx identity: 857068247).
All 5 threads hit the same supplier ([GET_OR_BUILD] on SegmentDocIdSetSupplier #1557216666291), which was created by thread 41 for partition [400000, 800000).
All partitions share the same cache entry (supplier identity: 1536099041 for all 5).
The BKD traversal happens only once: [BUILD_START] appears on thread 39 and [BUILD_SKIP] on the 4 other threads, so one thread builds the DocIdSet and the other 4 reuse the cached result.

> Task :example.points.IntraSegmentPointRangeTest.main()
=== Intra-Segment Point Range Query Test ===

Step 1: Indexing documents...
Indexing 2000000 documents...
  Indexed 500000 documents...
  Indexed 1000000 documents...
  Indexed 1500000 documents...
Force merging to single segment...
Indexing complete!

Step 2: Opening reader and creating searcher...
Index info:
  Total docs: 2000000
  Number of segments: 1
  Segment 0: 2000000 docs

Creating IndexSearcher with 4 threads

=== Slice Information ===
Number of slices: 5
Slice 0:
  Number of partitions: 1
  Total docs in slice: 400000
    Partition 0:
      Segment: 0
      Doc range: [0, 400000)
      Doc count: 400000
Slice 1:
  Number of partitions: 1
  Total docs in slice: 400000
    Partition 0:
      Segment: 0
      Doc range: [400000, 800000)
      Doc count: 400000
Slice 2:
  Number of partitions: 1
  Total docs in slice: 400000
    Partition 0:
      Segment: 0
      Doc range: [800000, 1200000)
      Doc count: 400000
Slice 3:
  Number of partitions: 1
  Total docs in slice: 400000
    Partition 0:
      Segment: 0
      Doc range: [1200000, 1600000)
      Doc count: 400000
Slice 4:
  Number of partitions: 1
  Total docs in slice: 400000
    Partition 0:
      Segment: 0
      Doc range: [1600000, 2000000)
      Doc count: 400000

Step 3: Executing range query...
Query: value:[0 TO 1499999]
Expected matches: 1500000

Searching (multi-threaded)...


=== Multi-threaded Search Results ===
Total hits: 1500000
Time: 29ms

=== Verification ===
Expected: 1500000
Actual: 1500000
Result: ✓ CORRECT


=== Sample Results (Top 10) ===
Nov 23, 2025 5:12:50 PM org.apache.lucene.internal.vectorization.VectorizationProvider lookup
WARNING: Java vector incubator module is not readable. For optimal vector performance, pass '--add-modules jdk.incubator.vector' to enable Vector API.
[SCORER_SUPPLIER] Called for segment 0 partition [0, 400000) on thread 3 ctx identity: 857068247
[SCORER_SUPPLIER] Called for segment 0 partition [400000, 800000) on thread 41 ctx identity: 857068247
[SCORER_SUPPLIER] Called for segment 0 partition [1200000, 1600000) on thread 39 ctx identity: 857068247
[CACHE_LOOKUP] Before computeIfAbsent, cache size: 0
[CACHE_LOOKUP] Before computeIfAbsent, cache size: 0
[SCORER_SUPPLIER] Called for segment 0 partition [800000, 1200000) on thread 40 ctx identity: 857068247
[CACHE_LOOKUP] Before computeIfAbsent, cache size: 0
[SCORER_SUPPLIER] Called for segment 0 partition [1600000, 2000000) on thread 38 ctx identity: 857068247
[CACHE_LOOKUP] Before computeIfAbsent, cache size: 0
[CACHE_LOOKUP] Before computeIfAbsent, cache size: 0
[CACHE_MISS] CREATING new SegmentDocIdSetSupplier for segment 0 on thread 41
[SUPPLIER_CREATED] SegmentDocIdSetSupplier #1557216666291 for segment 0
[CACHE_RESULT] After computeIfAbsent, cache size: 1, supplier identity: 1536099041
[CACHE_RESULT] After computeIfAbsent, cache size: 1, supplier identity: 1536099041
[CACHE_RESULT] After computeIfAbsent, cache size: 1, supplier identity: 1536099041
[CACHE_RESULT] After computeIfAbsent, cache size: 1, supplier identity: 1536099041
[CACHE_RESULT] After computeIfAbsent, cache size: 1, supplier identity: 1536099041
[GET_OR_BUILD] Called on supplier #1557216666291 for segment 0 on thread 39
[BUILD_CHECK] cachedDocIdSet is null, entering synchronized block
[BUILD_START] Building DocIdSet for segment 0 on thread 39
[GET_OR_BUILD] Called on supplier #1557216666291 for segment 0 on thread 38
[BUILD_CHECK] cachedDocIdSet is null, entering synchronized block
[GET_OR_BUILD] Called on supplier #1557216666291 for segment 0 on thread 3
[BUILD_CHECK] cachedDocIdSet is null, entering synchronized block
[GET_OR_BUILD] Called on supplier #1557216666291 for segment 0 on thread 40
[BUILD_CHECK] cachedDocIdSet is null, entering synchronized block
[GET_OR_BUILD] Called on supplier #1557216666291 for segment 0 on thread 41
[BUILD_CHECK] cachedDocIdSet is null, entering synchronized block
Disconnected from the target VM, address: 'localhost:55600', transport: 'socket'
[BUILD_COMPLETE] Built DocIdSet for segment 0 in 901ms
[BUILD_SKIP] Another thread already built DocIdSet
[BUILD_SKIP] Another thread already built DocIdSet
[BUILD_SKIP] Another thread already built DocIdSet
[BUILD_SKIP] Another thread already built DocIdSet
  Doc 0: value=0, score=1.0
  Doc 1: value=1, score=1.0
  Doc 2: value=2, score=1.0
  Doc 3: value=3, score=1.0
  Doc 4: value=4, score=1.0
  Doc 5: value=5, score=1.0
  Doc 6: value=6, score=1.0
  Doc 7: value=7, score=1.0
  Doc 8: value=8, score=1.0
  Doc 9: value=9, score=1.0

=== Cleanup ===
Shutting down executor...
Done!
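
As a sanity check on the numbers in the log above, here is a small sketch that reproduces the five partition ranges and the expected hit count. The class and method names are hypothetical, and the even split is only an assumption that happens to match this run; Lucene's actual slicing logic may differ. It also assumes every document's value equals its doc ID, as the sample results suggest.

```java
/** Hypothetical helper; reproduces the ranges and hit counts seen in the log. */
final class PartitionRangesSketch {

  /** Splits [0, maxDoc) into `parts` contiguous ranges of near-equal size; each entry is {start, end}. */
  static int[][] split(int maxDoc, int parts) {
    int[][] ranges = new int[parts][2];
    int base = maxDoc / parts;
    int extra = maxDoc % parts; // the first `extra` ranges get one more doc
    int start = 0;
    for (int i = 0; i < parts; i++) {
      int size = base + (i < extra ? 1 : 0);
      ranges[i][0] = start;
      ranges[i][1] = start + size;
      start += size;
    }
    return ranges;
  }

  /** Hits of the inclusive query [lower, upper] inside [start, end), assuming value == doc ID. */
  static int hitsInPartition(int start, int end, int lower, int upper) {
    int from = Math.max(start, lower);
    int to = Math.min(end, upper + 1); // exclusive upper bound
    return Math.max(0, to - from);
  }

  /** Sums the per-partition hits for the whole segment. */
  static int totalHits(int maxDoc, int parts, int lower, int upper) {
    int total = 0;
    for (int[] r : split(maxDoc, parts)) {
      total += hitsInPartition(r[0], r[1], lower, upper);
    }
    return total;
  }

  public static void main(String[] args) {
    for (int[] r : split(2_000_000, 5)) {
      System.out.println("[" + r[0] + ", " + r[1] + ") hits="
          + hitsInPartition(r[0], r[1], 0, 1_499_999));
    }
    System.out.println("total=" + totalHits(2_000_000, 5, 0, 1_499_999));
  }
}
```

Under these assumptions the first three partitions contribute 400000 hits each, the fourth contributes 300000 and the last contributes none, which adds up to the 1500000 total hits reported in the log.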

Visual Flow

LeafReaderContext object (ctx identity: 857068247)
    ↑         ↑         ↑         ↑         ↑
    │         │         │         │         │
Partition  Partition  Partition  Partition  Partition
[0,400K)   [400K,800K) [800K,1.2M) [1.2M,1.6M) [1.6M,2M)
Thread 3   Thread 41   Thread 40   Thread 39   Thread 38

Thread 41: [CACHE_MISS] Creates supplier ─────────────┐
Thread 39: [CACHE_RESULT] Gets supplier ──┐           │
Thread 38: [CACHE_RESULT] Gets supplier ──┤           │
Thread 3:  [CACHE_RESULT] Gets supplier ──┤           │
Thread 40: [CACHE_RESULT] Gets supplier ──┘           │
                                          │           │
                                          ↓           │
                              All 5 threads have      │
                              same supplier           │
                                          │           │
                                          ↓           ↓
Thread 39: [BUILD_START] ← BUILDS the DocIdSet
Thread 38: [BUILD_SKIP]  ← Waits, then reuses the DocIdSet
Thread 3:  [BUILD_SKIP]  ← Waits, then reuses the DocIdSet
Thread 40: [BUILD_SKIP]  ← Waits, then reuses the DocIdSet
Thread 41: [BUILD_SKIP]  ← Waits, then reuses the DocIdSet (even though it created supplier!)

Nov 24 '25 05:11 prudhvigodithi

Looks like a flaky test?

./gradlew :lucene:join:test --tests "org.apache.lucene.search.join.TestBlockJoin.testScoreMode" -Ptests.asserts=true -Ptests.file.encoding=UTF-8 -Ptests.gui=true -Ptests.jvmargs= -Ptests.jvms=4 -Ptests.seed=3014C2CB4BB8490 -Ptests.vectorsize=512

Nov 24 '25 18:11 prudhvigodithi

I am having a hard time understanding why this PR improves the query throughput of IntNRQ. My expectation is that the query spends most of its time traversing the BKD tree and very little time building the result. As this PR still traverses the BKD tree with one thread, I would expect very little change in the query latency. I ran a local test with one of my favourite datasets and, as expected, I did not see any change in latency.

Moreover, I would expect query throughput to be hurt by this change, because all those blocked search threads are doing no work, so concurrent queries will be running with fewer resources. Do you happen to know why QPS is improving? I might be missing something.

Nov 25 '25 08:11 iverase

This idea was inspired by the comments https://github.com/apache/lucene/issues/13745#issuecomment-3062037144 and https://github.com/apache/lucene/pull/15383#issuecomment-3533814898.

> Do you happen to know why QPS is improving?

The idea is that, instead of repeating the same BKD traversal for every partition of a segment, we do one BKD traversal per segment and share the resulting DocIdSet, with each partition iterating only over the documents in its own range.
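
To make the "iterate only over the partition's documents" part concrete, here is a rough sketch of a range-restricting iterator over a shared, sorted list of matching doc IDs. `PartitionFilterSketch`, `DocIterator`, `overSorted`, `restrictTo` and `collect` are hypothetical names for illustration, not the PR's PartitionFilteredDocIdSetIterator; a sorted `int[]` stands in for the shared DocIdSet. The sentinel mirrors Lucene's `DocIdSetIterator.NO_MORE_DOCS`, which is `Integer.MAX_VALUE`.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: filters a shared full-segment result down to one partition's doc range. */
final class PartitionFilterSketch {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** Minimal forward-only iterator over ascending doc IDs. */
  interface DocIterator {
    int nextDoc();
  }

  /** Each partition gets its own cursor over the shared, read-only array. */
  static DocIterator overSorted(int[] sortedDocs) {
    return new DocIterator() {
      private int pos = 0;

      @Override
      public int nextDoc() {
        return pos < sortedDocs.length ? sortedDocs[pos++] : NO_MORE_DOCS;
      }
    };
  }

  /** Wraps `in` so that only docs in [minDoc, maxDoc) are returned. */
  static DocIterator restrictTo(DocIterator in, int minDoc, int maxDoc) {
    return new DocIterator() {
      private boolean exhausted = false;

      @Override
      public int nextDoc() {
        if (exhausted) {
          return NO_MORE_DOCS;
        }
        int doc = in.nextDoc();
        while (doc < minDoc) { // skip docs that belong to earlier partitions
          doc = in.nextDoc();
        }
        if (doc >= maxDoc) { // past this partition (or the input is exhausted)
          exhausted = true;
          return NO_MORE_DOCS;
        }
        return doc;
      }
    };
  }

  /** Collects the docs of one partition from the shared segment-wide result. */
  static List<Integer> collect(int[] segmentDocs, int minDoc, int maxDoc) {
    DocIterator it = restrictTo(overSorted(segmentDocs), minDoc, maxDoc);
    List<Integer> out = new ArrayList<>();
    for (int doc = it.nextDoc(); doc != NO_MORE_DOCS; doc = it.nextDoc()) {
      out.add(doc);
    }
    return out;
  }

  public static void main(String[] args) {
    int[] shared = {1, 3, 5, 8, 9, 12};
    System.out.println(collect(shared, 4, 10)); // partition [4, 10)
  }
}
```

The linear skip in `restrictTo` is only for brevity; a real wrapper would presumably call `advance(minDoc)` on the underlying iterator so that later partitions do not walk over all the earlier doc IDs.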

> I ran a local test with one of my favourite datasets and, as expected, I did not see any change in latency.

I assume you enabled intra-segment search in both cases?

Nov 25 '25 12:11 prudhvigodithi

> I would expect very little change in the query latency. I ran a local test with one of my favourite datasets and, as expected, I did not see any change in latency.

Could I run that test locally as well? So far I have used wikimediumall from https://github.com/mikemccand/luceneutil.

Nov 25 '25 12:11 prudhvigodithi

> I assume you enabled intra-segment search in both cases?

No, for the baseline I used current main with segment-only concurrency. The candidate is this patch.

> Could I run that test locally as well?

I test with the datasets used for the Lucene geospatial benchmarks: https://benchmarks.mikemccandless.com/geobench.html. I merged the index into one segment and used the bounding box query, which uses PointRangeQuery.

Nov 25 '25 13:11 iverase

> No, for the baseline I used current main with segment-only concurrency. The candidate is this patch.

Can you test with intra-segment search enabled on both sides? (In the patch from this PR, intra-segment search is already enabled.) FYI, here are the earlier intra-segment search benchmarks from Lucene: https://github.com/apache/lucene/pull/13542#issuecomment-2332114836

Nov 25 '25 14:11 prudhvigodithi

> I test with the datasets used for the Lucene geospatial benchmarks: https://benchmarks.mikemccandless.com/geobench.html. I merged the index into one segment and used the bounding box query, which uses PointRangeQuery.

I hit the same issue as https://github.com/mikemccand/luceneutil/issues/372#issue-3005741642 when I try to run the geo benchmark. Let me see if I can still test the geospatial benchmarks with one segment and the bounding box query.

Nov 25 '25 14:11 prudhvigodithi

No need, now I understand the results you are providing. I think you should provide the comparison with main for completeness (e.g., is this solution competitive with the current status quo?).

Nov 25 '25 14:11 iverase

> No need, now I understand the results you are providing. I think you should provide the comparison with main for completeness (e.g., is this solution competitive with the current status quo?).

Thanks! Below are the results without enabling intra-segment search (on both lucene_candidate and lucene_baseline), which reflects the current behavior on main.

                            Task    QPS baseline      StdDev    QPS my_modified_version      StdDev                Pct diff p-value
            BrowseDateSSDVFacets        0.56     (16.1%)        0.50      (1.1%)  -11.0% ( -24% -    7%) 0.337
                    OrHighNotLow      221.92      (9.6%)      198.50     (10.8%)  -10.6% ( -28% -   10%) 0.301
                   OrHighNotHigh       99.49      (8.9%)       90.42      (3.9%)   -9.1% ( -20% -    4%) 0.186
         AndHighMedDayTaxoFacets       27.38      (3.2%)       25.24      (2.4%)   -7.8% ( -13% -   -2%) 0.006
                         MedTerm      352.99      (3.0%)      331.70      (2.7%)   -6.0% ( -11% -    0%) 0.036
               HighTermTitleSort       20.00      (2.1%)       19.13      (2.7%)   -4.3% (  -8% -    0%) 0.076
                       OrHighLow      354.88     (14.1%)      341.84      (7.3%)   -3.7% ( -22% -   20%) 0.744
                           range     1531.41      (3.8%)     1478.39      (5.7%)   -3.5% ( -12% -    6%) 0.476
                HighSloppyPhrase        9.15      (1.0%)        8.88      (1.3%)   -3.0% (  -5% -    0%) 0.009
            MedTermDayTaxoFacets        7.75     (13.9%)        7.58      (9.0%)   -2.2% ( -22% -   24%) 0.852
       BrowseDayOfYearSSDVFacets        2.88     (34.4%)        2.83     (15.4%)   -1.8% ( -38% -   73%) 0.945
                      AndHighLow      532.13      (1.7%)      526.10      (3.4%)   -1.1% (  -6% -    4%) 0.674
                     AndHighHigh       39.57      (6.2%)       39.20      (1.3%)   -0.9% (  -7% -    7%) 0.834
                      OrHighHigh       63.20      (2.8%)       62.78      (2.9%)   -0.7% (  -6% -    5%) 0.814
                        PKLookup      152.77      (2.4%)      152.01      (0.3%)   -0.5% (  -3% -    2%) 0.770
           HighTermDayOfYearSort       70.31      (0.4%)       70.05      (6.7%)   -0.4% (  -7% -    6%) 0.939
                     LowSpanNear       12.96      (2.6%)       12.94      (0.7%)   -0.1% (  -3% -    3%) 0.944
                       LowPhrase       23.84      (1.2%)       23.82      (1.1%)   -0.1% (  -2% -    2%) 0.945
                 LowSloppyPhrase        7.93      (5.2%)        7.94      (0.2%)    0.2% (  -4% -    5%) 0.966
            HighTermTitleBDVSort        9.53     (10.7%)        9.55      (9.3%)    0.2% ( -17% -   22%) 0.985
                    OrHighNotMed      162.09      (4.9%)      162.41      (1.2%)    0.2% (  -5% -    6%) 0.957
                          IntNRQ       48.11      (5.4%)       48.34      (5.0%)    0.5% (  -9% -   11%) 0.928
                        Wildcard       18.14      (0.5%)       18.25      (1.7%)    0.6% (  -1% -    2%) 0.654
           BrowseMonthTaxoFacets        2.28      (0.3%)        2.30      (4.4%)    0.7% (  -3% -    5%) 0.829
                    HighSpanNear        5.88      (3.4%)        5.94      (5.2%)    1.1% (  -7% -   10%) 0.799
                      HighPhrase       11.88      (2.3%)       12.05      (3.8%)    1.5% (  -4% -    7%) 0.638
                          Fuzzy2       42.79      (0.7%)       43.45     (15.6%)    1.6% ( -14% -   17%) 0.888
                         LowTerm      462.46      (2.5%)      471.91      (5.4%)    2.0% (  -5% -   10%) 0.628
                         Prefix3      201.07      (4.2%)      205.62      (0.7%)    2.3% (  -2% -    7%) 0.454
                   OrNotHighHigh      192.75      (1.4%)      197.39      (3.8%)    2.4% (  -2% -    7%) 0.396
             LowIntervalsOrdered       20.77      (3.2%)       21.28      (4.0%)    2.5% (  -4% -    9%) 0.499
                      AndHighMed      125.26      (6.8%)      128.38      (7.2%)    2.5% ( -10% -   17%) 0.722
          OrHighMedDayTaxoFacets        5.56      (3.2%)        5.70      (1.1%)    2.5% (  -1% -    7%) 0.293
                     MedSpanNear       41.21      (2.4%)       42.62      (3.3%)    3.4% (  -2% -    9%) 0.239
        AndHighHighDayTaxoFacets        3.89      (3.3%)        4.03      (0.8%)    3.4% (   0% -    7%) 0.150
            HighIntervalsOrdered        6.43      (0.4%)        6.69      (0.3%)    3.9% (   3% -    4%) 0.000
     BrowseRandomLabelTaxoFacets        1.72      (4.0%)        1.79      (1.9%)    4.1% (  -1% -   10%) 0.192
                      TermDTSort       76.67      (2.1%)       79.95      (3.8%)    4.3% (  -1% -   10%) 0.165
                       MedPhrase       37.46      (4.2%)       39.26      (0.9%)    4.8% (   0% -   10%) 0.112
                       OrHighMed      170.61      (3.8%)      180.07      (8.8%)    5.5% (  -6% -   18%) 0.413
                    OrNotHighLow      406.32      (4.2%)      429.70      (0.8%)    5.8% (   0% -   11%) 0.058
                        HighTerm      277.15      (1.1%)      293.83      (5.3%)    6.0% (   0% -   12%) 0.118
               HighTermMonthSort      440.76      (6.9%)      474.15      (1.8%)    7.6% (  -1% -   17%) 0.131
                          Fuzzy1       30.28      (8.7%)       32.72     (20.1%)    8.0% ( -19% -   40%) 0.604
                 MedSloppyPhrase       48.44      (5.4%)       52.45      (0.4%)    8.3% (   2% -   14%) 0.030
            BrowseDateTaxoFacets        2.14     (14.9%)        2.32      (9.4%)    8.5% ( -13% -   38%) 0.495
       BrowseDayOfYearTaxoFacets        1.98      (2.2%)        2.15     (13.6%)    8.7% (  -6% -   24%) 0.375
             MedIntervalsOrdered        1.63      (2.2%)        1.79      (2.1%)    9.8% (   5% -   14%) 0.000
                         Respell       27.76      (8.9%)       30.85      (6.1%)   11.1% (  -3% -   28%) 0.144
                          IntSet      262.93      (6.0%)      292.36      (0.3%)   11.2% (   4% -   18%) 0.009
                    OrNotHighMed      134.43     (20.1%)      149.87      (9.4%)   11.5% ( -15% -   51%) 0.465
     BrowseRandomLabelSSDVFacets        1.84     (12.0%)        2.07      (0.2%)   12.4% (   0% -   28%) 0.144
           BrowseMonthSSDVFacets        3.06     (21.2%)        3.98     (71.7%)   30.0% ( -51% -  155%) 0.570

My goal is to improve PointRangeQuery performance when intra-segment search is enabled, as part of stabilizing the intra-segment work, by eliminating duplicated per-segment work across the partitions of a segment.

Nov 25 '25 14:11 prudhvigodithi

> Thanks! Below are the results without enabling intra-segment search (on both lucene_candidate and lucene_baseline), which reflects the current behavior on main.

That's not what I meant. I wanted this PR with intra-segment search enabled compared against current main, in order to answer the question: what are the benefits of using this over current main?

Nov 25 '25 16:11 iverase

Enable intra segment on candidate, baseline using `main` disabled intra segment (the following run is without this PR optimization)

                            Task    QPS baseline      StdDev    QPS my_modified_version      StdDev                Pct diff p-value
                          IntNRQ       58.91      (1.9%)       13.90      (0.3%)  -76.4% ( -77% -  -75%) 0.000
                        Wildcard       43.88      (4.0%)       10.57      (1.0%)  -75.9% ( -77% -  -73%) 0.000
                         Prefix3      281.97      (7.8%)       76.48      (1.1%)  -72.9% ( -75% -  -69%) 0.000
                    OrHighNotMed      311.22      (6.4%)      110.80      (2.9%)  -64.4% ( -69% -  -58%) 0.000
                   OrHighNotHigh      148.47     (10.9%)       53.37      (2.5%)  -64.1% ( -69% -  -56%) 0.000
               HighTermMonthSort      519.10      (5.2%)      196.53      (1.8%)  -62.1% ( -65% -  -58%) 0.000
                      HighPhrase       19.04      (2.7%)        8.24      (1.6%)  -56.7% ( -59% -  -53%) 0.000
           HighTermDayOfYearSort       76.20      (4.7%)       33.62      (1.3%)  -55.9% ( -59% -  -52%) 0.000
                        HighTerm      342.27     (11.9%)      159.15      (9.2%)  -53.5% ( -66% -  -36%) 0.000
                          IntSet      242.49      (6.3%)      138.24      (1.2%)  -43.0% ( -47% -  -37%) 0.000
                         LowTerm      438.69     (15.4%)      265.03     (12.7%)  -39.6% ( -58% -  -13%) 0.000
               HighTermTitleSort       20.93      (3.3%)       12.79      (0.6%)  -38.9% ( -41% -  -36%) 0.000
                   OrNotHighHigh      237.69     (11.0%)      151.04      (2.9%)  -36.5% ( -45% -  -25%) 0.000
                         MedTerm      456.48     (11.3%)      313.47     (11.4%)  -31.3% ( -48% -   -9%) 0.000
                      TermDTSort       43.22      (6.8%)       31.36      (1.6%)  -27.5% ( -33% -  -20%) 0.000
                       OrHighMed      193.23      (9.7%)      147.57      (7.6%)  -23.6% ( -37% -   -7%) 0.000
                    OrNotHighMed      146.14      (6.9%)      113.02      (4.3%)  -22.7% ( -31% -  -12%) 0.000
                          Fuzzy1       48.10     (10.0%)       38.63      (5.6%)  -19.7% ( -32% -   -4%) 0.000
                       MedPhrase      113.33      (2.2%)       92.06      (2.2%)  -18.8% ( -22% -  -14%) 0.000
                    OrNotHighLow      547.98      (3.6%)      446.60      (3.9%)  -18.5% ( -25% -  -11%) 0.000
                      AndHighLow      535.11      (5.4%)      440.58      (3.0%)  -17.7% ( -24% -   -9%) 0.000
                       OrHighLow      414.70      (5.7%)      342.19      (4.4%)  -17.5% ( -26% -   -7%) 0.000
         AndHighMedDayTaxoFacets       15.84      (4.7%)       13.24      (4.5%)  -16.4% ( -24% -   -7%) 0.000
        AndHighHighDayTaxoFacets       16.50      (5.5%)       13.92      (1.0%)  -15.6% ( -20% -   -9%) 0.000
                      OrHighHigh       73.41     (13.6%)       62.44     (14.1%)  -14.9% ( -37% -   14%) 0.087
            BrowseDateSSDVFacets        0.62     (10.2%)        0.54     (10.0%)  -13.5% ( -30% -    7%) 0.035
            MedTermDayTaxoFacets       13.08      (6.8%)       11.35      (4.9%)  -13.2% ( -23% -   -1%) 0.000
            BrowseDateTaxoFacets        2.31     (21.7%)        2.02      (4.8%)  -12.8% ( -32% -   17%) 0.199
                           range     3310.16      (2.1%)     2956.59      (6.0%)  -10.7% ( -18% -   -2%) 0.000
                 MedSloppyPhrase       33.09      (1.7%)       30.04      (3.1%)   -9.2% ( -13% -   -4%) 0.000
                       LowPhrase       19.78      (2.4%)       18.15      (1.3%)   -8.2% ( -11% -   -4%) 0.000
                HighSloppyPhrase        6.08      (3.3%)        5.58      (2.3%)   -8.1% ( -13% -   -2%) 0.000
          OrHighMedDayTaxoFacets        5.88      (7.5%)        5.41      (4.9%)   -8.1% ( -19% -    4%) 0.044
       BrowseDayOfYearTaxoFacets        2.29     (17.9%)        2.13     (12.1%)   -6.8% ( -31% -   28%) 0.479
                         Respell       24.52     (11.1%)       22.91     (14.1%)   -6.6% ( -28% -   20%) 0.410
                     AndHighHigh       72.56      (9.5%)       68.81     (13.6%)   -5.2% ( -25% -   19%) 0.486
     BrowseRandomLabelTaxoFacets        1.80      (4.2%)        1.71      (5.4%)   -5.1% ( -14% -    4%) 0.100
             LowIntervalsOrdered       42.00      (2.0%)       40.00      (4.6%)   -4.8% ( -11% -    1%) 0.034
            HighTermTitleBDVSort       11.16      (5.0%)       10.64      (4.5%)   -4.6% ( -13% -    5%) 0.123
            HighIntervalsOrdered       10.10      (6.3%)        9.76      (6.6%)   -3.4% ( -15% -   10%) 0.411
                          Fuzzy2       35.54      (4.9%)       34.46      (7.0%)   -3.0% ( -14% -    9%) 0.429
                    OrHighNotLow      371.12      (9.4%)      363.35      (8.9%)   -2.1% ( -18% -   17%) 0.717
                     MedSpanNear      109.73      (1.9%)      107.88      (2.9%)   -1.7% (  -6% -    3%) 0.281
                    HighSpanNear        5.56      (3.9%)        5.68      (3.8%)    2.3% (  -5% -   10%) 0.353
           BrowseMonthTaxoFacets        2.21     (11.4%)        2.27      (5.3%)    2.5% ( -12% -   21%) 0.651
     BrowseRandomLabelSSDVFacets        1.97      (9.5%)        2.04     (13.3%)    3.4% ( -17% -   28%) 0.639
                     LowSpanNear        6.17      (4.4%)        6.42      (4.9%)    4.2% (  -4% -   14%) 0.153
             MedIntervalsOrdered       13.25      (4.4%)       14.03      (5.5%)    5.9% (  -3% -   16%) 0.060
                        PKLookup      133.54     (12.5%)      144.12      (5.0%)    7.9% (  -8% -   29%) 0.188
       BrowseDayOfYearSSDVFacets        2.66     (14.4%)        2.87     (19.8%)    8.1% ( -22% -   49%) 0.460
           BrowseMonthSSDVFacets        2.97     (11.2%)        3.26     (17.0%)    9.6% ( -16% -   42%) 0.292
                 LowSloppyPhrase       14.93      (5.2%)       16.68      (3.9%)   11.8% (   2% -   21%) 0.000
                      AndHighMed      157.26      (8.8%)      183.92      (4.6%)   17.0% (   3% -   33%) 0.000

Enable intra segment on candidate, baseline using `main` disabled intra segment (the following run is with this PR optimization)

                            Task    QPS baseline      StdDev    QPS my_modified_version      StdDev                Pct diff p-value
                         Prefix3       55.06      (0.0%)       12.45      (0.0%)  -77.4% ( -77% -  -77%) 1.000
                        Wildcard       80.75      (0.0%)       20.50      (0.0%)  -74.6% ( -74% -  -74%) 1.000
                   OrHighNotHigh      177.12      (0.0%)       51.88      (0.0%)  -70.7% ( -70% -  -70%) 1.000
                      TermDTSort       83.40      (0.0%)       28.37      (0.0%)  -66.0% ( -65% -  -65%) 1.000
           HighTermDayOfYearSort       79.30      (0.0%)       27.37      (0.0%)  -65.5% ( -65% -  -65%) 1.000
               HighTermMonthSort      518.20      (0.0%)      204.98      (0.0%)  -60.4% ( -60% -  -60%) 1.000
               HighTermTitleSort       21.02      (0.0%)        8.51      (0.0%)  -59.5% ( -59% -  -59%) 1.000
                   OrNotHighHigh       85.37      (0.0%)       35.75      (0.0%)  -58.1% ( -58% -  -58%) 1.000
                        HighTerm      369.78      (0.0%)      182.48      (0.0%)  -50.7% ( -50% -  -50%) 1.000
                    OrHighNotMed      159.12      (0.0%)       86.82      (0.0%)  -45.4% ( -45% -  -45%) 1.000
                         MedTerm      417.55      (0.0%)      235.22      (0.0%)  -43.7% ( -43% -  -43%) 1.000
                          IntNRQ       50.11      (0.0%)       29.95      (0.0%)  -40.2% ( -40% -  -40%) 1.000
                         LowTerm      876.98      (0.0%)      531.99      (0.0%)  -39.3% ( -39% -  -39%) 1.000
                          IntSet      217.87      (0.0%)      134.06      (0.0%)  -38.5% ( -38% -  -38%) 1.000
                    OrNotHighMed      109.80      (0.0%)       75.63      (0.0%)  -31.1% ( -31% -  -31%) 1.000
                       MedPhrase       74.25      (0.0%)       53.28      (0.0%)  -28.2% ( -28% -  -28%) 1.000
                    OrHighNotLow      423.90      (0.0%)      314.32      (0.0%)  -25.8% ( -25% -  -25%) 1.000
                          Fuzzy2       46.24      (0.0%)       36.17      (0.0%)  -21.8% ( -21% -  -21%) 1.000
                      AndHighMed      161.59      (0.0%)      132.41      (0.0%)  -18.1% ( -18% -  -18%) 1.000
            MedTermDayTaxoFacets       14.40      (0.0%)       12.07      (0.0%)  -16.2% ( -16% -  -16%) 1.000
                       OrHighLow      392.75      (0.0%)      330.70      (0.0%)  -15.8% ( -15% -  -15%) 1.000
                    OrNotHighLow      560.04      (0.0%)      471.98      (0.0%)  -15.7% ( -15% -  -15%) 1.000
                       LowPhrase       55.39      (0.0%)       47.97      (0.0%)  -13.4% ( -13% -  -13%) 1.000
     BrowseRandomLabelSSDVFacets        2.36      (0.0%)        2.07      (0.0%)  -12.1% ( -12% -  -12%) 1.000
       BrowseDayOfYearTaxoFacets        2.33      (0.0%)        2.06      (0.0%)  -11.7% ( -11% -  -11%) 1.000
                       OrHighMed      192.51      (0.0%)      170.16      (0.0%)  -11.6% ( -11% -  -11%) 1.000
        AndHighHighDayTaxoFacets        8.70      (0.0%)        7.71      (0.0%)  -11.4% ( -11% -  -11%) 1.000
                     MedSpanNear      137.19      (0.0%)      122.43      (0.0%)  -10.8% ( -10% -  -10%) 1.000
                          Fuzzy1       35.18      (0.0%)       31.72      (0.0%)   -9.9% (  -9% -   -9%) 1.000
         AndHighMedDayTaxoFacets       19.99      (0.0%)       18.07      (0.0%)   -9.6% (  -9% -   -9%) 1.000
                      AndHighLow      475.14      (0.0%)      444.08      (0.0%)   -6.5% (  -6% -   -6%) 1.000
          OrHighMedDayTaxoFacets        2.36      (0.0%)        2.25      (0.0%)   -4.6% (  -4% -   -4%) 1.000
                HighSloppyPhrase        5.86      (0.0%)        5.65      (0.0%)   -3.5% (  -3% -   -3%) 1.000
                 MedSloppyPhrase       51.54      (0.0%)       49.88      (0.0%)   -3.2% (  -3% -   -3%) 1.000
            BrowseDateSSDVFacets        0.49      (0.0%)        0.48      (0.0%)   -2.7% (  -2% -   -2%) 1.000
                     AndHighHigh       69.88      (0.0%)       68.22      (0.0%)   -2.4% (  -2% -   -2%) 1.000
                      OrHighHigh       46.76      (0.0%)       45.70      (0.0%)   -2.3% (  -2% -   -2%) 1.000
                        PKLookup      155.07      (0.0%)      151.70      (0.0%)   -2.2% (  -2% -   -2%) 1.000
             MedIntervalsOrdered        9.78      (0.0%)        9.71      (0.0%)   -0.7% (   0% -    0%) 1.000
                 LowSloppyPhrase       12.42      (0.0%)       12.58      (0.0%)    1.3% (   1% -    1%) 1.000
             LowIntervalsOrdered        9.69      (0.0%)        9.82      (0.0%)    1.3% (   1% -    1%) 1.000
           BrowseMonthTaxoFacets        2.26      (0.0%)        2.30      (0.0%)    1.6% (   1% -    1%) 1.000
            HighIntervalsOrdered        7.66      (0.0%)        7.91      (0.0%)    3.2% (   3% -    3%) 1.000
                    HighSpanNear        3.84      (0.0%)        3.98      (0.0%)    3.6% (   3% -    3%) 1.000
     BrowseRandomLabelTaxoFacets        1.72      (0.0%)        1.79      (0.0%)    3.9% (   3% -    3%) 1.000
                           range     2895.37      (0.0%)     3030.59      (0.0%)    4.7% (   4% -    4%) 1.000
            BrowseDateTaxoFacets        2.03      (0.0%)        2.18      (0.0%)    7.4% (   7% -    7%) 1.000
            HighTermTitleBDVSort        5.69      (0.0%)        6.19      (0.0%)    8.7% (   8% -    8%) 1.000
                      HighPhrase      114.97      (0.0%)      127.61      (0.0%)   11.0% (  11% -   11%) 1.000
                     LowSpanNear        8.80      (0.0%)        9.84      (0.0%)   11.9% (  11% -   11%) 1.000
       BrowseDayOfYearSSDVFacets        3.07      (0.0%)        3.56      (0.0%)   16.2% (  16% -   16%) 1.000
           BrowseMonthSSDVFacets        2.85      (0.0%)        3.37      (0.0%)   18.4% (  18% -   18%) 1.000
                         Respell       17.39      (0.0%)       24.42      (0.0%)   40.4% (  40% -   40%) 1.000

Enable intra segment on baseline and candidate and using this PR optimization

Task names for the rows after Fuzzy2, in order: Prefix3, LowTerm, BrowseMonthTaxoFacets, BrowseMonthSSDVFacets, OrHighMedDayTaxoFacets, BrowseDayOfYearTaxoFacets, Respell, HighPhrase, BrowseDateTaxoFacets, range, Fuzzy1, BrowseDateSSDVFacets, IntNRQ

                            Task    QPS baseline      StdDev    QPS my_modified_version      StdDev                Pct diff p-value
                        PKLookup      153.07      (0.2%)      111.11      (4.4%)  -27.4% ( -31% -  -22%) 0.000
            HighTermTitleBDVSort        8.90      (3.4%)        8.30      (3.7%)   -6.7% ( -13% -    0%) 0.056
         AndHighMedDayTaxoFacets       65.36      (4.8%)       61.21      (0.8%)   -6.3% ( -11% -    0%) 0.066
            HighIntervalsOrdered        4.95      (4.7%)        4.64      (0.3%)   -6.2% ( -10% -   -1%) 0.065
                        Wildcard      149.60      (2.6%)      140.62      (1.4%)   -6.0% (  -9% -   -2%) 0.004
             MedIntervalsOrdered       27.11     (10.0%)       25.56      (0.2%)   -5.7% ( -14% -    4%) 0.417
     BrowseRandomLabelTaxoFacets        1.76      (7.1%)        1.67      (1.7%)   -5.2% ( -13% -    3%) 0.310
                       OrHighLow      177.49      (3.9%)      168.38      (1.7%)   -5.1% ( -10% -    0%) 0.089
                    OrHighNotMed      125.49      (4.9%)      119.42      (4.2%)   -4.8% ( -13% -    4%) 0.292
                     LowSpanNear       37.85      (2.4%)       36.31      (1.7%)   -4.1% (  -7% -    0%) 0.047
            MedTermDayTaxoFacets       12.65      (3.1%)       12.23      (8.0%)   -3.3% ( -13% -    8%) 0.584
     BrowseRandomLabelSSDVFacets        2.18     (26.1%)        2.12     (23.9%)   -3.0% ( -42% -   63%) 0.905
                    OrNotHighLow      575.10      (1.6%)      559.18      (1.6%)   -2.8% (  -5% -    0%) 0.079
             LowIntervalsOrdered       17.79      (5.3%)       17.33      (0.0%)   -2.6% (  -7% -    2%) 0.493
                    HighSpanNear       10.59      (1.2%)       10.34      (0.6%)   -2.3% (  -4% -    0%) 0.016
                 LowSloppyPhrase        7.31      (0.4%)        7.16      (1.5%)   -2.0% (  -3% -    0%) 0.062
                       MedPhrase       26.16      (2.7%)       25.65      (1.8%)   -2.0% (  -6% -    2%) 0.384
           HighTermDayOfYearSort       24.80      (0.7%)       24.38      (1.1%)   -1.7% (  -3% -    0%) 0.059
                         MedTerm      385.18      (0.8%)      379.60      (2.2%)   -1.4% (  -4% -    1%) 0.392
                        HighTerm      246.26      (0.4%)      242.70      (0.6%)   -1.4% (  -2% -    0%) 0.003
                    OrNotHighMed       57.72      (5.0%)       56.91      (4.4%)   -1.4% ( -10% -    8%) 0.767
                    OrHighNotLow      179.56      (8.6%)      177.42      (1.5%)   -1.2% ( -10% -    9%) 0.847
                     MedSpanNear       18.34      (3.3%)       18.17      (1.2%)   -0.9% (  -5% -    3%) 0.712
                       OrHighMed      159.74      (1.0%)      158.67      (0.0%)   -0.7% (  -1% -    0%) 0.319
                   OrNotHighHigh      103.55      (3.7%)      102.95      (5.3%)   -0.6% (  -9% -    8%) 0.899
        AndHighHighDayTaxoFacets        3.47      (1.7%)        3.46      (0.2%)   -0.3% (  -2% -    1%) 0.794
                 MedSloppyPhrase        9.25      (0.4%)        9.23      (3.6%)   -0.2% (  -4% -    3%) 0.926
               HighTermTitleSort       10.68      (6.1%)       10.68      (1.2%)   -0.1% (  -6% -    7%) 0.988
                          IntSet      143.67      (1.7%)      143.65      (2.4%)   -0.0% (  -4% -    4%) 0.996
                      AndHighMed      191.95      (2.3%)      192.16      (1.5%)    0.1% (  -3% -    4%) 0.955
                      AndHighLow      534.50      (3.1%)      535.67      (2.7%)    0.2% (  -5% -    6%) 0.940
       BrowseDayOfYearSSDVFacets        2.88     (12.0%)        2.89     (25.2%)    0.5% ( -32% -   42%) 0.981
               HighTermMonthSort      202.36      (2.9%)      203.85     (14.0%)    0.7% ( -15% -   18%) 0.942
                      OrHighHigh       58.36      (0.7%)       58.86      (0.6%)    0.8% (   0% -    2%) 0.196
                HighSloppyPhrase       20.06      (0.5%)       20.23      (0.6%)    0.8% (   0% -    1%) 0.131
                   OrHighNotHigh       67.11     (68.0%)       67.78     (57.7%)    1.0% ( -74% -  395%) 0.987
                     AndHighHigh       83.60      (0.8%)       84.53      (1.5%)    1.1% (  -1% -    3%) 0.354
                      TermDTSort       26.40      (2.0%)       26.75      (1.5%)    1.3% (  -2% -    4%) 0.449
                       LowPhrase      134.22      (7.9%)      136.19      (0.4%)    1.5% (  -6% -   10%) 0.794
                          Fuzzy2       29.68      (4.5%)       30.19      (2.5%)    1.7% (  -5% - 
9%) 0.634 54.66 (2.4%) 56.57 (7.9%) 3.5% ( -6% - 14%) 0.548 563.86 (1.0%) 584.85 (3.7%) 3.7% ( 0% - 8%) 0.171 2.06 (0.1%) 2.15 (19.9%) 4.4% ( -15% - 24%) 0.757 2.86 (17.6%) 3.02 (5.9%) 5.5% ( -15% - 35%) 0.673 2.94 (0.7%) 3.11 (10.8%) 5.7% ( -5% - 17%) 0.457 2.12 (1.4%) 2.26 (36.9%) 6.4% ( -31% - 45%) 0.807 26.71 (3.1%) 28.55 (1.4%) 6.9% ( 2% - 11%) 0.004 3.58 (0.3%) 3.83 (12.7%) 7.1% ( -5% - 20%) 0.434 2.23 (9.4%) 2.43 (21.2%) 9.3% ( -19% - 43%) 0.572 range 2598.11 (7.1%) 2860.33 (5.9%) 10.1% ( -2% - 24%) 0.121 37.34 (10.1%) 43.48 (3.4%) 16.5% ( 2% - 33%) 0.029 0.50 (7.1%) 0.65 (3.1%) 30.5% ( 18% - 43%) 0.000 11.37 (1.0%) 27.37 (9.7%) 140.7% ( 128% - 152%) 0.000

Nov 25 '25 19:11 prudhvigodithi

what are the benefits of using this against current main?

Here are some important results https://github.com/apache/lucene/pull/15446#issuecomment-3577215055. I can clearly see that this PR reduces the regression with PointRangeQuery when intra-segment search is enabled, but disabling intra-segment search still gives faster results.

Nov 25 '25 19:11 prudhvigodithi

I think you are focusing on the wrong things. Yes, this change makes intra-segment concurrency suck less, but it still sucks and it is still unusable. It is still 40% slower than the concurrent segment search! We should never ever block search threads.

IMO we should focus on how to search the data in a segment concurrently instead.

Nov 26 '25 06:11 iverase

IMO we should focus on how to search the data in a segment concurrently instead.

With the current BKD setup for PointRangeQuery any thoughts or suggestion on this? FYI, I ran the same experiment for PointInSetQuery and it showed the same results: it made intra-segment concurrency less painful.

Nov 26 '25 13:11 prudhvigodithi

Since the benchmark results are spread across multiple comments (https://github.com/apache/lucene/pull/15446#issuecomment-3577215055, https://github.com/apache/lucene/pull/15446#issuecomment-3576074285, https://github.com/apache/lucene/pull/15446#issue-3657117120), the following is an overall summary for IntNRQ.

IntNRQ Benchmark Results

| Scenario | Baseline QPS | Baseline StdDev | Modified QPS | Modified StdDev | % Diff | p-value |
|---|---|---|---|---|---|---|
| 1. Intra-segment disabled on both (current `main`) | 69.12 | 5.0% | 69.07 | 3.3% | -0.1% (-7% - 8%) | 0.985 |
| 2. Intra-segment: Candidate enabled, baseline disabled (without PR optimization) | 127.75 | 0.2% | 30.92 | 1.1% | -75.8% (-76% - -74%) | 0.000 |
| 3. Intra-segment: Candidate enabled, baseline disabled (with PR optimization) | 50.11 | 0.0% | 29.95 | 0.0% | -40.2% (-40% - -40%) | 1.000 |
| 4. Intra-segment enabled on both (with PR optimization) | 12.30 | 3.8% | 30.18 | 8.2% | +145.3% (128% - 163%) | 0.000 |

Nov 26 '25 13:11 prudhvigodithi
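For readers checking the IntNRQ summary above, the "% Diff" column appears to be the relative change of the modified QPS against the baseline QPS. The sketch below recomputes it from the rounded QPS values in the summary. The class and method names are hypothetical, and because the inputs are already rounded, the last digit can differ slightly from what the benchmark tool reported.

```java
public final class PctDiff {
    /** Relative change of the candidate QPS versus the baseline QPS, in percent. */
    static double pctDiff(double baselineQps, double candidateQps) {
        return (candidateQps - baselineQps) / baselineQps * 100.0;
    }

    public static void main(String[] args) {
        // {baseline QPS, modified QPS} for scenarios 1-4, copied from the summary.
        double[][] rows = {
            {69.12, 69.07},
            {127.75, 30.92},
            {50.11, 29.95},
            {12.30, 30.18},
        };
        for (int i = 0; i < rows.length; i++) {
            System.out.printf("Scenario %d: %+.1f%%%n", i + 1, pctDiff(rows[i][0], rows[i][1]));
        }
    }
}
```

A negative value means the modified run served fewer queries per second than its baseline. Each scenario uses a different baseline configuration, so the percentages are not directly comparable across rows.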

With the current BKD setup for PointRangeQuery any thoughts or suggestion on this?

In my opinion we cannot achieve it with the current segment layout. We need to evolve lucene segments / data structures so they can be searched concurrently.

Nov 26 '25 15:11 iverase

In my opinion we cannot achieve it with the current segment layout. We need to evolve lucene segments / data structures so they can be searched concurrently.

I will look again for the bottleneck in my current PR to further improve PointRangeQuery with intra-segment search. In that case, should we continue to iterate and merge this PR, as it's definitely better than the current state?

Nov 26 '25 16:11 prudhvigodithi

In my opinion we should not merge this PR, sorry. It makes no sense to have all this complexity that provides no benefit. If someone else think otherwise I am not going to block it but I don't like this approach.

Nov 26 '25 16:11 iverase

It makes no sense to have all this complexity that provides no benefit.

Thanks for your overall feedback; I'm open to discussion and thoughts.

I don't fully agree that it provides no benefit, as the existing PointRangeQuery implementation on main has regressions with intra-segment search enabled https://github.com/apache/lucene/pull/13542#issuecomment-2332114836. Overall, the intra-segment concept is genuinely useful. I'd like to build on it and bring similar gains to PointRangeQuery as well.

Nov 26 '25 16:11 prudhvigodithi

The effort isn't lost - at least we know the benchmarks. I also am not particularly fond of this concurrent cache and blocking approach.

Nov 26 '25 17:11 dweiss
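To make the "concurrent cache and blocking" concern in the comments above concrete, here is a minimal JDK-only sketch of the general pattern: several partition threads share one lazily computed per-segment result. It is not the PR's code. `traverseSegment`, the string segment key, and the `int[]` result are hypothetical stand-ins for the BKD traversal, `LeafReaderContext`, and `DocIdSet`.

```java
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public final class SharedSegmentCacheSketch {
    // Counts how many times the expensive traversal actually ran.
    static final AtomicInteger traversals = new AtomicInteger();

    // One entry per segment, shared by all partitions of that segment.
    static final ConcurrentHashMap<String, int[]> cache = new ConcurrentHashMap<>();

    // Hypothetical stand-in for a full-segment traversal: returns sorted matching doc IDs.
    static int[] traverseSegment(String segment) {
        traversals.incrementAndGet();
        return new int[] {1, 4, 7, 12, 15, 18};
    }

    // Returns the segment's matches restricted to the partition range [minDoc, maxDoc).
    static int[] matchesForPartition(String segment, int minDoc, int maxDoc) {
        // computeIfAbsent applies the function at most once per key; other threads
        // asking for the same key may block until that computation finishes.
        int[] all = cache.computeIfAbsent(segment, SharedSegmentCacheSketch::traverseSegment);
        return Arrays.stream(all).filter(d -> d >= minDoc && d < maxDoc).toArray();
    }

    public static void main(String[] args) throws InterruptedException {
        int[][] out = new int[2][];
        Thread t1 = new Thread(() -> out[0] = matchesForPartition("seg0", 0, 10));
        Thread t2 = new Thread(() -> out[1] = matchesForPartition("seg0", 10, 20));
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(Arrays.toString(out[0]) + " " + Arrays.toString(out[1])
            + " traversals=" + traversals.get());
    }
}
```

The traversal runs once, which removes the duplicated work. The cost is that whichever partition thread arrives second waits for the first to finish, so the traversal itself is still effectively single-threaded. That waiting is the blocking the reviewers object to.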

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the dev@lucene.apache.org list. Thank you for your contribution!

Dec 12 '25 00:12 github-actions[bot]
