Remove HitsThresholdChecker.
Our top-docs collectors have a dependency on HitsThresholdChecker, which is essentially a shared counter that is incremented until it reaches the total hits threshold, when the scorer can start dynamically pruning hits.
A consequence of this removal is that dynamic pruning may start later, as soon as:
- either the current slice collected
totalHitsThresholdhits, - or another slice collected
totalHitsThresholdhits and the current slice collected enough hits (up to 1,024) to check the sharedMaxScoreAccumulator.
So in short, it exchanges a bit more work globally in favor of a bit less contention. A longer-term goal of mine is to stop specializing our CollectorManagers based on whether they are going to be used concurrently or not.
EDIT: at the time of this comment, only TopScoreDocCollector no longer used HitsThresholdChecker.
wikibigall with a searchConcurrency of 8 suggests that the slowdown is tiny:
TaskQPS baseline StdDevQPS my_modified_version StdDev Pct diff p-value
AndHighLow 849.71 (3.3%) 826.18 (2.2%) -2.8% ( -7% - 2%) 0.002
HighTermDayOfYearSort 253.62 (3.2%) 246.72 (3.0%) -2.7% ( -8% - 3%) 0.005
TermDTSort 202.82 (3.4%) 198.21 (4.4%) -2.3% ( -9% - 5%) 0.069
HighTermTitleBDVSort 49.87 (5.1%) 49.01 (5.5%) -1.7% ( -11% - 9%) 0.306
OrHighRare 245.78 (8.9%) 242.01 (9.2%) -1.5% ( -17% - 18%) 0.591
And2Terms2StopWords 184.35 (5.3%) 181.84 (4.5%) -1.4% ( -10% - 8%) 0.379
AndHighHigh 102.63 (6.9%) 101.25 (5.9%) -1.3% ( -13% - 12%) 0.507
CountTerm 8520.99 (3.9%) 8417.05 (5.1%) -1.2% ( -9% - 8%) 0.396
Wildcard 112.90 (5.6%) 111.56 (4.8%) -1.2% ( -10% - 9%) 0.471
Fuzzy1 79.74 (1.7%) 79.07 (1.7%) -0.8% ( -4% - 2%) 0.114
OrHighLow 724.26 (2.3%) 719.70 (2.2%) -0.6% ( -5% - 3%) 0.377
Or2Terms2StopWords 192.42 (5.4%) 191.24 (4.0%) -0.6% ( -9% - 9%) 0.680
And3Terms 176.74 (5.3%) 175.76 (4.1%) -0.6% ( -9% - 9%) 0.712
CountOrHighHigh 88.76 (4.9%) 88.34 (4.4%) -0.5% ( -9% - 9%) 0.744
HighTermMonthSort 1066.76 (1.6%) 1062.79 (1.9%) -0.4% ( -3% - 3%) 0.506
Fuzzy2 75.06 (1.4%) 74.85 (1.8%) -0.3% ( -3% - 3%) 0.597
CountOrHighMed 137.67 (5.3%) 137.45 (4.4%) -0.2% ( -9% - 10%) 0.920
OrHighNotHigh 196.77 (3.3%) 196.66 (3.5%) -0.1% ( -6% - 6%) 0.959
HighTermTitleSort 70.08 (6.6%) 70.09 (5.7%) 0.0% ( -11% - 13%) 0.994
OrHighHigh 94.07 (4.8%) 94.12 (5.1%) 0.0% ( -9% - 10%) 0.975
AndHighMed 182.18 (4.1%) 182.43 (3.4%) 0.1% ( -7% - 8%) 0.909
OrNotHighMed 255.11 (2.8%) 255.50 (3.3%) 0.2% ( -5% - 6%) 0.874
OrHighMed 242.11 (2.4%) 242.65 (2.5%) 0.2% ( -4% - 5%) 0.772
OrNotHighHigh 235.61 (2.3%) 236.26 (3.4%) 0.3% ( -5% - 6%) 0.766
HighTerm 361.55 (2.6%) 362.84 (2.7%) 0.4% ( -4% - 5%) 0.669
MedTerm 453.24 (2.8%) 455.07 (2.5%) 0.4% ( -4% - 5%) 0.628
OrHighNotMed 317.00 (3.0%) 318.40 (3.8%) 0.4% ( -6% - 7%) 0.680
PKLookup 277.48 (2.2%) 278.76 (2.7%) 0.5% ( -4% - 5%) 0.558
OrMany 46.17 (2.3%) 46.41 (2.9%) 0.5% ( -4% - 5%) 0.520
Prefix3 68.55 (4.0%) 69.01 (4.7%) 0.7% ( -7% - 9%) 0.627
OrHighNotLow 336.53 (3.2%) 339.73 (3.9%) 1.0% ( -5% - 8%) 0.395
AndStopWords 64.81 (5.3%) 65.46 (5.3%) 1.0% ( -9% - 12%) 0.543
LowTerm 640.08 (3.1%) 647.88 (2.6%) 1.2% ( -4% - 7%) 0.176
CountAndHighHigh 74.37 (5.2%) 75.37 (5.6%) 1.3% ( -8% - 12%) 0.426
CountAndHighMed 161.25 (5.1%) 163.59 (5.7%) 1.4% ( -8% - 12%) 0.394
OrNotHighLow 865.87 (3.3%) 880.11 (2.7%) 1.6% ( -4% - 7%) 0.081
Or3Terms 175.34 (4.3%) 178.28 (4.9%) 1.7% ( -7% - 11%) 0.252
OrStopWords 69.26 (6.5%) 70.71 (6.4%) 2.1% ( -10% - 15%) 0.303
IntNRQ 166.81 (5.5%) 170.67 (10.5%) 2.3% ( -12% - 19%) 0.381
Actually, while I was at it, I also removed TopFieldCollector's dependency on HitsThresholdChecker, and then removed HitsThresholdChecker.
TaskQPS baseline StdDevQPS my_modified_version StdDev Pct diff p-value
And2Terms2StopWords 189.12 (4.5%) 182.81 (4.2%) -3.3% ( -11% - 5%) 0.015
AndStopWords 67.12 (4.6%) 66.00 (3.8%) -1.7% ( -9% - 7%) 0.214
OrStopWords 70.79 (6.9%) 69.97 (7.2%) -1.1% ( -14% - 13%) 0.607
CountTerm 8165.89 (5.6%) 8090.18 (3.8%) -0.9% ( -9% - 8%) 0.538
Wildcard 133.39 (4.1%) 132.24 (3.3%) -0.9% ( -7% - 6%) 0.468
Or2Terms2StopWords 194.66 (5.4%) 193.42 (4.1%) -0.6% ( -9% - 9%) 0.673
OrHighMed 198.12 (4.1%) 197.08 (3.5%) -0.5% ( -7% - 7%) 0.663
And3Terms 180.61 (4.0%) 179.74 (3.9%) -0.5% ( -8% - 7%) 0.700
CountAndHighMed 163.62 (4.5%) 163.28 (3.5%) -0.2% ( -7% - 8%) 0.872
Fuzzy1 78.71 (2.3%) 78.63 (1.8%) -0.1% ( -4% - 4%) 0.881
AndHighMed 160.23 (4.9%) 160.16 (4.4%) -0.0% ( -8% - 9%) 0.974
OrHighHigh 103.71 (4.1%) 103.68 (4.2%) -0.0% ( -8% - 8%) 0.981
OrMany 46.85 (2.4%) 46.85 (2.7%) -0.0% ( -5% - 5%) 0.996
CountOrHighMed 139.92 (4.2%) 139.96 (4.7%) 0.0% ( -8% - 9%) 0.984
PKLookup 278.34 (2.4%) 278.47 (2.5%) 0.0% ( -4% - 5%) 0.950
Or3Terms 176.21 (3.8%) 176.43 (2.9%) 0.1% ( -6% - 7%) 0.904
Fuzzy2 74.40 (2.5%) 74.65 (2.0%) 0.3% ( -4% - 4%) 0.647
CountAndHighHigh 76.54 (4.8%) 76.82 (4.1%) 0.4% ( -8% - 9%) 0.798
AndHighHigh 96.45 (5.3%) 96.82 (4.3%) 0.4% ( -8% - 10%) 0.800
OrNotHighHigh 200.33 (3.1%) 201.39 (4.6%) 0.5% ( -6% - 8%) 0.667
AndHighLow 1013.59 (2.4%) 1019.06 (2.4%) 0.5% ( -4% - 5%) 0.480
HighTermTitleSort 72.60 (3.3%) 73.08 (3.5%) 0.7% ( -5% - 7%) 0.540
Prefix3 191.90 (5.3%) 193.31 (4.5%) 0.7% ( -8% - 11%) 0.635
OrHighNotHigh 198.74 (2.8%) 200.40 (4.1%) 0.8% ( -5% - 7%) 0.452
IntNRQ 125.30 (11.9%) 126.54 (9.9%) 1.0% ( -18% - 25%) 0.776
OrHighNotMed 281.15 (2.4%) 284.02 (5.0%) 1.0% ( -6% - 8%) 0.407
HighTermDayOfYearSort 255.46 (3.2%) 258.19 (3.8%) 1.1% ( -5% - 8%) 0.333
CountOrHighHigh 90.76 (5.7%) 91.74 (5.1%) 1.1% ( -9% - 12%) 0.529
OrNotHighMed 326.08 (3.7%) 329.84 (4.8%) 1.2% ( -7% - 10%) 0.396
OrHighLow 786.85 (1.8%) 797.01 (2.7%) 1.3% ( -3% - 5%) 0.076
OrHighNotLow 368.40 (2.5%) 373.79 (5.3%) 1.5% ( -6% - 9%) 0.265
HighTermTitleBDVSort 57.32 (4.3%) 58.16 (4.0%) 1.5% ( -6% - 10%) 0.268
OrNotHighLow 1096.17 (3.0%) 1114.86 (3.0%) 1.7% ( -4% - 7%) 0.071
HighTerm 338.49 (2.5%) 347.01 (2.5%) 2.5% ( -2% - 7%) 0.001
LowTerm 663.36 (3.3%) 682.20 (3.4%) 2.8% ( -3% - 9%) 0.008
OrHighRare 246.60 (6.2%) 253.61 (7.7%) 2.8% ( -10% - 17%) 0.198
MedTerm 433.32 (3.0%) 445.99 (3.1%) 2.9% ( -3% - 9%) 0.003
TermDTSort 196.52 (3.4%) 219.52 (4.9%) 11.7% ( 3% - 20%) 0.000
HighTermMonthSort 1082.56 (2.7%) 1404.25 (3.4%) 29.7% ( 22% - 36%) 0.000
Queries sorted by field seem to benefit from the lesser synchronization across search threads.
Wow, it's been a bigger speedup on nightly benchmarks than on my machine: https://benchmarks.mikemccandless.com/And3Terms.html.