lucene LUCENE-10366: Override #readVInt and #readVLong for ByteBufferDataInput to avoid the abstraction confusion of #readByte.

https://issues.apache.org/jira/browse/LUCENE-10366 #11402

Today, we do not rewrite #readVInt and #readVLong for ByteBufferIndexInput. By default, the logic will call #readByte several times, and we need to check whether ByteBuffer is valid every time. This may not be necessary as we just need a final check.

                            TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
       BrowseDayOfYearSSDVFacets       16.74     (17.3%)       15.91     (12.3%)   -5.0% ( -29% -   29%) 0.295
            MedTermDayTaxoFacets       27.01      (6.9%)       26.56      (5.9%)   -1.7% ( -13% -   11%) 0.402
                        Wildcard      111.55      (8.1%)      109.67      (7.6%)   -1.7% ( -16% -   15%) 0.499
                         Respell       58.06      (2.6%)       57.20      (2.6%)   -1.5% (  -6% -    3%) 0.074
          OrHighMedDayTaxoFacets        8.91      (4.7%)        8.81      (7.2%)   -1.1% ( -12% -   11%) 0.557
                          Fuzzy1      117.17      (3.8%)      116.14      (3.3%)   -0.9% (  -7% -    6%) 0.437
                          Fuzzy2      103.70      (3.2%)      102.82      (4.3%)   -0.9% (  -8% -    6%) 0.472
            HighIntervalsOrdered       10.11      (7.9%)       10.05      (7.4%)   -0.6% ( -14% -   15%) 0.797
           HighTermDayOfYearSort      183.18      (8.8%)      182.92     (10.8%)   -0.1% ( -18% -   21%) 0.964
        AndHighHighDayTaxoFacets       11.44      (3.8%)       11.43      (3.1%)   -0.1% (  -6% -    7%) 0.936
                         Prefix3      161.90     (13.5%)      161.80     (13.3%)   -0.1% ( -23% -   30%) 0.989
                    HighSpanNear       11.43      (4.8%)       11.45      (4.2%)    0.1% (  -8% -    9%) 0.928
                        PKLookup      220.15      (3.3%)      220.69      (6.2%)    0.2% (  -8% -   10%) 0.874
                     MedSpanNear       92.60      (4.0%)       93.11      (3.7%)    0.5% (  -6% -    8%) 0.656
                      TermDTSort      143.26      (9.0%)      144.14     (10.9%)    0.6% ( -17% -   22%) 0.847
             MedIntervalsOrdered       63.74      (6.6%)       64.21      (6.1%)    0.8% ( -11% -   14%) 0.707
            HighTermTitleBDVSort       99.61      (9.1%)      100.49     (12.4%)    0.9% ( -18% -   24%) 0.796
                     LowSpanNear      126.43      (3.6%)      127.61      (3.2%)    0.9% (  -5% -    8%) 0.383
             LowIntervalsOrdered       12.45      (5.4%)       12.58      (5.2%)    1.0% (  -9% -   12%) 0.535
                         LowTerm     1767.08      (3.7%)     1788.83      (3.1%)    1.2% (  -5% -    8%) 0.257
                HighSloppyPhrase       11.45      (7.0%)       11.61      (7.1%)    1.5% ( -11% -   16%) 0.515
         AndHighMedDayTaxoFacets       69.41      (3.7%)       70.46      (2.8%)    1.5% (  -4% -    8%) 0.147
     BrowseRandomLabelSSDVFacets       10.85      (6.1%)       11.04      (5.1%)    1.7% (  -9% -   13%) 0.342
                         MedTerm     2083.04      (5.3%)     2119.48      (5.7%)    1.7% (  -8% -   13%) 0.316
                 LowSloppyPhrase      148.79      (3.6%)      151.76      (3.2%)    2.0% (  -4% -    9%) 0.062
                      HighPhrase       98.67      (3.4%)      100.80      (3.5%)    2.2% (  -4% -    9%) 0.048
                    OrHighNotLow     1371.31      (7.1%)     1400.91      (7.9%)    2.2% ( -12% -   18%) 0.365
           BrowseMonthTaxoFacets       16.65     (11.6%)       17.03     (13.1%)    2.2% ( -20% -   30%) 0.565
                   OrHighNotHigh     1267.37      (6.8%)     1297.42      (8.9%)    2.4% ( -12% -   19%) 0.344
                 MedSloppyPhrase       39.35      (3.6%)       40.42      (4.2%)    2.7% (  -4% -   10%) 0.028
                   OrNotHighHigh     1190.01      (6.6%)     1224.72      (7.6%)    2.9% ( -10% -   18%) 0.194
                      OrHighHigh       37.72      (4.3%)       39.00      (3.4%)    3.4% (  -4% -   11%) 0.005
                     AndHighHigh       92.46      (4.5%)       95.76      (4.9%)    3.6% (  -5% -   13%) 0.017
                    OrHighNotMed     1231.31      (6.3%)     1275.65      (7.9%)    3.6% (  -9% -   18%) 0.109
                       OrHighMed      174.32      (3.8%)      181.43      (2.9%)    4.1% (  -2% -   11%) 0.000
                      AndHighLow     2761.91     (10.7%)     2885.28     (10.1%)    4.5% ( -14% -   28%) 0.175
                       MedPhrase      214.87      (4.9%)      224.55      (4.8%)    4.5% (  -4% -   14%) 0.003
                       LowPhrase      333.03      (3.8%)      348.43      (3.6%)    4.6% (  -2% -   12%) 0.000
               HighTermMonthSort      159.92      (9.8%)      167.50     (14.8%)    4.7% ( -18% -   32%) 0.232
                    OrNotHighMed      973.50      (6.0%)     1022.10      (6.0%)    5.0% (  -6% -   18%) 0.008
     BrowseRandomLabelTaxoFacets       13.14     (11.1%)       13.83     (14.0%)    5.3% ( -17% -   34%) 0.186
                        HighTerm     1682.54      (7.0%)     1786.66      (7.5%)    6.2% (  -7% -   22%) 0.007
                      AndHighMed      277.66      (3.6%)      295.75      (4.3%)    6.5% (  -1% -   14%) 0.000
            BrowseDateTaxoFacets       14.81     (12.9%)       15.78     (17.1%)    6.6% ( -20% -   42%) 0.170
       BrowseDayOfYearTaxoFacets       14.90     (13.1%)       15.93     (17.4%)    6.9% ( -20% -   43%) 0.158
                    OrNotHighLow     1255.52      (7.0%)     1344.26      (7.1%)    7.1% (  -6% -   22%) 0.002
                       OrHighLow      972.15      (5.3%)     1056.13      (4.8%)    8.6% (  -1% -   19%) 0.000
                          IntNRQ       99.91     (26.0%)      108.80     (19.2%)    8.9% ( -28% -   73%) 0.219
           BrowseMonthSSDVFacets       18.62     (20.7%)       20.45     (25.0%)    9.8% ( -29% -   70%) 0.175

Jan 10 '22 07:01 gf2121

I think there is another issue with this PR, which is that if a vInt is written across two ByteBuffers, then your PR would only ensure the validity of one of these ByteBuffers, while I think it should check both for safety.

Jan 10 '22 17:01 jpountz

Thanks @jpountz ! I did not realize that reading ummapped bytebuffer will make JVM crach, sorry!

we should keep calling ensureValid before reading any byte from the ByteBuffer while your PR does it afterwards. I would also prefer keeping access to the ByteBuffers encapsulated into ByteBufferGuard.

Sorry for my poor english, i'm not sure i've got your point. So to confirm, which one is you are meaning?

This approach need to be changed to implement a ByteBufferGuard#getVInt that just ensureValid once before readVInt (also consider the issue that vint exists across several bytebuffers).
We need to EnsureValid for each byte read, like what we did before. so this approach need to be given up and some other ways (like using codecs other than vints for some of the tasks) should be tried.

I'm having this question because i have not understood why todays ensurevalid can protect readings on bytebuffer, it seems there should be a case that (thread 1)ensurevalid -> (thread 2)unmap -> (thread 1)read unmapped. Is the Thread.yield() helping solve this case? Will this still work for a heavier operation like readVInt?

Jan 11 '22 06:01 gf2121

I checked some other usage, seeing that for operations like bytebuffer#readBytes we also just checked once. If the length <= 6, DirectByteBuffer will read byte one by one, then it should be similar to the readVInt logic. So based on this i guess ensureValid once can work for vint? I implemented this and tests got passed too.

Jan 11 '22 07:01 gf2121

Thanks @gf2121 the implementation looks correct to me now. Are you still seeing a good speedup with this change?

Jan 11 '22 09:01 jpountz

The problem here is that we just reducing the number of guard checks and at same time raising the risk to segv: ReadVint uses multiple atomic reads and all of those reads can segv.

Sorry to tell this: I would not change anything here, it is too risky. Let's wait for MemorySegmentIndexInput, which has JVM internal checks and a guard is not needed. see #518

Jan 11 '22 09:01 uschindler

Thanks @jpountz @uschindler , this is the benchmark result based on the newest codes

                            TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
            HighTermTitleBDVSort      168.66     (13.3%)      163.17     (11.2%)   -3.3% ( -24% -   24%) 0.404
           BrowseMonthSSDVFacets       20.35     (25.8%)       19.83     (25.1%)   -2.6% ( -42% -   65%) 0.751
           BrowseMonthTaxoFacets       17.31     (14.9%)       16.96     (14.8%)   -2.1% ( -27% -   32%) 0.661
            MedTermDayTaxoFacets       48.47      (4.8%)       47.52      (4.1%)   -2.0% ( -10% -    7%) 0.159
       BrowseDayOfYearSSDVFacets       16.71     (14.5%)       16.41     (14.9%)   -1.8% ( -27% -   32%) 0.702
            HighIntervalsOrdered       25.70     (10.9%)       25.33     (11.0%)   -1.5% ( -21% -   22%) 0.674
                 MedSloppyPhrase       21.62      (4.8%)       21.38      (3.1%)   -1.1% (  -8% -    7%) 0.393
                      TermDTSort      238.94     (13.0%)      236.98     (12.8%)   -0.8% ( -23% -   28%) 0.841
                HighSloppyPhrase       39.84      (4.1%)       39.60      (2.9%)   -0.6% (  -7% -    6%) 0.588
       BrowseDayOfYearTaxoFacets       15.89     (15.4%)       15.83     (16.6%)   -0.4% ( -28% -   37%) 0.933
                         LowTerm     1698.75      (3.5%)     1691.63      (3.4%)   -0.4% (  -7% -    6%) 0.702
            BrowseDateTaxoFacets       15.72     (15.3%)       15.67     (16.5%)   -0.4% ( -27% -   37%) 0.942
          OrHighMedDayTaxoFacets       15.19      (4.3%)       15.14      (4.9%)   -0.3% (  -9% -    9%) 0.813
                         Respell       81.95      (2.6%)       81.68      (2.7%)   -0.3% (  -5% -    5%) 0.694
                        PKLookup      227.94      (4.3%)      227.37      (5.3%)   -0.2% (  -9% -    9%) 0.870
     BrowseRandomLabelSSDVFacets       10.93      (6.6%)       10.92      (5.8%)   -0.1% ( -11% -   13%) 0.955
                          IntNRQ      144.63     (22.4%)      144.48     (22.3%)   -0.1% ( -36% -   57%) 0.989
                          Fuzzy2       88.59      (2.3%)       88.51      (2.2%)   -0.1% (  -4% -    4%) 0.897
             LowIntervalsOrdered       93.64      (6.0%)       93.72      (5.9%)    0.1% ( -11% -   12%) 0.962
                          Fuzzy1      137.32      (3.1%)      137.58      (2.7%)    0.2% (  -5% -    6%) 0.833
             MedIntervalsOrdered       17.06      (6.3%)       17.09      (6.4%)    0.2% ( -11% -   13%) 0.922
           HighTermDayOfYearSort       82.27      (8.1%)       82.50     (11.8%)    0.3% ( -18% -   21%) 0.930
     BrowseRandomLabelTaxoFacets       13.63     (12.7%)       13.68     (13.3%)    0.3% ( -22% -   30%) 0.934
        AndHighHighDayTaxoFacets        6.39      (2.6%)        6.42      (2.9%)    0.5% (  -4% -    6%) 0.578
                 LowSloppyPhrase       55.91      (2.7%)       56.43      (1.8%)    0.9% (  -3% -    5%) 0.193
                         MedTerm     2031.88      (3.7%)     2054.31      (4.9%)    1.1% (  -7% -   10%) 0.419
                         Prefix3      129.38      (9.3%)      130.94      (8.8%)    1.2% ( -15% -   21%) 0.676
                     AndHighHigh       67.64      (4.4%)       68.46      (4.5%)    1.2% (  -7% -   10%) 0.390
                        Wildcard      105.85      (9.5%)      107.23      (8.8%)    1.3% ( -15% -   21%) 0.652
                    HighSpanNear       12.80      (3.8%)       13.01      (4.0%)    1.7% (  -5% -    9%) 0.177
               HighTermMonthSort      130.57      (9.9%)      132.98     (11.1%)    1.8% ( -17% -   25%) 0.579
                     MedSpanNear       73.79      (3.4%)       75.30      (2.7%)    2.0% (  -3% -    8%) 0.035
                      OrHighHigh       53.48      (5.1%)       54.77      (4.4%)    2.4% (  -6% -   12%) 0.106
                     LowSpanNear       16.05      (3.5%)       16.47      (3.6%)    2.6% (  -4% -   10%) 0.018
                   OrHighNotHigh     1162.70      (3.6%)     1194.55      (4.4%)    2.7% (  -5% -   11%) 0.030
                    OrNotHighMed     1428.26      (3.0%)     1468.20      (4.8%)    2.8% (  -4% -   10%) 0.027
                        HighTerm     2587.85      (4.5%)     2662.11      (5.2%)    2.9% (  -6% -   13%) 0.061
                       OrHighMed      182.60      (4.8%)      188.40      (4.3%)    3.2% (  -5% -   12%) 0.026
                   OrNotHighHigh     1079.06      (3.2%)     1114.06      (4.6%)    3.2% (  -4% -   11%) 0.010
                    OrHighNotMed     1278.47      (3.8%)     1320.86      (4.4%)    3.3% (  -4% -   12%) 0.011
                      AndHighMed      170.34      (2.9%)      176.45      (3.7%)    3.6% (  -2% -   10%) 0.001
                    OrHighNotLow      932.47      (3.6%)      976.66      (5.1%)    4.7% (  -3% -   14%) 0.001
                      HighPhrase      168.16      (3.0%)      176.27      (3.0%)    4.8% (  -1% -   11%) 0.000
                       LowPhrase       48.60      (2.6%)       50.98      (2.5%)    4.9% (   0% -   10%) 0.000
         AndHighMedDayTaxoFacets       58.86      (2.6%)       62.10      (2.6%)    5.5% (   0% -   11%) 0.000
                       MedPhrase      145.00      (3.1%)      153.94      (2.3%)    6.2% (   0% -   11%) 0.000
                       OrHighLow     1078.49      (3.5%)     1149.09      (3.1%)    6.5% (   0% -   13%) 0.000
                      AndHighLow     1149.40      (4.3%)     1275.69      (2.9%)   11.0% (   3% -   18%) 0.000
                    OrNotHighLow     1765.89      (6.2%)     2019.83      (4.4%)   14.4% (   3% -   26%) 0.000

Sorry to tell this: I would not change anything here, it is too risky. Let's wait for MemorySegmentIndexInput, which has JVM internal checks and a guard is not needed.

OK! I'll close this then.

The benchmark above shows no significant changes and it looks like ran with wrong JVM options (defaults by Mike were bad, fixed recently). It should run with tiered compilation enabled then the guard checks should go away.

Yes i did not run with tiered compilation, but in fact i'm running with the newest luceneutil. It seems we used to add this param but removed again a few days ago? See this commit: https://github.com/mikemccand/luceneutil/commit/f48b53858fa664c8d1f9637935add0ac42f28252

Jan 11 '22 12:01 gf2121

The param must go away or change - to +. Mike's commit fixed the tiered compilation problem and I was not sure if you have used his latest commit.

To get better predicatable results that are not so noisy (especially for those low-level changes) you should have a huge dataset (not only wikiMedium) and also run it with more queries/run. What you see here is mostly noise.

Jan 11 '22 12:01 uschindler

@uschindler Thank you for the benchmark guidance!

I made a new change in this commit that ensure valid before each call of ByteBuffer#get. And I run a benchmark on wikimediumall (16GB) with 20 rounds of JVM and each round repeat tasks 200 times (each jvm last 10 min and total time took 6h+). This benchmark should be similar to what we did in https://github.com/apache/lucene/pull/518. And here is the result:

                            TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
           BrowseMonthTaxoFacets        5.80     (14.4%)        5.50     (13.6%)   -5.1% ( -29% -   26%) 0.246
            HighTermTitleBDVSort       35.69     (30.2%)       34.46     (18.5%)   -3.4% ( -40% -   64%) 0.664
                         Prefix3      194.68     (12.3%)      188.12     (13.2%)   -3.4% ( -25% -   25%) 0.403
               HighTermMonthSort       49.11     (28.0%)       47.66     (15.0%)   -2.9% ( -35% -   55%) 0.678
                      OrHighHigh       16.37      (4.5%)       15.99      (4.0%)   -2.4% ( -10% -    6%) 0.078
                      TermDTSort       22.81     (23.4%)       22.30     (12.8%)   -2.3% ( -31% -   44%) 0.704
       BrowseDayOfYearSSDVFacets        4.43     (10.4%)        4.34      (7.1%)   -2.0% ( -17% -   17%) 0.473
            HighIntervalsOrdered        3.92      (9.6%)        3.85      (8.4%)   -1.9% ( -18% -   17%) 0.514
                        Wildcard       95.75     (14.1%)       94.10     (14.6%)   -1.7% ( -26% -   31%) 0.704
     BrowseRandomLabelTaxoFacets        4.55     (12.8%)        4.50     (12.2%)   -1.1% ( -23% -   27%) 0.788
                       OrHighMed       47.41      (4.7%)       46.98      (4.1%)   -0.9% (  -9% -    8%) 0.516
            BrowseDateTaxoFacets        5.25     (16.3%)        5.21     (15.4%)   -0.8% ( -27% -   37%) 0.881
       BrowseDayOfYearTaxoFacets        5.31     (16.5%)        5.27     (15.6%)   -0.7% ( -28% -   37%) 0.893
          OrHighMedDayTaxoFacets        1.03      (5.0%)        1.02      (5.0%)   -0.6% ( -10% -    9%) 0.700
            MedTermDayTaxoFacets       15.74      (5.8%)       15.67      (3.8%)   -0.5% (  -9% -    9%) 0.772
                          Fuzzy1       76.87      (2.4%)       76.53      (2.1%)   -0.4% (  -4% -    4%) 0.542
                        PKLookup      204.50      (5.0%)      203.69      (4.8%)   -0.4% (  -9% -    9%) 0.798
                          Fuzzy2       71.89      (2.1%)       71.64      (2.0%)   -0.4% (  -4% -    3%) 0.586
                 LowSloppyPhrase        4.00      (4.2%)        3.99      (4.0%)   -0.3% (  -8% -    8%) 0.833
                         Respell       46.07      (1.8%)       45.96      (2.1%)   -0.2% (  -4% -    3%) 0.686
             LowIntervalsOrdered        7.73      (4.9%)        7.73      (3.7%)   -0.0% (  -8% -    9%) 0.988
           HighTermDayOfYearSort       43.42     (23.3%)       43.54     (12.8%)    0.3% ( -29% -   47%) 0.965
        AndHighHighDayTaxoFacets        6.99      (3.0%)        7.02      (2.6%)    0.5% (  -4% -    6%) 0.544
     BrowseRandomLabelSSDVFacets        3.18      (6.5%)        3.21      (7.6%)    1.1% ( -12% -   16%) 0.637
                    HighSpanNear        1.70      (5.8%)        1.72      (4.9%)    1.1% (  -9% -   12%) 0.524
                     LowSpanNear        5.93      (3.4%)        6.00      (3.4%)    1.1% (  -5% -    8%) 0.312
                 MedSloppyPhrase       45.85      (2.9%)       46.42      (2.7%)    1.2% (  -4% -    6%) 0.161
                     MedSpanNear       23.88      (4.0%)       24.26      (3.8%)    1.6% (  -5% -    9%) 0.198
                         LowTerm     1699.14      (4.1%)     1729.13      (2.9%)    1.8% (  -4% -    9%) 0.114
           BrowseMonthSSDVFacets        4.67      (4.1%)        4.76     (12.0%)    2.0% ( -13% -   18%) 0.483
                       MedPhrase       47.09      (2.2%)       48.24      (1.5%)    2.4% (  -1% -    6%) 0.000
                HighSloppyPhrase       14.85      (2.6%)       15.22      (2.2%)    2.5% (  -2% -    7%) 0.001
             MedIntervalsOrdered        7.07      (3.6%)        7.29      (2.5%)    3.2% (  -2% -    9%) 0.001
                       LowPhrase       12.80      (2.4%)       13.27      (1.7%)    3.6% (   0% -    7%) 0.000
                     AndHighHigh       35.29      (3.1%)       36.71      (5.0%)    4.0% (  -4% -   12%) 0.002
                      AndHighMed       87.69      (2.4%)       91.35      (3.9%)    4.2% (  -2% -   10%) 0.000
                   OrHighNotHigh     1514.29      (4.5%)     1578.30      (4.0%)    4.2% (  -4% -   13%) 0.002
                         MedTerm     1838.55      (4.7%)     1917.70      (4.1%)    4.3% (  -4% -   13%) 0.002
                    OrNotHighMed     1182.37      (3.8%)     1237.57      (2.7%)    4.7% (  -1% -   11%) 0.000
                    OrHighNotLow     1218.07      (6.1%)     1279.72      (4.8%)    5.1% (  -5% -   17%) 0.004
         AndHighMedDayTaxoFacets       20.90      (3.4%)       21.98      (2.0%)    5.2% (   0% -   11%) 0.000
                   OrNotHighHigh      943.89      (5.8%)      999.13      (4.8%)    5.9% (  -4% -   17%) 0.001
                        HighTerm     1349.44      (5.8%)     1431.38      (5.3%)    6.1% (  -4% -   18%) 0.001
                    OrHighNotMed      926.10      (5.8%)      989.92      (5.6%)    6.9% (  -4% -   19%) 0.000
                      HighPhrase      290.29      (6.0%)      318.57      (3.1%)    9.7% (   0% -   20%) 0.000
                          IntNRQ       32.61     (25.8%)       35.87     (16.2%)   10.0% ( -25% -   70%) 0.142
                       OrHighLow      547.59      (6.0%)      604.39      (3.2%)   10.4% (   1% -   20%) 0.000
                    OrNotHighLow      886.23      (5.1%)      980.83      (2.2%)   10.7% (   3% -   18%) 0.000
                      AndHighLow      608.29      (6.3%)      693.67      (3.5%)   14.0% (   4% -   25%) 0.000

So it seems like the point that brings a speed up is not reducing number of valid check but something else ? Anyway, this approach is no longer raising the risk of segv and benchmark result seems positive so i reopened the PR to see if this will make sense to you now :)

Jan 12 '22 04:01 gf2121

Hi @gf2121, thanks for taking the time to test more. It looks really like the guard checks are not affecting the whole thing, but as always we don't really know where the optimization effects are coming from. For such low level changes it often also varies between different JVM version (there are differences between JDK 11 and JDK 17). Actually the code SHOULD not make any difference, but it has.

I see that you carefully also save the position before the try block and restore it on boundary crossing exception. What I don't like: ByteBufferGuard is just a hack and should not contain any program code except guarding. So I am not happy of including readVInt code there.

Have you tried in just moving the Guard#readVInt() code into ByteBufferIndexInput#readVInt() inside its try block calling guard.readByte() -- all guard methods should be inlined by JVM very early? This is where it belongs! Nevertheless I wonder why Hotspot is not able to apply the optimization. The only thing we spare is the try/catch, but maybe thats the issue here that prevents it from being inlined into readVInt.

If that helps we can also try the same on the MemorySegment one from #518.

Jan 12 '22 08:01 uschindler

I think about this:

@Override
  public final int readVInt() throws IOException {
    try {
      // using #position instead of #mark here as #position is a final method
      int pos = curBuf.position();
      try {
        byte b = guard.get(curBuf);
        if (b >= 0) return b;
        int i = b & 0x7F;
        b = guard.get(curBuf);
        i |= (b & 0x7F) << 7;
        if (b >= 0) return i;
        b = guard.get(curBuf);
        i |= (b & 0x7F) << 14;
        if (b >= 0) return i;
        b = guard.get(curBuf);
        i |= (b & 0x7F) << 21;
        if (b >= 0) return i;
        b = guard.get(curBuf);
        // Warning: the next ands use 0x0F / 0xF0 - beware copy/paste errors:
        i |= (b & 0x0F) << 28;
        if ((b & 0xF0) == 0) return i;
        throw new IOException("Invalid vInt detected (too many bits)");
      } catch (
          @SuppressWarnings("unused")
          BufferUnderflowException e) {
        curBuf.position(pos);
        return super.readVInt();
      }
    } catch (
        @SuppressWarnings("unused")
        NullPointerException npe) {
      throw new AlreadyClosedException("Already closed: " + this);
    }
  }

Jan 12 '22 08:01 uschindler

I wonder if one reason why this helps is because this method is so large that it is prevented from inlining sub function calls. The JVM bug that affected vint/vlong reads no longer affects any of the JVM versions we support to my knowledge, maybe we should consider rolling up these loops again (assuming it helps performance).

Jan 12 '22 10:01 jpountz

I wonder if one reason why this helps is because this method is so large that it is prevented from inlining sub function calls. The JVM bug that affected vint/vlong reads no longer affects any of the JVM versions we support to my knowledge, maybe we should consider rolling up these loops again (assuming it helps performance).

We only loose the last check on last iteration (the "copypaste" statement).

Jan 12 '22 14:01 uschindler

We should maybe convert all of them back to loops in DataInput base class and remove the specializations and do more tests. I think the change here is then completely obsolete.

Jan 12 '22 14:01 uschindler

+1 let's make a separate change to look at re-rolling the loop back. the JVM bug in question doesn't affect this version of lucene (or probably the last one either?), it was from ancient java days: I think java 1.7.x?

Jan 12 '22 18:01 rmuir

Thanks all! I've finished benchmarks (20 JVM * 200 repeat):

Read byte from guard

Here is the code

                           TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
            HighTermTitleBDVSort       60.79     (18.2%)       56.94     (14.0%)   -6.3% ( -32% -   31%) 0.217
                          IntNRQ      207.39     (11.4%)      198.01     (16.6%)   -4.5% ( -29% -   26%) 0.316
                        PKLookup      206.16      (5.4%)      197.27      (6.1%)   -4.3% ( -15% -    7%) 0.018
       BrowseDayOfYearTaxoFacets        4.93     (14.5%)        4.78     (12.8%)   -2.9% ( -26% -   28%) 0.506
     BrowseRandomLabelTaxoFacets        4.24     (10.9%)        4.12      (7.9%)   -2.9% ( -19% -   17%) 0.340
            BrowseDateTaxoFacets        4.87     (14.3%)        4.74     (12.6%)   -2.8% ( -26% -   28%) 0.508
          OrHighMedDayTaxoFacets        2.59      (4.8%)        2.55      (4.9%)   -1.6% ( -10% -    8%) 0.310
               HighTermMonthSort       78.30     (16.3%)       77.27     (12.7%)   -1.3% ( -26% -   33%) 0.777
            HighIntervalsOrdered        5.97      (8.5%)        5.90      (9.8%)   -1.1% ( -17% -   18%) 0.708
            MedTermDayTaxoFacets       18.18      (3.3%)       18.00      (4.0%)   -1.0% (  -8% -    6%) 0.377
           BrowseMonthSSDVFacets        4.71      (6.8%)        4.67      (5.6%)   -0.9% ( -12% -   12%) 0.662
                      OrHighHigh       17.15      (4.3%)       17.00      (4.4%)   -0.8% (  -9% -    8%) 0.538
                         Respell       39.54      (2.3%)       39.21      (3.0%)   -0.8% (  -5% -    4%) 0.334
                    HighSpanNear        7.88      (5.6%)        7.84      (6.0%)   -0.6% ( -11% -   11%) 0.755
     BrowseRandomLabelSSDVFacets        3.18      (6.2%)        3.16      (6.1%)   -0.6% ( -12% -   12%) 0.776
                          Fuzzy2       64.38      (2.9%)       64.32      (3.4%)   -0.1% (  -6% -    6%) 0.935
           BrowseMonthTaxoFacets        5.14     (16.6%)        5.14     (15.2%)   -0.0% ( -27% -   38%) 0.996
                          Fuzzy1      103.02      (2.8%)      103.11      (3.1%)    0.1% (  -5% -    6%) 0.930
                         LowTerm     1856.79      (2.4%)     1858.93      (2.2%)    0.1% (  -4% -    4%) 0.875
             LowIntervalsOrdered       54.54      (7.0%)       54.64      (8.9%)    0.2% ( -14% -   17%) 0.945
        AndHighHighDayTaxoFacets       12.76      (2.4%)       12.80      (2.2%)    0.3% (  -4% -    5%) 0.694
       BrowseDayOfYearSSDVFacets        4.29      (6.1%)        4.30      (6.0%)    0.3% ( -11% -   13%) 0.859
                        Wildcard       45.26     (12.3%)       45.47     (11.5%)    0.5% ( -20% -   27%) 0.898
                       OrHighMed       72.00      (4.6%)       72.37      (4.1%)    0.5% (  -7% -    9%) 0.712
                HighSloppyPhrase        5.53      (3.4%)        5.56      (4.4%)    0.5% (  -7% -    8%) 0.672
                 MedSloppyPhrase       33.17      (2.8%)       33.36      (3.5%)    0.6% (  -5% -    7%) 0.565
                         MedTerm     1513.55      (4.0%)     1528.19      (3.0%)    1.0% (  -5% -    8%) 0.388
                         Prefix3       89.81     (11.0%)       90.93     (10.1%)    1.3% ( -17% -   25%) 0.707
                     MedSpanNear        7.77      (5.2%)        7.89      (5.5%)    1.5% (  -8% -   12%) 0.369
                 LowSloppyPhrase       42.76      (1.6%)       43.44      (2.6%)    1.6% (  -2% -    5%) 0.023
                    OrNotHighMed     1267.71      (3.0%)     1289.27      (2.5%)    1.7% (  -3% -    7%) 0.051
             MedIntervalsOrdered       26.64      (4.4%)       27.10      (5.4%)    1.7% (  -7% -   12%) 0.271
                    OrHighNotLow     1233.37      (3.8%)     1255.68      (2.9%)    1.8% (  -4% -    8%) 0.095
                    OrHighNotMed     1217.40      (3.9%)     1239.56      (2.9%)    1.8% (  -4% -    9%) 0.096
                   OrNotHighHigh     1125.83      (3.5%)     1152.51      (2.6%)    2.4% (  -3% -    8%) 0.016
                      TermDTSort       57.78     (15.7%)       59.15     (19.5%)    2.4% ( -28% -   44%) 0.671
         AndHighMedDayTaxoFacets       44.66      (2.9%)       45.86      (2.0%)    2.7% (  -2% -    7%) 0.001
                     LowSpanNear       21.46      (3.4%)       22.04      (3.2%)    2.7% (  -3% -    9%) 0.009
                   OrHighNotHigh      989.24      (3.8%)     1017.06      (2.7%)    2.8% (  -3% -    9%) 0.007
           HighTermDayOfYearSort       30.92     (15.3%)       31.82     (13.9%)    2.9% ( -22% -   37%) 0.531
                        HighTerm      864.50      (4.9%)      892.97      (4.0%)    3.3% (  -5% -   12%) 0.020
                     AndHighHigh       34.00      (4.0%)       35.21      (4.9%)    3.6% (  -5% -   12%) 0.011
                       MedPhrase       51.35      (4.1%)       53.23      (3.7%)    3.7% (  -3% -   11%) 0.003
                      AndHighMed       59.83      (4.5%)       62.31      (4.1%)    4.1% (  -4% -   13%) 0.002
                       LowPhrase       71.34      (3.6%)       74.34      (3.0%)    4.2% (  -2% -   11%) 0.000
                       OrHighLow      389.91      (5.2%)      413.76      (3.2%)    6.1% (  -2% -   15%) 0.000
                      HighPhrase      146.45      (6.5%)      155.68      (5.1%)    6.3% (  -4% -   19%) 0.001
                    OrNotHighLow      752.41      (6.5%)      831.54      (2.5%)   10.5% (   1% -   20%) 0.000
                      AndHighLow      794.98      (6.8%)      888.70      (3.6%)   11.8% (   1% -   23%) 0.000

Roll up VInt / Vlong loop

Here is the code

                            TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
           BrowseMonthTaxoFacets        5.17     (15.9%)        5.00     (12.1%)   -3.4% ( -27% -   29%) 0.446
                    OrNotHighLow     1010.74      (4.0%)      978.71      (4.6%)   -3.2% ( -11% -    5%) 0.021
                      HighPhrase      171.95      (3.6%)      166.92      (4.6%)   -2.9% ( -10% -    5%) 0.025
                      AndHighLow      594.12      (4.2%)      577.24      (5.4%)   -2.8% ( -11% -    7%) 0.064
                       OrHighLow      540.46      (4.1%)      526.17      (5.4%)   -2.6% ( -11% -    7%) 0.083
          OrHighMedDayTaxoFacets        6.01      (5.3%)        5.88      (3.9%)   -2.2% ( -10% -    7%) 0.136
         AndHighMedDayTaxoFacets       14.78      (2.6%)       14.51      (2.1%)   -1.8% (  -6% -    2%) 0.013
                       MedPhrase      142.26      (2.9%)      139.67      (3.1%)   -1.8% (  -7% -    4%) 0.058
                       LowPhrase       21.22      (2.8%)       20.85      (3.1%)   -1.8% (  -7% -    4%) 0.061
        AndHighHighDayTaxoFacets        4.31      (4.5%)        4.24      (3.2%)   -1.7% (  -8% -    6%) 0.158
       BrowseDayOfYearTaxoFacets        4.70     (17.3%)        4.63     (12.9%)   -1.3% ( -26% -   34%) 0.787
            BrowseDateTaxoFacets        4.65     (16.9%)        4.59     (12.9%)   -1.2% ( -26% -   34%) 0.803
                 MedSloppyPhrase       34.40      (2.9%)       34.02      (4.0%)   -1.1% (  -7% -    5%) 0.318
            MedTermDayTaxoFacets       13.85      (6.7%)       13.70      (4.5%)   -1.0% ( -11% -   10%) 0.563
     BrowseRandomLabelTaxoFacets        4.16     (12.7%)        4.11      (9.7%)   -1.0% ( -20% -   24%) 0.772
                 LowSloppyPhrase        5.77      (2.2%)        5.72      (3.3%)   -0.9% (  -6% -    4%) 0.307
                     LowSpanNear       53.67      (3.6%)       53.22      (3.9%)   -0.8% (  -8% -    6%) 0.481
                    HighSpanNear        2.66      (4.8%)        2.63      (5.4%)   -0.8% ( -10% -    9%) 0.616
             MedIntervalsOrdered       25.88      (9.4%)       25.68      (9.5%)   -0.8% ( -17% -   20%) 0.797
                   OrHighNotHigh     1043.34      (3.7%)     1037.43      (4.4%)   -0.6% (  -8% -    7%) 0.658
                HighSloppyPhrase        1.47      (3.4%)        1.46      (4.2%)   -0.6% (  -7% -    7%) 0.645
                     MedSpanNear       11.52      (3.5%)       11.46      (4.3%)   -0.5% (  -7% -    7%) 0.685
                   OrNotHighHigh     1615.92      (3.4%)     1608.09      (3.6%)   -0.5% (  -7% -    6%) 0.663
     BrowseRandomLabelSSDVFacets        3.11      (6.0%)        3.10      (4.4%)   -0.2% ( -10% -   10%) 0.881
             LowIntervalsOrdered        4.06      (8.9%)        4.06      (8.9%)   -0.2% ( -16% -   19%) 0.957
                    OrHighNotMed     1188.76      (3.8%)     1187.46      (4.4%)   -0.1% (  -7% -    8%) 0.933
                    OrNotHighMed     1220.26      (3.1%)     1219.23      (3.7%)   -0.1% (  -6% -    6%) 0.938
                      AndHighMed      115.92      (3.6%)      116.03      (3.3%)    0.1% (  -6% -    7%) 0.928
                          Fuzzy1      111.98      (3.2%)      112.15      (3.5%)    0.1% (  -6% -    7%) 0.889
            HighIntervalsOrdered        5.14      (7.5%)        5.15      (7.3%)    0.2% ( -13% -   16%) 0.937
                    OrHighNotLow     1222.80      (4.1%)     1226.76      (4.7%)    0.3% (  -8% -    9%) 0.817
                      TermDTSort       51.02     (14.1%)       51.21     (18.9%)    0.4% ( -28% -   38%) 0.944
                        HighTerm     1570.53      (3.7%)     1578.45      (4.4%)    0.5% (  -7% -    8%) 0.693
       BrowseDayOfYearSSDVFacets        4.26      (3.9%)        4.28      (9.1%)    0.5% ( -12% -   14%) 0.811
                     AndHighHigh       40.61      (4.1%)       40.83      (4.1%)    0.5% (  -7% -    9%) 0.681
                         MedTerm     2002.17      (3.6%)     2013.12      (4.3%)    0.5% (  -7% -    8%) 0.659
                         Respell       67.74      (3.8%)       68.14      (3.3%)    0.6% (  -6% -    8%) 0.594
                         LowTerm     1633.26      (2.8%)     1643.86      (2.6%)    0.6% (  -4% -    6%) 0.444
                       OrHighMed       52.80      (3.7%)       53.17      (3.8%)    0.7% (  -6% -    8%) 0.551
           HighTermDayOfYearSort       61.86     (10.2%)       62.31     (13.0%)    0.7% ( -20% -   26%) 0.846
                          Fuzzy2       33.05      (3.1%)       33.31      (3.1%)    0.8% (  -5% -    7%) 0.420
                      OrHighHigh       40.94      (4.1%)       41.27      (3.6%)    0.8% (  -6% -    8%) 0.513
                        PKLookup      204.53      (5.7%)      206.21      (4.9%)    0.8% (  -9% -   12%) 0.623
           BrowseMonthSSDVFacets        4.65      (4.4%)        4.74     (11.8%)    2.0% ( -13% -   19%) 0.488
            HighTermTitleBDVSort       50.18     (11.7%)       51.93     (18.5%)    3.5% ( -23% -   38%) 0.475
                         Prefix3       83.02     (12.3%)       86.16     (12.9%)    3.8% ( -19% -   33%) 0.343
                        Wildcard       61.79      (5.6%)       64.22      (7.4%)    3.9% (  -8% -   17%) 0.058
               HighTermMonthSort       50.81     (16.0%)       53.56     (16.4%)    5.4% ( -23% -   44%) 0.291
                          IntNRQ       56.84     (31.8%)       62.62     (26.8%)   10.2% ( -36% -  100%) 0.275

In addition, this is the java version used in all these benchmarks:

java version "17.0.1" 2021-10-19 LTS Java(TM) SE Runtime Environment (build 17.0.1+12-LTS-39) Java HotSpot(TM) 64-Bit Server VM (build 17.0.1+12-LTS-39, mixed mode, sharing)

Jan 13 '22 04:01 gf2121

Interesting, so it looks like rolling the loop back actually helps a bit. Can you open a new JIRA/PR for this change specifically @gf2121 ?

Jan 13 '22 08:01 jpountz

Thanks @jpountz . I opened https://issues.apache.org/jira/browse/LUCENE-10376 (https://github.com/apache/lucene/pull/602)

It can be seen that IntNRQ is getting a 10% speed up but i'm not very sure if it is a noise since the p-value is high and In several rounds of benchmark here i can see the result of IntNRQ is unstable:

speed up 10% in first benchmark
keep even in second benchmark
speed up 10% again in third benchmark
regress 5% in forth benchmark

On the other hand, the stable speed up of some of tasks (AndHighLow, OrHighLow) can not be seen any more in the loop vint approach. So i'm a bit tempted to continue on this approach(LUCENE-10366), what do you think?

Jan 13 '22 09:01 gf2121

To test if it is the try-catch block preventing inline, I made another experiment: call readByte() instead of guard.get(curBuf) to read byte, like this (Yes it is just a copy of DataInput#readVInt now).

public int readVInt() throws IOException {
    byte b = readByte();
    if (b >= 0) return b;
    int i = b & 0x7F;
    b = readByte();
    i |= (b & 0x7F) << 7;
    if (b >= 0) return i;
    b = readByte();
    i |= (b & 0x7F) << 14;
    if (b >= 0) return i;
    b = readByte();
    i |= (b & 0x7F) << 21;
    if (b >= 0) return i;
    b = readByte();
    // Warning: the next ands use 0x0F / 0xF0 - beware copy/paste errors:
    i |= (b & 0x0F) << 28;
    if ((b & 0xF0) == 0) return i;
    throw new IOException("Invalid vInt detected (too many bits)");
  }

And i'm still seeing a speed up:

                            TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
               HighTermMonthSort       17.77     (20.4%)       16.91     (14.0%)   -4.9% ( -32% -   37%) 0.378
            HighTermTitleBDVSort       53.61     (15.2%)       51.18     (12.8%)   -4.5% ( -28% -   27%) 0.308
           BrowseMonthTaxoFacets        5.27     (16.7%)        5.12     (16.5%)   -3.0% ( -31% -   36%) 0.569
           HighTermDayOfYearSort       53.12     (15.6%)       51.56     (12.2%)   -2.9% ( -26% -   29%) 0.506
                          IntNRQ      532.53      (1.8%)      523.43      (9.1%)   -1.7% ( -12% -    9%) 0.411
     BrowseRandomLabelTaxoFacets        4.33     (13.4%)        4.30     (10.5%)   -0.8% ( -21% -   26%) 0.843
            MedTermDayTaxoFacets       13.80      (5.4%)       13.73      (5.5%)   -0.5% ( -10% -   10%) 0.765
                      TermDTSort       88.41     (12.6%)       88.12     (12.9%)   -0.3% ( -22% -   28%) 0.935
       BrowseDayOfYearTaxoFacets        4.96     (18.2%)        4.95     (15.4%)   -0.1% ( -28% -   40%) 0.989
                         Prefix3      250.60     (14.1%)      250.43     (13.9%)   -0.1% ( -24% -   32%) 0.987
            BrowseDateTaxoFacets        4.90     (17.9%)        4.90     (15.1%)    0.0% ( -27% -   40%) 0.999
           BrowseMonthSSDVFacets        4.69      (6.8%)        4.70      (6.5%)    0.2% ( -12% -   14%) 0.940
                          Fuzzy1       60.49      (1.9%)       60.67      (2.2%)    0.3% (  -3% -    4%) 0.655
                        Wildcard       53.25      (6.0%)       53.46      (6.5%)    0.4% ( -11% -   13%) 0.845
                         Respell       51.43      (2.3%)       51.66      (2.4%)    0.4% (  -4% -    5%) 0.544
                       OrHighMed       63.35      (3.2%)       63.73      (5.2%)    0.6% (  -7% -    9%) 0.652
                        PKLookup      206.79      (4.3%)      208.11      (4.5%)    0.6% (  -7% -    9%) 0.648
             LowIntervalsOrdered       11.64      (5.5%)       11.72      (5.7%)    0.7% (  -9% -   12%) 0.708
            HighIntervalsOrdered        2.85      (7.6%)        2.87      (8.1%)    0.7% ( -14% -   17%) 0.790
                          Fuzzy2       66.34      (1.9%)       66.88      (2.7%)    0.8% (  -3% -    5%) 0.271
                    HighSpanNear        6.61      (3.4%)        6.67      (4.1%)    1.0% (  -6% -    8%) 0.395
             MedIntervalsOrdered        8.24      (5.1%)        8.34      (5.6%)    1.2% (  -8% -   12%) 0.467
                HighSloppyPhrase        6.70      (6.4%)        6.78      (6.4%)    1.2% ( -10% -   14%) 0.543
       BrowseDayOfYearSSDVFacets        4.22      (4.7%)        4.27      (4.7%)    1.3% (  -7% -   11%) 0.379
                 MedSloppyPhrase       33.87      (4.5%)       34.34      (4.4%)    1.4% (  -7% -   10%) 0.327
          OrHighMedDayTaxoFacets        5.08      (5.1%)        5.16      (4.4%)    1.6% (  -7% -   11%) 0.302
     BrowseRandomLabelSSDVFacets        3.15      (8.2%)        3.20      (7.3%)    1.6% ( -12% -   18%) 0.519
                         LowTerm     2114.49      (2.6%)     2152.48      (2.7%)    1.8% (  -3% -    7%) 0.031
        AndHighHighDayTaxoFacets        4.41      (3.2%)        4.49      (3.0%)    1.8% (  -4% -    8%) 0.064
                     AndHighHigh       30.57      (4.5%)       31.14      (7.4%)    1.9% (  -9% -   14%) 0.342
                      OrHighHigh       35.60      (3.0%)       36.32      (3.8%)    2.0% (  -4% -    9%) 0.062
                     LowSpanNear       23.21      (2.6%)       23.73      (2.9%)    2.3% (  -3% -    8%) 0.010
                      AndHighMed       65.70      (4.9%)       67.45      (7.9%)    2.7% (  -9% -   16%) 0.200
                     MedSpanNear       77.65      (6.1%)       79.96      (6.9%)    3.0% (  -9% -   17%) 0.151
                 LowSloppyPhrase       36.66      (2.7%)       37.76      (2.3%)    3.0% (  -2% -    8%) 0.000
                         MedTerm     2602.20      (3.3%)     2692.40      (3.7%)    3.5% (  -3% -   10%) 0.002
                    OrHighNotLow     1157.68      (4.7%)     1207.92      (4.7%)    4.3% (  -4% -   14%) 0.004
                       MedPhrase       72.16      (3.0%)       75.49      (2.4%)    4.6% (   0% -   10%) 0.000
                    OrHighNotMed     1074.08      (4.3%)     1126.38      (4.1%)    4.9% (  -3% -   13%) 0.000
                    OrNotHighMed     1093.93      (2.9%)     1157.33      (2.7%)    5.8% (   0% -   11%) 0.000
                       LowPhrase      274.34      (3.2%)      290.37      (1.5%)    5.8% (   1% -   10%) 0.000
                      HighPhrase      281.24      (3.3%)      298.54      (2.0%)    6.2% (   0% -   11%) 0.000
                       OrHighLow      321.86      (2.7%)      342.74      (3.1%)    6.5% (   0% -   12%) 0.000
                   OrNotHighHigh     1016.04      (4.5%)     1084.94      (4.3%)    6.8% (  -1% -   16%) 0.000
                   OrHighNotHigh      785.90      (5.9%)      845.39      (5.1%)    7.6% (  -3% -   19%) 0.000
                        HighTerm     1421.49      (5.5%)     1529.92      (5.8%)    7.6% (  -3% -   20%) 0.000
         AndHighMedDayTaxoFacets       28.01      (3.3%)       30.41      (2.6%)    8.6% (   2% -   14%) 0.000
                    OrNotHighLow      667.21      (4.4%)      736.53      (2.5%)   10.4% (   3% -   18%) 0.000
                      AndHighLow      493.65      (6.5%)      561.58      (4.3%)   13.8% (   2% -   26%) 0.000

This indeed sounds crazy. I suspect if this is because DataInput#readVInt calls DataInput#readByte and the abstraction layer makes JVM confused. While ByteBufferIndexInput#readVInt directly calls ByteBufferIndexInput#readByte (which is declared as final) ?

I got a flame graph on the benchmark and found some vtable stub there (which seems to justify this guess) :

Jan 16 '22 18:01 gf2121

As it can be the same as super, I tried to reuse super logic (call super.xxx). still seeing the speed up:

                            TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
           BrowseMonthTaxoFacets        5.07     (16.4%)        4.91     (16.0%)   -3.2% ( -30% -   35%) 0.538
                    HighSpanNear        2.19      (4.4%)        2.17      (3.9%)   -1.1% (  -9% -    7%) 0.385
                HighSloppyPhrase        4.31      (5.2%)        4.28      (4.5%)   -0.9% ( -10% -    9%) 0.571
                 MedSloppyPhrase       23.29      (5.5%)       23.12      (5.1%)   -0.7% ( -10% -   10%) 0.669
            MedTermDayTaxoFacets        5.43      (5.4%)        5.39      (4.1%)   -0.7% (  -9% -    9%) 0.657
                          IntNRQ       35.17     (21.7%)       34.97     (21.5%)   -0.6% ( -35% -   54%) 0.933
          OrHighMedDayTaxoFacets        1.38      (6.2%)        1.37      (5.4%)   -0.5% ( -11% -   11%) 0.784
                          Fuzzy1      113.72      (3.5%)      113.41      (3.3%)   -0.3% (  -6% -    6%) 0.804
                          Fuzzy2       79.20      (3.3%)       78.99      (3.2%)   -0.3% (  -6% -    6%) 0.795
                         Respell       54.75      (3.1%)       54.78      (3.0%)    0.0% (  -5% -    6%) 0.962
             MedIntervalsOrdered        5.08      (8.3%)        5.08      (7.4%)    0.1% ( -14% -   17%) 0.979
                        PKLookup      204.91      (5.8%)      205.45      (6.6%)    0.3% ( -11% -   13%) 0.894
        AndHighHighDayTaxoFacets        1.59      (5.8%)        1.60      (5.0%)    0.5% (  -9% -   11%) 0.792
       BrowseDayOfYearSSDVFacets        4.20      (3.6%)        4.22      (3.9%)    0.5% (  -6% -    8%) 0.656
            HighIntervalsOrdered        2.17      (8.3%)        2.18      (7.4%)    0.7% ( -13% -   17%) 0.791
                         Prefix3       74.40      (8.1%)       75.25      (7.9%)    1.1% ( -13% -   18%) 0.652
                         LowTerm     1924.36      (1.9%)     1947.29      (1.8%)    1.2% (  -2% -    4%) 0.040
     BrowseRandomLabelTaxoFacets        4.16     (13.3%)        4.21     (11.2%)    1.3% ( -20% -   29%) 0.742
             LowIntervalsOrdered        6.14      (5.4%)        6.24      (4.5%)    1.8% (  -7% -   12%) 0.258
                     LowSpanNear       36.68      (2.8%)       37.34      (2.6%)    1.8% (  -3% -    7%) 0.037
                     MedSpanNear       10.85      (2.7%)       11.06      (2.6%)    1.9% (  -3% -    7%) 0.025
            BrowseDateTaxoFacets        4.76     (16.6%)        4.86     (14.3%)    2.0% ( -24% -   39%) 0.679
       BrowseDayOfYearTaxoFacets        4.82     (16.8%)        4.91     (14.7%)    2.0% ( -25% -   40%) 0.683
                 LowSloppyPhrase       79.36      (2.7%)       81.09      (2.0%)    2.2% (  -2% -    7%) 0.004
                      TermDTSort       45.15     (11.3%)       46.17     (16.4%)    2.3% ( -22% -   33%) 0.612
           HighTermDayOfYearSort       66.04     (10.3%)       67.57     (15.7%)    2.3% ( -21% -   31%) 0.580
                         MedTerm     1784.03      (2.6%)     1830.98      (2.8%)    2.6% (  -2% -    8%) 0.002
                       OrHighMed       50.97      (4.5%)       52.45      (4.4%)    2.9% (  -5% -   12%) 0.039
     BrowseRandomLabelSSDVFacets        3.17     (11.6%)        3.27     (11.3%)    3.2% ( -17% -   29%) 0.370
                    OrHighNotLow     1390.67      (2.9%)     1436.11      (2.9%)    3.3% (  -2% -    9%) 0.000
                      AndHighMed       99.02      (3.9%)      102.35      (5.2%)    3.4% (  -5% -   12%) 0.020
         AndHighMedDayTaxoFacets       26.52      (3.5%)       27.41      (2.4%)    3.4% (  -2% -    9%) 0.000
                    OrHighNotMed     1155.48      (2.9%)     1195.26      (2.9%)    3.4% (  -2% -    9%) 0.000
                       MedPhrase       25.46      (2.4%)       26.37      (2.6%)    3.6% (  -1% -    8%) 0.000
                       LowPhrase       81.21      (2.5%)       84.18      (2.6%)    3.7% (  -1% -    9%) 0.000
                   OrHighNotHigh     1685.00      (2.6%)     1750.81      (2.6%)    3.9% (  -1% -    9%) 0.000
               HighTermMonthSort       73.01     (11.5%)       76.01     (15.7%)    4.1% ( -20% -   35%) 0.344
                    OrNotHighMed      978.17      (2.5%)     1018.54      (2.8%)    4.1% (  -1% -    9%) 0.000
                        HighTerm     1457.41      (4.0%)     1521.90      (3.8%)    4.4% (  -3% -   12%) 0.000
                      OrHighHigh       38.49      (4.6%)       40.24      (3.8%)    4.5% (  -3% -   13%) 0.001
                   OrNotHighHigh      848.77      (3.1%)      891.09      (3.3%)    5.0% (  -1% -   11%) 0.000
            HighTermTitleBDVSort       56.20     (16.5%)       59.16     (19.5%)    5.3% ( -26% -   49%) 0.356
                      HighPhrase      125.88      (3.3%)      134.24      (3.2%)    6.6% (   0% -   13%) 0.000
           BrowseMonthSSDVFacets        4.57      (2.9%)        4.87     (19.0%)    6.7% ( -14% -   29%) 0.121
                       OrHighLow      433.74      (4.9%)      468.94      (4.3%)    8.1% (  -1% -   18%) 0.000
                     AndHighHigh       74.06      (5.0%)       80.29      (4.0%)    8.4% (   0% -   18%) 0.000
                        Wildcard      113.47     (13.7%)      125.12     (10.1%)   10.3% ( -11% -   39%) 0.007
                    OrNotHighLow      782.47      (5.2%)      865.27      (2.7%)   10.6% (   2% -   19%) 0.000
                      AndHighLow      339.38      (5.7%)      376.62      (4.9%)   11.0% (   0% -   22%) 0.000

Jan 17 '22 06:01 gf2121

To ensure the benchmark is warmed up, I also run a benchmark that repeat AndHighLow task 20000 times (which was usually seen a good speed up in benchmarks before). Here is the report:

                            TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
                      AndHighLow      767.41      (9.7%)      898.67      (1.5%)   17.1% (   5% -   31%) 0.000

Jan 17 '22 07:01 gf2121

I will try out the same on MemorySegmentIndexInput (#518).

Jan 17 '22 10:01 uschindler

I was also thinking about the following: Why not make readVInt/readVLong final in the DataInput class? This should be possible after the refactoring in #602 !

We should also make all other readZigZag and other DataInput utility methods that read specific data formats final. In most cases it makes no sense to implement them for performance reasons.

Jan 17 '22 10:01 uschindler

Thanks @uschindler !

I was also thinking about the following: Why not make readVInt/readVLong final in the DataInput class? This should be possible after the refactoring in #602 !

It seems like that it is the abstraction layer of #readByte (not #readVInt) making JVM confused. I'm not sure if making readVInt final can work, but i'd like to give it a try.

Jan 17 '22 10:01 gf2121

Although it is unrelated, but could you make that one final for consistency? https://github.com/apache/lucene/blob/97d669bcdb5e302a4fd79c648d5f621d90a9ac3b/lucene/core/src/java/org/apache/lucene/store/ByteBufferIndexInput.java#L140

readFloats() is final but not readLongs()

Jan 17 '22 11:01 uschindler

Here is the report of making vint final (50 repeat * 20 JVM):

                            TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
            HighTermTitleBDVSort       66.42     (20.7%)       61.57     (13.7%)   -7.3% ( -34% -   34%) 0.188
                          IntNRQ       31.66     (30.7%)       30.04     (34.0%)   -5.1% ( -53% -   85%) 0.615
               HighTermMonthSort       50.98     (19.0%)       48.37     (14.2%)   -5.1% ( -32% -   34%) 0.334
                      TermDTSort       23.29     (18.5%)       22.35     (15.4%)   -4.0% ( -31% -   36%) 0.453
            HighIntervalsOrdered        7.29     (10.2%)        7.00      (8.0%)   -3.9% ( -20% -   15%) 0.174
           HighTermDayOfYearSort       82.93     (22.1%)       79.98     (10.9%)   -3.6% ( -29% -   37%) 0.519
             MedIntervalsOrdered        8.13      (7.5%)        7.84      (5.1%)   -3.5% ( -14% -    9%) 0.080
            BrowseDateTaxoFacets        4.95     (16.0%)        4.82     (17.4%)   -2.6% ( -31% -   36%) 0.619
       BrowseDayOfYearTaxoFacets        5.01     (16.5%)        4.89     (17.7%)   -2.5% ( -31% -   37%) 0.639
             LowIntervalsOrdered       27.57      (5.6%)       26.97      (4.3%)   -2.2% ( -11% -    8%) 0.169
     BrowseRandomLabelSSDVFacets        3.14      (6.1%)        3.08      (7.0%)   -1.9% ( -14% -   11%) 0.358
                 MedSloppyPhrase       13.28      (5.7%)       13.08      (5.6%)   -1.5% ( -12% -   10%) 0.400
                HighSloppyPhrase        3.26      (4.3%)        3.23      (4.2%)   -1.0% (  -8% -    7%) 0.474
                         Respell       53.28      (2.3%)       52.78      (3.5%)   -0.9% (  -6% -    4%) 0.310
     BrowseRandomLabelTaxoFacets        4.29     (12.2%)        4.26     (13.4%)   -0.8% ( -23% -   28%) 0.846
                 LowSloppyPhrase       13.00      (4.7%)       12.92      (4.8%)   -0.6% (  -9% -    9%) 0.707
            MedTermDayTaxoFacets       15.28      (4.8%)       15.20      (5.0%)   -0.5% (  -9% -    9%) 0.741
                      AndHighLow      612.76      (6.3%)      610.25      (5.6%)   -0.4% ( -11% -   12%) 0.828
                          Fuzzy1       99.32      (3.0%)       99.02      (3.2%)   -0.3% (  -6% -    6%) 0.751
          OrHighMedDayTaxoFacets        2.50      (5.4%)        2.50      (5.2%)   -0.1% ( -10% -   11%) 0.937
        AndHighHighDayTaxoFacets        3.31      (3.9%)        3.31      (3.6%)    0.0% (  -7% -    7%) 0.971
                    OrHighNotLow     1738.10      (5.5%)     1739.49      (3.7%)    0.1% (  -8% -    9%) 0.957
                    OrNotHighLow      947.48      (6.2%)      949.16      (5.0%)    0.2% ( -10% -   12%) 0.920
                    OrHighNotMed     1028.55      (5.1%)     1030.45      (4.4%)    0.2% (  -8% -   10%) 0.903
                        Wildcard      118.98     (11.3%)      119.38     (10.3%)    0.3% ( -19% -   24%) 0.921
         AndHighMedDayTaxoFacets       34.54      (2.9%)       34.65      (2.5%)    0.3% (  -4% -    5%) 0.689
                        HighTerm     1470.90      (5.5%)     1476.02      (3.3%)    0.3% (  -8% -    9%) 0.809
                          Fuzzy2       98.76      (3.0%)       99.17      (2.9%)    0.4% (  -5% -    6%) 0.662
                     LowSpanNear       39.95      (3.0%)       40.15      (3.7%)    0.5% (  -6% -    7%) 0.626
                     AndHighHigh       21.48      (5.6%)       21.59      (4.6%)    0.5% (  -9% -   11%) 0.740
       BrowseDayOfYearSSDVFacets        4.20      (6.7%)        4.23      (5.9%)    0.5% ( -11% -   14%) 0.786
           BrowseMonthSSDVFacets        4.64      (6.5%)        4.67      (6.8%)    0.6% ( -11% -   14%) 0.778
                         LowTerm     1732.57      (3.5%)     1744.74      (2.6%)    0.7% (  -5% -    7%) 0.472
                   OrHighNotHigh      840.71      (4.9%)      846.83      (4.1%)    0.7% (  -7% -   10%) 0.611
                         MedTerm     1659.04      (5.1%)     1672.60      (3.9%)    0.8% (  -7% -   10%) 0.566
                      AndHighMed       58.10      (4.1%)       58.60      (3.8%)    0.9% (  -6% -    9%) 0.491
                       LowPhrase      523.50      (3.0%)      528.02      (2.2%)    0.9% (  -4% -    6%) 0.300
                       OrHighLow      590.49      (4.9%)      595.81      (4.1%)    0.9% (  -7% -   10%) 0.529
                    HighSpanNear        0.77      (5.9%)        0.78      (5.8%)    1.0% ( -10% -   13%) 0.574
                    OrNotHighMed     1071.83      (3.9%)     1084.40      (3.7%)    1.2% (  -6% -    9%) 0.326
                     MedSpanNear       41.11      (4.4%)       41.62      (4.7%)    1.2% (  -7% -   10%) 0.386
                       OrHighMed       74.79      (5.0%)       75.73      (4.9%)    1.3% (  -8% -   11%) 0.423
                      HighPhrase      278.97      (4.2%)      282.56      (2.6%)    1.3% (  -5% -    8%) 0.248
                      OrHighHigh       44.44      (5.8%)       45.01      (5.5%)    1.3% (  -9% -   13%) 0.470
                   OrNotHighHigh     1490.57      (4.3%)     1510.81      (3.3%)    1.4% (  -5% -    9%) 0.263
                       MedPhrase       91.14      (4.3%)       92.44      (2.6%)    1.4% (  -5% -    8%) 0.203
           BrowseMonthTaxoFacets        5.34     (15.2%)        5.52     (28.6%)    3.4% ( -35% -   55%) 0.639
                         Prefix3       26.17     (21.3%)       27.12     (18.9%)    3.6% ( -30% -   55%) 0.568

Jan 17 '22 13:01 gf2121

OK, thanks, so let's keep the hack.

Jan 17 '22 13:01 uschindler

Hi @jpountz, if you agree I'd merge this in and backport!

Uwe

Jan 17 '22 13:01 uschindler

+1

Jan 18 '22 14:01 jpountz

@gf2121 Do you have an email I can use to contact you? Please send me an email at [email protected] if you don't want to leave it on GitHub.

Jan 18 '22 18:01 jpountz

Hi @jpountz. This is my email: [email protected]

Jan 19 '22 05:01 gf2121