Reduce duplication in taxonomy facets; always do counts

Open stefanvodita opened this issue 1 year ago • 10 comments

Note

This is a large change, refactoring most of the taxonomy facets code and changing internal behavior, without changing the API. There are specific API changes this sets us up to do later, e.g. retrieving counts from aggregation facets.

What does this PR do well?

  1. Moves most of the responsibility from TaxonomyFacets implementations to TaxonomyFacets itself. This reduces code duplication and enables future development. Addresses genericity issue mentioned in #12553.
  2. As a consequence, it introduces sparse values to FloatTaxonomyFacets, which previously used dense values always. This issue is part of #12576.
  3. It computes counts for all taxonomy facets always, which enables us to add an API to retrieve counts for association facets in the future. Addresses #11282.
  4. As a consequence of having counts, we can check whether we encountered a label while faceting (count > 0), while previously we relied on the aggregation value to be positive. Closes #12585.
  5. It introduces the idea of doing multiple aggregations in one go, with association facets doing the aggregation they were already doing, plus a count. We can extend to an arbitrary number of aggregations, as suggested in #12546.
  6. It doesn't change the API. The only change in behavior users should notice is the fix for non-positive aggregation values, which were previously discarded.
  7. It adds tests which were missing for sparse/dense values and non-positive aggregations.
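
Points 3 to 5 can be illustrated with a minimal sketch. This is not the TaxonomyFacets code from the PR: the class `CountingAggregator` and its methods are hypothetical names chosen for illustration. It aggregates a float value and a count in the same pass, then uses the count rather than the sign of the aggregated value to decide whether an ordinal was encountered.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the Lucene TaxonomyFacets implementation.
class CountingAggregator {
    private final float[] sums;  // aggregated value per ordinal
    private final int[] counts;  // number of times each ordinal was seen

    CountingAggregator(int ordinalCount) {
        sums = new float[ordinalCount];
        counts = new int[ordinalCount];
    }

    // A single pass updates both aggregations.
    void collect(int ordinal, float value) {
        sums[ordinal] += value;
        counts[ordinal]++;
    }

    // Value-based check: drops ordinals whose aggregated value is not positive.
    List<Integer> seenByValue() {
        List<Integer> seen = new ArrayList<>();
        for (int ord = 0; ord < sums.length; ord++) {
            if (sums[ord] > 0) {
                seen.add(ord);
            }
        }
        return seen;
    }

    // Count-based check: keeps every ordinal that was collected at least once.
    List<Integer> seenByCount() {
        List<Integer> seen = new ArrayList<>();
        for (int ord = 0; ord < counts.length; ord++) {
            if (counts[ord] > 0) {
                seen.add(ord);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        CountingAggregator agg = new CountingAggregator(3);
        agg.collect(0, 2.5f);
        agg.collect(1, -4.0f); // negative aggregation value
        System.out.println(agg.seenByValue()); // [0]
        System.out.println(agg.seenByCount()); // [0, 1]
    }
}
```

Ordinal 1 has a negative sum, so the value-based check loses it while the count-based check keeps it. Ordinal 2 was never collected and is dropped by both.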

What's not ideal about this approach?

  1. ~~We could see some performance decreases. The more critical part of the work, aggregating, should be unaffected. There are a few extra method calls / dispatches / branches. Ranking and collecting results might be impacted because we are boxing / unboxing results to / from Number to avoid the primitive types.~~
  2. ~~The way the TopOrdAndNumberQueues work is a bit awkward and inefficient. It required small changes to classes outside the scope of this change. Maybe we can come up with something better.~~

What is next?

  1. I'd like to know if the approach makes sense to others.
  2. We can try running some benchmarks to see if there are any performance changes.
  3. ~~Is it important to preserve a default aggregation value of the right type in the results (i.e. -1 for int aggregations, -1f for float aggregations)? If not, we can make a small simplification to always return -1.~~

stefanvodita avatar Dec 22 '23 10:12 stefanvodita

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

github-actions[bot] avatar Jan 08 '24 12:01 github-actions[bot]

3. Is it important to preserve a default aggregation value of the right type in the results (i.e. -1 for int aggregations, -1f for float aggregations)? If not, we can make a small simplification to always return -1.

Maybe defer this to a separate issue? I can see callers expecting a consistent type, though, if you cast (float) Number where Number is an int, the cast would be fine.
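
As a small illustration of the type question, and only as a sketch: a caller holding a `Number` can read it through `Number.floatValue()`, which works whether the boxed value is an `Integer` or a `Float`. The class and helper names below are hypothetical and not part of the PR.

```java
// Hypothetical illustration; not code from the PR.
class DefaultValueSketch {
    // Reads an aggregation result as a float regardless of its boxed type.
    static float asFloat(Number value) {
        return value.floatValue();
    }

    public static void main(String[] args) {
        Number intDefault = Integer.valueOf(-1);   // default for int aggregations
        Number floatDefault = Float.valueOf(-1f);  // default for float aggregations
        System.out.println(asFloat(intDefault) == asFloat(floatDefault)); // true
    }
}
```

A caller written this way would not notice if the default were always returned as an `Integer`.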

mikemccand avatar Jan 08 '24 12:01 mikemccand

I found a fun HeisenBug in one of the tests. When we iterate cursors from IntFloatHashMap, the order is not deterministic. Float addition is not associative, so the result we get by aggregating the floats in the map can differ depending on the order in which we iterate. For a particular seed, running the test produced an unfavorable ordering, while running it under the debugger produced a favorable one. The test is fixed in the latest commit and I've opened an issue to do Kahan summation over the floats instead, to reduce the error we're seeing.

For those who want to follow along, here are the exact numbers we are adding in the test in two orderings which produce different results:

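class FloatSumIsNotAssociative {
    public static void main(String[] args) {
        // The exact values added in the test.
        float x = 177182.61f;
        float y = 238089.27f;
        float z = 255214.66f;
        float acc;

        // Ordering 1: (x + y) + z
        acc = 0;
        acc += x;
        acc += y;
        acc += z;
        System.out.println(acc);

        // Ordering 2: (z + y) + x. Rounding happens at different steps,
        // so the printed sum differs from ordering 1.
        acc = 0;
        acc += z;
        acc += y;
        acc += x;
        System.out.println(acc);
    }
}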
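
For reference, here is a minimal sketch of the Kahan summation mentioned above, applied to the same three values. The class and method names are hypothetical and this is not the code from the follow-up issue. Kahan summation reduces the accumulated rounding error; it does not guarantee order independence in general, although for these three inputs both orderings produce the same float.

```java
// Hypothetical illustration of compensated (Kahan) summation; not Lucene code.
class KahanSumSketch {
    // Plain left-to-right float accumulation.
    static float naiveSum(float[] values) {
        float sum = 0f;
        for (float v : values) {
            sum += v;
        }
        return sum;
    }

    // Kahan summation: carries the rounding error of each step into the next one.
    static float kahanSum(float[] values) {
        float sum = 0f;
        float compensation = 0f; // low-order bits lost by previous additions
        for (float v : values) {
            float corrected = v - compensation;
            float next = sum + corrected;
            compensation = (next - sum) - corrected;
            sum = next;
        }
        return sum;
    }

    public static void main(String[] args) {
        float[] xyz = {177182.61f, 238089.27f, 255214.66f};
        float[] zyx = {255214.66f, 238089.27f, 177182.61f};
        System.out.println(naiveSum(xyz) == naiveSum(zyx)); // false
        System.out.println(kahanSum(xyz) == kahanSum(zyx)); // true
    }
}
```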

stefanvodita avatar Jan 13 '24 11:01 stefanvodita

I've also run the benchmarks (python3 src/python/localrun.py -source wikimediumall). There is a measurable regression in the BrowseRandomLabelTaxoFacets task, but not in the other taxonomy tasks. The benchmarker also reports improvements in PKLookup, Wildcard, Respell, Fuzzy2, Fuzzy1.

The regression in the taxo task is explained in the profiler. Boxing is not cheap: 11.24% 10402M java.lang.Integer#valueOf()

@mikemccand (thank you for the review!) - how should I interpret the other tasks which show a significant change? Are they just noisy?

                            Task    QPS baseline      StdDev    QPS my_modified_version      StdDev                Pct diff p-value
     BrowseRandomLabelTaxoFacets        3.75      (1.8%)        3.53      (1.6%)   -6.0% (  -9% -   -2%) 0.000
          OrHighMedDayTaxoFacets        1.35      (7.4%)        1.31      (9.2%)   -2.7% ( -17% -   15%) 0.308
                          IntNRQ       21.64      (7.0%)       21.35      (7.4%)   -1.3% ( -14% -   14%) 0.561
                      AndHighLow      366.49     (11.2%)      362.21     (10.3%)   -1.2% ( -20% -   22%) 0.731
                    OrHighNotLow      271.40      (5.3%)      269.03      (4.5%)   -0.9% ( -10% -    9%) 0.573
                         LowTerm      604.77      (5.9%)      599.96      (4.8%)   -0.8% ( -10% -   10%) 0.640
                      TermDTSort      140.65      (2.3%)      139.58      (1.4%)   -0.8% (  -4% -    3%) 0.210
                     LowSpanNear        5.00      (2.8%)        4.96      (4.1%)   -0.7% (  -7% -    6%) 0.522
                    HighSpanNear        4.77      (3.0%)        4.74      (3.6%)   -0.7% (  -7% -    6%) 0.522
                     MedSpanNear       11.24      (2.1%)       11.18      (2.5%)   -0.6% (  -5% -    4%) 0.432
                       MedPhrase      242.61      (2.2%)      241.23      (2.0%)   -0.6% (  -4% -    3%) 0.386
                      HighPhrase       83.17      (2.1%)       82.75      (2.9%)   -0.5% (  -5% -    4%) 0.538
                   OrHighNotHigh      160.48      (4.5%)      159.81      (3.5%)   -0.4% (  -8% -    7%) 0.744
           HighTermDayOfYearSort      215.60      (2.2%)      214.81      (2.0%)   -0.4% (  -4% -    3%) 0.576
                 MedSloppyPhrase       14.07      (2.0%)       14.03      (2.4%)   -0.3% (  -4% -    4%) 0.655
                       LowPhrase       21.15      (1.3%)       21.09      (1.5%)   -0.3% (  -3% -    2%) 0.508
        AndHighHighDayTaxoFacets       10.49      (1.2%)       10.46      (1.6%)   -0.3% (  -3% -    2%) 0.547
                HighSloppyPhrase       13.80      (3.0%)       13.77      (3.1%)   -0.3% (  -6% -    5%) 0.791
                         MedTerm      479.88      (5.1%)      478.82      (4.8%)   -0.2% (  -9% -   10%) 0.887
                    OrHighNotMed      329.08      (4.5%)      328.39      (3.5%)   -0.2% (  -7% -    8%) 0.870
                        HighTerm      264.78      (5.3%)      264.27      (5.2%)   -0.2% ( -10% -   10%) 0.908
               HighTermMonthSort     1930.74      (4.4%)     1928.03      (5.2%)   -0.1% (  -9% -    9%) 0.926
                    OrNotHighMed      217.72      (2.9%)      217.51      (2.2%)   -0.1% (  -5% -    5%) 0.905
            MedTermDayTaxoFacets       16.72      (2.1%)       16.71      (1.7%)   -0.1% (  -3% -    3%) 0.892
       BrowseDayOfYearSSDVFacets        4.12      (2.7%)        4.11      (2.9%)   -0.1% (  -5% -    5%) 0.931
            BrowseDateTaxoFacets        4.68      (5.1%)        4.67      (4.6%)   -0.1% (  -9% -   10%) 0.970
                   OrNotHighHigh      231.09      (4.5%)      230.99      (3.5%)   -0.0% (  -7% -    8%) 0.975
         AndHighMedDayTaxoFacets       16.88      (1.1%)       16.88      (1.5%)   -0.0% (  -2% -    2%) 0.963
       BrowseDayOfYearTaxoFacets        4.76      (5.2%)        4.76      (4.6%)    0.0% (  -9% -   10%) 1.000
                    OrNotHighLow      464.54      (2.6%)      464.56      (2.3%)    0.0% (  -4% -    5%) 0.995
            HighIntervalsOrdered        1.81      (4.6%)        1.81      (5.0%)    0.0% (  -9% -   10%) 0.990
            HighTermTitleBDVSort        5.39      (4.8%)        5.40      (4.4%)    0.1% (  -8% -    9%) 0.968
           BrowseMonthSSDVFacets        4.40      (2.6%)        4.40      (2.6%)    0.1% (  -4% -    5%) 0.873
             MedIntervalsOrdered        1.84      (5.5%)        1.84      (5.8%)    0.2% ( -10% -   12%) 0.918
             LowIntervalsOrdered       32.12      (5.4%)       32.18      (5.6%)    0.2% ( -10% -   11%) 0.913
                       OrHighMed       67.77      (3.1%)       67.97      (3.4%)    0.3% (  -5% -    6%) 0.779
     BrowseRandomLabelSSDVFacets        2.89      (2.0%)        2.90      (1.4%)    0.3% (  -3% -    3%) 0.569
           BrowseMonthTaxoFacets        9.36     (10.9%)        9.40     (10.4%)    0.4% ( -18% -   24%) 0.896
               HighTermTitleSort      132.89      (1.9%)      133.56      (3.9%)    0.5% (  -5% -    6%) 0.600
                      OrHighHigh       20.24      (3.5%)       20.37      (3.9%)    0.6% (  -6% -    8%) 0.608
                      AndHighMed       81.65      (8.6%)       82.65      (9.8%)    1.2% ( -15% -   21%) 0.676
                 LowSloppyPhrase        4.92      (5.9%)        5.01      (6.4%)    1.6% ( -10% -   14%) 0.397
            BrowseDateSSDVFacets        1.20     (11.5%)        1.22      (9.1%)    2.1% ( -16% -   25%) 0.529
                         Prefix3      138.46      (4.9%)      141.54      (4.5%)    2.2% (  -6% -   12%) 0.138
                       OrHighLow      167.60      (7.5%)      171.65      (4.2%)    2.4% (  -8% -   15%) 0.211
                        PKLookup      169.39      (4.5%)      174.22      (4.5%)    2.9% (  -5% -   12%) 0.043
                     AndHighHigh       31.23      (9.5%)       32.15     (12.4%)    2.9% ( -17% -   27%) 0.399
                        Wildcard       66.79      (3.4%)       69.28      (3.6%)    3.7% (  -3% -   11%) 0.001
                         Respell       48.03      (2.0%)       50.35      (2.3%)    4.8% (   0% -    9%) 0.000
                          Fuzzy2       68.13      (1.3%)       71.67      (1.4%)    5.2% (   2% -    7%) 0.000
                          Fuzzy1       74.70      (1.5%)       79.47      (1.8%)    6.4% (   3% -    9%) 0.000

stefanvodita avatar Jan 13 '24 11:01 stefanvodita

I found a fun HeisenBug in one of the tests.

Oh the joys of floating point math.

For those who want to follow along, here are the exact numbers we are adding in the test in two orderings which produce different results:

Thank you for diving deep here and making such a simple reproduction.

how should I interpret the other tasks which show a significant change? Are they just noisy?

Good question -- it makes no sense that e.g. Respell/Fuzzy1/2 got faster with this change, though the benchy seems to think it is significant (p=0.000). I'm not sure what to make of it!

mikemccand avatar Jan 13 '24 13:01 mikemccand

The regression in the taxo task is explained in the profiler. Boxing is not cheap: 11.24% 10402M java.lang.Integer#valueOf()

Hmm this is sort of spooky -- should we aim to keep the specialization somehow (avoid the boxing)? Is there a middle ground where we can avoid the boxing but still remove much of / some of this duplicated code? Java is annoying sometimes :)

mikemccand avatar Jan 13 '24 13:01 mikemccand

What I've done is rely on boxing for genericity only when collecting results (getTop...), not while performing the aggregations themselves. Most of the taxonomy tasks are not showing a significant performance change. I wonder if the one that has slowed down spends more time collecting the aggregation values than calculating them.
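
To make the split concrete, here is a rough sketch of the idea with hypothetical names (it is not the code in the PR): the aggregation loop works on a primitive array, and values are boxed only for the entries that end up in the results.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not the TaxonomyFacets code from the PR.
class BoxAtTheEdgeSketch {
    // Hot path: aggregation stays on primitives, no boxing.
    static int[] aggregate(int[] ordinals, int ordinalCount) {
        int[] counts = new int[ordinalCount];
        for (int ord : ordinals) {
            counts[ord]++;
        }
        return counts;
    }

    // Result collection: generic over Number, boxes once per reported value.
    static List<Number> collectResults(int[] values) {
        List<Number> results = new ArrayList<>();
        for (int v : values) {
            if (v > 0) {
                results.add(Integer.valueOf(v)); // the boxing cost lands here
            }
        }
        return results;
    }

    public static void main(String[] args) {
        int[] counts = aggregate(new int[] {0, 2, 2}, 3);
        System.out.println(collectResults(counts)); // [1, 2]
    }
}
```

If a task spends most of its time in the second method, for example because it reports many values per query, the boxing shows up in the profile even though the aggregation loop is unchanged.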

stefanvodita avatar Jan 14 '24 06:01 stefanvodita

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

github-actions[bot] avatar Feb 02 '24 00:02 github-actions[bot]

Thank you all for reviewing! I confirmed that the performance impact was from result collection, not from the aggregations themselves, and I've managed to claw back the performance hit. Most of the improvement comes from the changes to getTopChildrenForPath, which no longer uses intermediary Numbers. I've also integrated the performance-related suggestions from @epotyom (thank you for those!). I'll address the rest of the comments too; I just wanted to get this out while it's fresh to see if you all have more feedback on the performance front.

python3 src/python/localrun.py -source wikimediumall

                            Task    QPS baseline      StdDev    QPS my_modified_version      StdDev                Pct diff p-value
            BrowseDateSSDVFacets        1.24      (6.6%)        1.21      (9.6%)   -2.5% ( -17% -   14%) 0.334
     BrowseRandomLabelTaxoFacets        3.76      (3.7%)        3.69      (3.5%)   -1.8% (  -8% -    5%) 0.120
                       MedPhrase       11.46      (2.8%)       11.30      (2.6%)   -1.3% (  -6% -    4%) 0.112
               HighTermMonthSort     2290.51      (4.4%)     2262.12      (4.2%)   -1.2% (  -9% -    7%) 0.360
                    OrHighNotMed      327.20      (3.3%)      323.36      (3.2%)   -1.2% (  -7% -    5%) 0.252
                    OrHighNotLow      318.99      (3.7%)      315.45      (4.2%)   -1.1% (  -8% -    7%) 0.377
                       LowPhrase        4.74      (3.1%)        4.69      (3.0%)   -1.0% (  -6% -    5%) 0.310
                   OrNotHighHigh      244.33      (3.1%)      242.52      (3.0%)   -0.7% (  -6% -    5%) 0.443
                   OrHighNotHigh      227.54      (2.9%)      225.86      (3.2%)   -0.7% (  -6% -    5%) 0.438
                    OrNotHighMed      333.78      (2.6%)      331.35      (2.8%)   -0.7% (  -5% -    4%) 0.391
                      HighPhrase       70.04      (3.2%)       69.53      (3.3%)   -0.7% (  -6% -    5%) 0.478
                     AndHighHigh       23.27      (7.9%)       23.11      (7.1%)   -0.7% ( -14% -   15%) 0.777
                        Wildcard       51.02      (4.3%)       50.71      (4.2%)   -0.6% (  -8% -    8%) 0.652
                     MedSpanNear       29.20      (3.0%)       29.05      (2.5%)   -0.5% (  -5% -    5%) 0.561
                        HighTerm      475.59      (4.1%)      473.22      (4.7%)   -0.5% (  -8% -    8%) 0.721
                        PKLookup      176.36      (3.0%)      175.50      (2.7%)   -0.5% (  -6% -    5%) 0.589
                    HighSpanNear       10.52      (2.7%)       10.47      (2.2%)   -0.4% (  -5% -    4%) 0.612
                         MedTerm      470.14      (4.4%)      468.33      (5.4%)   -0.4% (  -9% -    9%) 0.804
       BrowseDayOfYearSSDVFacets        4.08      (3.9%)        4.06      (4.2%)   -0.4% (  -8% -    8%) 0.775
                    OrNotHighLow      322.80      (2.9%)      321.71      (2.4%)   -0.3% (  -5% -    5%) 0.692
            HighIntervalsOrdered        3.60      (4.8%)        3.59      (4.8%)   -0.3% (  -9% -    9%) 0.868
                      AndHighMed       83.14      (3.5%)       82.93      (3.9%)   -0.2% (  -7% -    7%) 0.833
       BrowseDayOfYearTaxoFacets        4.69      (4.5%)        4.68      (4.4%)   -0.2% (  -8% -    9%) 0.902
            BrowseDateTaxoFacets        4.61      (4.5%)        4.60      (4.3%)   -0.1% (  -8% -    9%) 0.937
                         Respell       53.50      (2.2%)       53.46      (1.8%)   -0.1% (  -3% -    4%) 0.902
         AndHighMedDayTaxoFacets       43.57      (1.5%)       43.54      (1.6%)   -0.1% (  -3% -    3%) 0.891
                          Fuzzy1       66.17      (2.4%)       66.20      (2.0%)    0.0% (  -4% -    4%) 0.951
                      AndHighLow      525.57      (2.6%)      525.90      (4.2%)    0.1% (  -6% -    7%) 0.955
                       OrHighMed       76.00      (3.2%)       76.05      (3.9%)    0.1% (  -6% -    7%) 0.953
            HighTermTitleBDVSort        6.93      (7.3%)        6.94      (6.8%)    0.2% ( -13% -   15%) 0.943
             MedIntervalsOrdered        2.77      (3.6%)        2.78      (3.2%)    0.2% (  -6% -    7%) 0.883
                          Fuzzy2       43.83      (1.9%)       43.90      (1.7%)    0.2% (  -3% -    3%) 0.770
                     LowSpanNear        6.13      (2.1%)        6.14      (1.9%)    0.2% (  -3% -    4%) 0.785
                HighSloppyPhrase        5.52      (3.4%)        5.53      (3.7%)    0.2% (  -6% -    7%) 0.851
           BrowseMonthSSDVFacets        4.34      (5.1%)        4.35      (4.7%)    0.2% (  -9% -   10%) 0.891
                         Prefix3       68.56      (4.6%)       68.70      (6.0%)    0.2% (  -9% -   11%) 0.899
             LowIntervalsOrdered       18.33      (2.8%)       18.38      (2.5%)    0.3% (  -4% -    5%) 0.737
                 LowSloppyPhrase       20.67      (2.2%)       20.73      (1.9%)    0.3% (  -3% -    4%) 0.627
        AndHighHighDayTaxoFacets        7.57      (2.3%)        7.59      (2.5%)    0.3% (  -4% -    5%) 0.669
           HighTermDayOfYearSort      206.91      (2.9%)      207.68      (2.6%)    0.4% (  -5% -    6%) 0.670
               HighTermTitleSort      140.79      (1.6%)      141.32      (2.0%)    0.4% (  -3% -    3%) 0.508
                         LowTerm      438.67      (7.1%)      441.44      (7.9%)    0.6% ( -13% -   16%) 0.790
                 MedSloppyPhrase       21.78      (3.1%)       21.95      (3.4%)    0.8% (  -5% -    7%) 0.454
            MedTermDayTaxoFacets       21.51      (2.2%)       21.71      (1.6%)    0.9% (  -2% -    4%) 0.122
                      TermDTSort      118.13      (3.0%)      119.30      (3.4%)    1.0% (  -5% -    7%) 0.329
           BrowseMonthTaxoFacets        9.58      (8.6%)        9.68      (8.8%)    1.1% ( -14% -   20%) 0.691
     BrowseRandomLabelSSDVFacets        2.88      (2.3%)        2.91      (1.8%)    1.1% (  -2% -    5%) 0.093
                      OrHighHigh       33.81      (7.6%)       34.24      (8.4%)    1.3% ( -13% -   18%) 0.618
                       OrHighLow      319.44      (6.2%)      323.88      (3.9%)    1.4% (  -8% -   12%) 0.393
                          IntNRQ       27.52      (5.2%)       27.96      (5.9%)    1.6% (  -8% -   13%) 0.360
          OrHighMedDayTaxoFacets        2.83      (3.3%)        2.88      (5.2%)    1.6% (  -6% -   10%) 0.243

stefanvodita avatar Mar 09 '24 23:03 stefanvodita

@gsmiller - I know you may not have time to review, but I want to at least notify you, since this is a big change and you've been very involved in this area of the code.

stefanvodita avatar Mar 14 '24 13:03 stefanvodita

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

github-actions[bot] avatar Mar 29 '24 00:03 github-actions[bot]

Hi reviewers! This PR has become stale. Could anyone have a look at it? It has several nice improvements for taxonomy facets, with no API changes, and it sets us up to launch new features in a future release: multiple aggregations in one go and retrieving counts with aggregation facets.

stefanvodita avatar Mar 29 '24 07:03 stefanvodita

Thank you for reviewing @mikemccand! I had to rebase after #12966. I'll push tomorrow maybe if there are no objections.

stefanvodita avatar Apr 04 '24 19:04 stefanvodita

I did another benchmark run after the rebase just to make sure I haven't broken anything when integrating the split taxo arrays change. I see no significant changes.

python3 src/python/localrun.py -source wikimediumall

                            Task    QPS baseline      StdDev    QPS my_modified_version      StdDev                Pct diff p-value
           BrowseMonthTaxoFacets        8.68      (8.6%)        8.41      (8.6%)   -3.1% ( -18% -   15%) 0.257
                      OrHighHigh       24.38      (4.8%)       24.09      (4.9%)   -1.2% ( -10% -    8%) 0.424
                     AndHighHigh       26.10      (4.6%)       25.80      (2.2%)   -1.1% (  -7% -    5%) 0.315
                        HighTerm      254.91      (7.0%)      252.20      (5.9%)   -1.1% ( -13% -   12%) 0.604
           HighTermDayOfYearSort      307.54      (2.0%)      305.21      (2.1%)   -0.8% (  -4% -    3%) 0.249
                    OrNotHighLow      506.28      (2.2%)      502.52      (2.6%)   -0.7% (  -5% -    4%) 0.327
                         LowTerm      497.25      (6.3%)      493.71      (5.7%)   -0.7% ( -11% -   12%) 0.709
                       OrHighMed      102.21      (3.8%)      101.52      (4.2%)   -0.7% (  -8% -    7%) 0.589
                         MedTerm      505.87      (6.8%)      502.44      (5.9%)   -0.7% ( -12% -   12%) 0.737
                      TermDTSort      130.10      (2.4%)      129.27      (2.0%)   -0.6% (  -4% -    3%) 0.359
                    OrHighNotLow      420.65      (3.9%)      418.28      (3.8%)   -0.6% (  -7% -    7%) 0.644
                      AndHighMed       89.03      (2.4%)       88.53      (1.4%)   -0.6% (  -4% -    3%) 0.365
     BrowseRandomLabelTaxoFacets        3.72      (1.8%)        3.70      (1.4%)   -0.5% (  -3% -    2%) 0.303
            HighTermTitleBDVSort       10.39      (4.7%)       10.34      (4.4%)   -0.4% (  -9% -    9%) 0.775
                         Prefix3      131.17      (2.0%)      130.64      (3.3%)   -0.4% (  -5% -    5%) 0.645
               HighTermTitleSort      155.59      (2.2%)      155.00      (2.2%)   -0.4% (  -4% -    4%) 0.590
          OrHighMedDayTaxoFacets        4.50      (5.4%)        4.49      (5.5%)   -0.4% ( -10% -   11%) 0.825
         AndHighMedDayTaxoFacets       17.89      (1.9%)       17.85      (1.5%)   -0.3% (  -3% -    3%) 0.636
            BrowseDateTaxoFacets        4.57      (1.8%)        4.56      (1.5%)   -0.3% (  -3% -    3%) 0.639
                      AndHighLow      677.34      (2.6%)      675.67      (1.8%)   -0.2% (  -4% -    4%) 0.729
                    OrHighNotMed      349.74      (3.7%)      348.93      (2.8%)   -0.2% (  -6% -    6%) 0.823
                   OrHighNotHigh      321.44      (3.1%)      320.71      (3.0%)   -0.2% (  -6% -    6%) 0.815
                   OrNotHighHigh      229.84      (2.9%)      229.33      (2.7%)   -0.2% (  -5% -    5%) 0.805
       BrowseDayOfYearTaxoFacets        4.63      (1.7%)        4.62      (1.5%)   -0.2% (  -3% -    3%) 0.675
                       OrHighLow      377.28      (1.3%)      376.48      (1.3%)   -0.2% (  -2% -    2%) 0.601
                       MedPhrase      447.55      (2.2%)      446.61      (2.6%)   -0.2% (  -4% -    4%) 0.781
        AndHighHighDayTaxoFacets        2.48      (3.9%)        2.47      (2.7%)   -0.2% (  -6% -    6%) 0.882
                    HighSpanNear        2.84      (2.2%)        2.84      (2.0%)   -0.1% (  -4% -    4%) 0.835
                        Wildcard      294.36      (2.4%)      293.99      (2.8%)   -0.1% (  -5% -    5%) 0.879
                          Fuzzy2       61.91      (1.2%)       61.85      (1.3%)   -0.1% (  -2% -    2%) 0.814
                     LowSpanNear       36.58      (1.9%)       36.56      (1.8%)   -0.1% (  -3% -    3%) 0.923
                       LowPhrase       41.87      (1.2%)       41.85      (1.6%)   -0.0% (  -2% -    2%) 0.925
            MedTermDayTaxoFacets       23.10      (2.5%)       23.10      (2.5%)    0.0% (  -4% -    5%) 0.991
                          Fuzzy1       88.20      (0.9%)       88.23      (1.3%)    0.0% (  -2% -    2%) 0.935
                         Respell       46.76      (1.8%)       46.77      (1.8%)    0.0% (  -3% -    3%) 0.950
                    OrNotHighMed      325.18      (2.3%)      325.71      (2.0%)    0.2% (  -4% -    4%) 0.811
                     MedSpanNear        6.23      (4.0%)        6.24      (3.8%)    0.2% (  -7% -    8%) 0.846
                      HighPhrase       20.42      (1.9%)       20.47      (2.8%)    0.3% (  -4% -    5%) 0.737
            HighIntervalsOrdered        9.90      (4.4%)        9.94      (2.9%)    0.4% (  -6% -    8%) 0.763
             LowIntervalsOrdered       14.11      (4.2%)       14.17      (2.4%)    0.4% (  -5% -    7%) 0.698
           BrowseMonthSSDVFacets        4.15      (1.5%)        4.17      (2.1%)    0.4% (  -3% -    4%) 0.438
                        PKLookup      190.68      (1.8%)      191.62      (1.7%)    0.5% (  -2% -    4%) 0.381
             MedIntervalsOrdered        4.54      (4.3%)        4.57      (2.9%)    0.5% (  -6% -    8%) 0.649
                HighSloppyPhrase       14.51      (2.0%)       14.62      (2.1%)    0.7% (  -3% -    4%) 0.243
     BrowseRandomLabelSSDVFacets        2.83      (6.1%)        2.85      (5.7%)    0.8% ( -10% -   13%) 0.674
                 LowSloppyPhrase       13.09      (2.1%)       13.20      (2.4%)    0.8% (  -3% -    5%) 0.231
               HighTermMonthSort     2155.96      (3.5%)     2177.02      (3.6%)    1.0% (  -5% -    8%) 0.382
       BrowseDayOfYearSSDVFacets        4.00      (2.2%)        4.05      (2.1%)    1.2% (  -3% -    5%) 0.073
                 MedSloppyPhrase       12.84      (4.2%)       13.04      (4.7%)    1.6% (  -7% -   10%) 0.260
            BrowseDateSSDVFacets        1.17      (9.3%)        1.19      (7.0%)    1.9% ( -13% -   20%) 0.458
                          IntNRQ       21.04     (26.3%)       22.13     (25.7%)    5.2% ( -37% -   77%) 0.531

stefanvodita avatar Apr 05 '24 11:04 stefanvodita

I'm finding this difficult to port to 9x because of the way the classes have diverged and I'm not sure it's worthwhile, since a lot of the benefits here are for future development and to support API changes that would go in Lucene 10. I'll move the CHANGES entries and milestones to Lucene 10 unless anyone thinks it's worth backporting.

stefanvodita avatar Apr 05 '24 13:04 stefanvodita

Now that #12408 has been backported in https://github.com/apache/lucene/pull/13300, can we backport this to 9.x? Or was it already done in an unlinked PR?

Remembering to backport is proving challenging and error-prone (it always has been), not just in all of us consistently agreeing on the criteria for backport (we should always aim to backport unless it breaks non-experimental/internal public APIs?), but also in actually remembering to do it after a PR is merged to main. I wish GH provided some stronger mechanisms for us here ...

mikemccand avatar May 10 '24 16:05 mikemccand

I was just working on it today actually and finally got it in shape: #13358. Sorry it took so long!

stefanvodita avatar May 10 '24 22:05 stefanvodita

I was skeptical this would work out at first, but I think we have a successful backport in the end, so the changes will go out with 9.11.

stefanvodita avatar May 14 '24 10:05 stefanvodita