lucene icon indicating copy to clipboard operation
lucene copied to clipboard

GH#11922: Allow DisjunctionDISIApproximation to short-circuit

Open gsmiller opened this issue 3 years ago • 6 comments

Description

This PR explores allowing DisjunctionDISIApproximation to short-circuit after "finding" a doc, allowing sub-iterators to only be exhaustively advanced when necessary. This could be useful for non-scoring disjunction clauses.

Note: I've only changed the approximation implementation to start. DisjunctionScorer exhaustively advances all sub-iterators in this implementation if it needs to do two-phase match confirmation (which is what we do today). Ideally, we'd only advance as necessary in this step as well.

gsmiller avatar Nov 14 '22 17:11 gsmiller

Have you checked if this actually made things faster? We're saving calls to advance in some cases but also adding conditions to some very tight loops. E.g. we recently got speedups by moving some queries from WAND to MAXSCORE even though WAND is better at saving calls to advance compared to MAXSCORE.

jpountz avatar Nov 15 '22 14:11 jpountz

Have you checked if this actually made things faster? We're saving calls to advance in some cases but also adding conditions to some very tight loops.

I'd run some of our internal benchmarks with this change, and it did show a positive impact (reduced cost and latency). I did a quick run with luceneutil benchmark tasks, and it looked pretty flat. These were both kind of hastily run though, and I was guessing that the reason we saw an improvement internally is because we don't rely on scoring, so could take advantage (while I was guessing that the luceneutil tasks do scoring, but I didn't dig in yet).

It's a really good callout about the condition check in the tight loop. That hadn't occurred to me, so thank you for mentioning it. For the common case of requiring scores, this change seems like it would just cause a net increase to cost, so I don't love that.

As a next step, I'll dig into benchmarking a bit more. If it looks like there is a clear positive impact still to the non-scoring case, I'd like to see if there's a way to avoid the added condition-checking cost for the common scoring case (since all the sub-iterators will need to get advanced anyway).

Thanks for the early feedback @jpountz !

gsmiller avatar Nov 15 '22 19:11 gsmiller

@jpountz thanks for the implementation feedback! I've updated the PR, but still plan to do more benchmarking to really understand the benefit, etc. before looking to actually merge this. I'll follow up here once I've been able to do that.

gsmiller avatar Nov 18 '22 18:11 gsmiller

@jpountz I re-ran some internal benchmarking with this change to highlight the speedup in cases where scoring isn't needed (at least some specific use-cases I'm looking at). These use-cases all involve a "disjunction filter," meaning a disjunction of terms that is used as a required clause. So something like (+ (foo:bar foo:baz foo:zed) (...)), where the foo field must take on one of the specified values to be considered a candidate match. To provide a sense of scale, on average, these filters have 40 different terms in them. Since these "filters" don't participate in scoring at all, it's a good candidate for this short-circuiting.

In these benchmarks, I'm observing a 2.3% QPS improvement, and a 3.5% avg. latency reduction (5.9% p50 reduction / 3.5% p99 reduction). So the change appears to be helping this type of situation.

As for whether-or-not this change would actually hurt other common use-cases that require scoring or second-phase checks, I re-ran luceneutil benchmarks (wikimedium10m) task and don't observe any regressions there (results below). It's possible there's a gap in our benchmarks though, and maybe there are some common use-cases not covered?

                            TaskQPS baseline      StdDevQPS candidate      StdDev                Pct diff p-value
                 MedSloppyPhrase      115.00      (4.9%)      112.91      (5.1%)   -1.8% ( -11% -    8%) 0.249
               HighTermTitleSort      242.48      (3.2%)      238.66      (4.3%)   -1.6% (  -8% -    6%) 0.189
                HighSloppyPhrase       36.47      (3.8%)       36.12      (3.9%)   -0.9% (  -8% -    7%) 0.439
                         LowTerm     1766.82      (3.4%)     1752.12      (3.3%)   -0.8% (  -7% -    6%) 0.436
                      HighPhrase      263.74      (3.4%)      261.71      (2.3%)   -0.8% (  -6% -    5%) 0.404
                       OrHighLow      796.71      (2.7%)      790.71      (2.6%)   -0.8% (  -5% -    4%) 0.367
            BrowseDateSSDVFacets        3.46      (6.4%)        3.44      (7.1%)   -0.7% ( -13% -   13%) 0.755
               HighTermMonthSort     3070.86      (4.5%)     3051.39      (3.9%)   -0.6% (  -8% -    8%) 0.635
                         Prefix3      111.76      (4.6%)      111.08      (4.2%)   -0.6% (  -8% -    8%) 0.658
                   OrNotHighHigh     1249.27      (3.4%)     1242.30      (3.8%)   -0.6% (  -7% -    6%) 0.627
           BrowseMonthTaxoFacets       35.43      (1.6%)       35.23      (2.2%)   -0.6% (  -4% -    3%) 0.367
                 LowSloppyPhrase       61.49      (2.4%)       61.18      (2.5%)   -0.5% (  -5% -    4%) 0.512
                    OrHighNotMed     1139.05      (3.9%)     1133.29      (3.6%)   -0.5% (  -7% -    7%) 0.671
     BrowseRandomLabelTaxoFacets       20.33      (4.5%)       20.23      (5.6%)   -0.5% ( -10% -   10%) 0.760
                        HighTerm     1635.45      (4.2%)     1628.28      (3.9%)   -0.4% (  -8% -    8%) 0.735
                       MedPhrase       46.41      (2.3%)       46.22      (1.6%)   -0.4% (  -4% -    3%) 0.529
                       OrHighMed      193.55      (2.8%)      192.79      (2.9%)   -0.4% (  -5% -    5%) 0.663
                   OrHighNotHigh      865.20      (3.0%)      862.38      (3.9%)   -0.3% (  -6% -    6%) 0.766
                      AndHighLow     1566.83      (2.7%)     1562.47      (2.7%)   -0.3% (  -5% -    5%) 0.745
            MedTermDayTaxoFacets       48.00      (3.5%)       47.89      (3.6%)   -0.2% (  -7% -    7%) 0.836
           HighTermDayOfYearSort      812.55      (2.8%)      811.49      (2.5%)   -0.1% (  -5% -    5%) 0.878
                         MedTerm     2390.59      (3.5%)     2387.70      (3.5%)   -0.1% (  -6% -    7%) 0.912
            BrowseDateTaxoFacets       25.05      (9.2%)       25.03      (9.3%)   -0.1% ( -16% -   20%) 0.984
           BrowseMonthSSDVFacets       16.00     (18.9%)       15.99     (19.0%)   -0.1% ( -31% -   46%) 0.992
             LowIntervalsOrdered      276.77      (3.5%)      276.66      (3.6%)   -0.0% (  -6% -    7%) 0.972
                      OrHighHigh       26.41      (4.4%)       26.40      (4.4%)   -0.0% (  -8% -    9%) 0.978
                     MedSpanNear       35.38      (1.0%)       35.38      (1.1%)   -0.0% (  -2% -    2%) 0.969
                      TermDTSort      648.08      (1.0%)      648.35      (1.3%)    0.0% (  -2% -    2%) 0.909
       BrowseDayOfYearTaxoFacets       25.12      (9.5%)       25.13      (9.5%)    0.0% ( -17% -   21%) 0.988
                    OrNotHighLow     1192.74      (3.1%)     1193.69      (2.6%)    0.1% (  -5% -    5%) 0.929
            HighIntervalsOrdered        1.79      (3.3%)        1.79      (3.3%)    0.1% (  -6% -    6%) 0.934
                         Respell       46.93      (1.3%)       46.97      (1.3%)    0.1% (  -2% -    2%) 0.825
                     LowSpanNear       11.96      (0.8%)       11.97      (0.9%)    0.1% (  -1% -    1%) 0.702
     BrowseRandomLabelSSDVFacets       10.14      (9.0%)       10.16      (9.4%)    0.1% ( -16% -   20%) 0.961
                       LowPhrase      438.44      (1.6%)      439.23      (1.4%)    0.2% (  -2% -    3%) 0.701
        AndHighHighDayTaxoFacets       28.52      (1.9%)       28.57      (1.8%)    0.2% (  -3% -    3%) 0.746
             MedIntervalsOrdered       64.28      (2.3%)       64.42      (2.3%)    0.2% (  -4% -    4%) 0.777
                    OrHighNotLow     1854.91      (2.7%)     1858.96      (3.9%)    0.2% (  -6% -    6%) 0.836
                    HighSpanNear       12.65      (1.4%)       12.69      (1.7%)    0.3% (  -2% -    3%) 0.579
          OrHighMedDayTaxoFacets       14.34      (3.0%)       14.38      (4.3%)    0.3% (  -6% -    7%) 0.809
                          Fuzzy1       81.66      (0.9%)       81.90      (1.1%)    0.3% (  -1% -    2%) 0.357
       BrowseDayOfYearSSDVFacets       13.40     (12.7%)       13.44     (12.5%)    0.3% ( -22% -   29%) 0.942
         AndHighMedDayTaxoFacets       34.33      (2.1%)       34.45      (1.7%)    0.3% (  -3% -    4%) 0.569
                          Fuzzy2       76.36      (0.8%)       76.64      (0.9%)    0.4% (  -1% -    2%) 0.187
            HighTermTitleBDVSort       12.49      (6.6%)       12.55      (6.6%)    0.4% ( -11% -   14%) 0.846
                        PKLookup      179.34      (2.2%)      180.48      (2.3%)    0.6% (  -3% -    5%) 0.364
                      AndHighMed      268.67      (3.8%)      270.48      (3.9%)    0.7% (  -6% -    8%) 0.579
                    OrNotHighMed     1006.62      (4.1%)     1013.59      (4.3%)    0.7% (  -7% -    9%) 0.604
                        Wildcard      133.41      (2.7%)      134.34      (2.7%)    0.7% (  -4% -    6%) 0.417
                     AndHighHigh       74.34      (4.7%)       74.95      (5.2%)    0.8% (  -8% -   11%) 0.608
                          IntNRQ       82.56      (3.5%)       83.70      (4.9%)    1.4% (  -6% -   10%) 0.306

Given this, I'd prefer to move forward with the change (since it doesn't seem to be doing any harm to common cases covered in luceneutil and is helping in the filtering use-cases I'm benchmarking), but I'd like your opinion. Given how central of a component this is to scoring, I want to make sure I'm not being overly biased towards the use-case I'm focused on at the cost of regressing elsewhere. Thanks again for your inputs!

gsmiller avatar Nov 29 '22 20:11 gsmiller

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

github-actions[bot] avatar Jan 08 '24 12:01 github-actions[bot]

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

github-actions[bot] avatar Jan 24 '24 00:01 github-actions[bot]