lucene icon indicating copy to clipboard operation
lucene copied to clipboard

Support getMaxScore of DisjunctionSumScorer for non top level scoring clause

Open mrkm4ntr opened this issue 6 months ago • 6 comments

Description

Same as https://github.com/apache/lucene/pull/13043. WANDScorer BlockMaxConjunctionScorer doesn't work for disjunctions within disjunctions conjunctions.

mrkm4ntr avatar Feb 02 '24 15:02 mrkm4ntr

@jpountz Could you take a look?

mrkm4ntr avatar Feb 14 '24 07:02 mrkm4ntr

@jpountz Thank you for your review.

I see that we have a rewrite rule that inlines nested disjunctions, so presumably this is more about disjunctions within conjunctions than disjunctions within disjunctions?

Ah, yes. That's true.

We have tasks in nightly benchmarks that should be impacted by this change, e.g. http://people.apache.org/~mikemccand/lucenebench/AndHighOrMedMed.html, I'm curious if you tried to run them?

I'm not sure how to run these benchmarks. Is there any good instruction?

mrkm4ntr avatar Feb 16 '24 06:02 mrkm4ntr

This? https://github.com/mikemccand/luceneutil

mrkm4ntr avatar Feb 16 '24 06:02 mrkm4ntr

This indeed! It's a bit manual, but it's what powers nightly benchmarks and what we run to check the performance impact of the changes that we merge.

jpountz avatar Feb 20 '24 16:02 jpountz

@jpountz Thanks. The result is here.

                            TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
                AndMedOrHighHigh       21.24      (7.1%)       20.61      (3.5%)   -3.0% ( -12% -    8%) 0.091
                 AndHighOrMedMed       30.97      (4.0%)       30.51      (3.5%)   -1.5% (  -8% -    6%) 0.216

I run this with wikimediumall and verifyCount=false. The hit counts were smaller than baseline. This means we can skip more items with this change. Is it right?

mrkm4ntr avatar Feb 28 '24 01:02 mrkm4ntr

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

github-actions[bot] avatar Mar 16 '24 00:03 github-actions[bot]

This means we can skip more items with this change.

Indeed!

jpountz avatar Mar 19 '24 16:03 jpountz

@jpountz Thank you for your review! I fixed the issue.

mrkm4ntr avatar Mar 20 '24 02:03 mrkm4ntr