tantivy icon indicating copy to clipboard operation
tantivy copied to clipboard

[feat] Implement `DisjunctionMaxQuery` and refactor `ScoreCombiner`

Open ppodolsky opened this issue 3 years ago • 0 comments

Here I reproduced ES DisMax Query

Not sure if my approach is well aligned with general design, so any suggestions to redesign it are welcomed. I've decided to reuse BooleanQuery and built DisjuctionMaxQuery on it. For the purpose I had refactored ScoreCombiner part. Now ScoreCombiner creation is done through passed lambda but not through static dispatching. It allowed to configure ScoreCombiner properties. In particular, DisjunctionMaxCombiners with various tie_breaker parameters are possible now.

ppodolsky avatar Jul 28 '22 18:07 ppodolsky

@PSeitz Are there any interest in such kind of queries?

ppodolsky avatar Aug 13 '22 16:08 ppodolsky

@ppodolsky sorry I missed that review. Yes support for such queries is definitely interesting.

Could you check the performance impact on bench_union_3_low for removing the static dispatch?

PSeitz avatar Aug 15 '22 13:08 PSeitz

@PSeitz Checked, there is no any confident difference between two branches (cut few lines below to make output clearer):

pasha@home:~/tantivy$ git checkout upstream/main
HEAD is now at 8e773ade7 Merge pull request #1444 from quickwit-oss/add-async-doc-freq
pasha@home:~/tantivy$ cargo +nightly bench -F=unstable --color=always --package tantivy --lib query::union::bench::bench_union_3_high -- --exact
   Compiling tantivy v0.18.0 (/home/pasha/tantivy)
    Finished bench [optimized] target(s) in 57.24s
     Running unittests src/lib.rs (target/release/deps/tantivy-80ba2a9ba3804a7a)
running 1 test
test query::union::bench::bench_union_3_high                                                                             ... bench:     281,312 ns/iter (+/- 9,280)
pasha@home:~/tantivy$ git checkout origin/feature/dismax
Previous HEAD position was 8e773ade7 Merge pull request #1444 from quickwit-oss/add-async-doc-freq
HEAD is now at 71041b231 [fix] Fix bench
pasha@home:~/tantivy$ cargo +nightly bench -F=unstable --color=always --package tantivy --lib query::union::bench::bench_union_3_high -- --exact
   Compiling tantivy v0.18.0 (/home/pasha/tantivy)
    Finished bench [optimized] target(s) in 50.65s
     Running unittests src/lib.rs (target/release/deps/tantivy-21841e494d11c590)
running 1 test
test query::union::bench::bench_union_3_high                                                                             ... bench:     280,469 ns/iter (+/- 22,894)

ppodolsky avatar Aug 15 '22 13:08 ppodolsky

@PSeitz Is there anything else I can do in this PR?

ppodolsky avatar Aug 19 '22 08:08 ppodolsky

Codecov Report

Merging #1428 (f507008) into main (da0f78e) will decrease coverage by 0.16%. The diff coverage is 57.24%.

@@            Coverage Diff             @@
##             main    #1428      +/-   ##
==========================================
- Coverage   94.23%   94.07%   -0.17%     
==========================================
  Files         236      238       +2     
  Lines       43890    44533     +643     
==========================================
+ Hits        41359    41893     +534     
- Misses       2531     2640     +109     
Impacted Files Coverage Δ
src/query/boolean_query/mod.rs 100.00% <ø> (ø)
src/query/disjunction_max_query.rs 0.00% <0.00%> (ø)
src/query/mod.rs 100.00% <ø> (ø)
src/query/score_combiner.rs 58.33% <0.00%> (-41.67%) :arrow_down:
src/query/union.rs 98.82% <94.87%> (+0.03%) :arrow_up:
src/query/boolean_query/block_wand.rs 96.84% <100.00%> (ø)
src/query/boolean_query/boolean_query.rs 92.76% <100.00%> (+0.19%) :arrow_up:
src/query/boolean_query/boolean_weight.rs 95.05% <100.00%> (+0.47%) :arrow_up:
src/core/searcher.rs 77.53% <0.00%> (-4.78%) :arrow_down:
src/core/inverted_index_reader.rs 72.86% <0.00%> (-4.19%) :arrow_down:
... and 19 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

codecov-commenter avatar Aug 22 '22 13:08 codecov-commenter