neural-search icon indicating copy to clipboard operation
neural-search copied to clipboard

[FEATURE] Implement parallel execution of sub-queries for hybrid search

Open martin-gaievski opened this issue 2 years ago • 2 comments

Is your feature request related to a problem?

Currently individual queries of Hybrid query (aka sub-queries) are executed sequentially (Initial implementation that is done under https://github.com/opensearch-project/neural-search/issues/123). That may be sub-optimal as system may waste time on waiting.

Obvious approach for such execution is to run every sub-query execution using parallel thread and wait for results from all thread, this will be at shard level. Results then will be combined and sent to coordinator node all at once, as it's done today.

What solution would you like?

Sub-queries may be executed in parallel. Actual improvement must be verified by running benchmarks on sequential and parallel approaches.

martin-gaievski avatar Sep 04 '23 22:09 martin-gaievski

This is not going to be ready for 2.14 (cc: @VijayanB ) moving to 2.15

jmazanec15 avatar Apr 29 '24 15:04 jmazanec15

Hybrid Query Hot Spots

While closely following execution of Hybrid Query, we found out that in three places (listed below), independent sub-queries are being executed synchronously. We could introduce multi threading inside our abstraction to parallelize these execution without making changes to existing api to improve query latency.

Query re-write

Query re-write is performed as first step during search to improve the search quality. HybridQuery performs rewrite by individually calling re-write api on every query sequentially, and, this will be performed in a loop, till no-more rewrite is possible. However, query re-write doesn’t just re-writes existing query object. For example, Lucene knn, during re-write, performs approximate search and build TopDocs, whereas, other Query can just convert into internal representation of Query without performing actual lucene search.

Create Scorer by Weight for every Segments

Once Query re-write is successful, Weight is created from Query. As mentioned earlier, Weight is required to store state of the Query. For given Weight, to perform search, we need to create an instance of Scorer for every segments to iterate documents inside segments and score accordingly. HybridQueryWeight, internally , iterates weights from sub-queries sequentially, to build HybridQueryScorer, which provides an abstraction over list of QueryScorer from its sub-queries. The cost of creating Scorer instance can vary and it is depends on type of query. For example, TermWeight, before creating scorer, from list of unique terms that are created during index, it will first navigate to the term value that is specified in the query, and, then it provides posting list iterator to TermScorer. Similarly, KNNWeight before creating KNNScorer, it performs approximate/exact search for given segments, and, provides list of KnnQueryResult as iterator to KNNScorer. Hence, creating scorer from Weight can be expensive depends on number of segments, size of segment and type of Query.

Calculate Score by Scorer for every matched documents

Once HybridScorer is created, search calls collector manager to collect doc id. HybridCollectorManager, calls hybridscores method to get list of scores from every subquery by calling individual subquery scorer’s score api. Like above, here, hybridscores api calls score api sequentially from scorer’s method. The cost of this score method depends on scorer. For example, BM25Similarity scorer, calculates score when calling during score api, whereas, Lucene Knn or Native Knn just returns the score that was previously calculated during query re-write and scorer creation. However, converting hybridscore implementation from single thread to multi thread improves latency of some query where score calculation is expensive actually at the time of score api.

Tasks

  • [x] Add new thread pool for hybrid query executor
  • [x] Add parallelization to scorer
  • [x] Add parallelization to Query rewrite
  • [x] Add parallelization to calculate hybrid scores
  • [x] Merge into main
  • [x] Benchmarks - [ ] Compare performance with and without parallelization - [ ] Compare performance with and without concurrent segment search

VijayanB avatar May 08 '24 17:05 VijayanB

Closing this out since changes are merged

getsaurabh02 avatar Jun 24 '24 20:06 getsaurabh02