[META] Improve Hybrid query latency
Hybrid query has high latency compared to other compound queries such as the Boolean query. Based on results collected for 2.13, and depending on the dataset and the exact query, it may be up to 10 times slower than Bool. Another motivation for this issue is the degradation in hybrid query performance compared to the initial release, e.g. in OpenSearch 2.11.
The goals for this work are:
- bring the performance of hybrid query to a level comparable with the Bool query:
  - for small datasets and sub-sets, it should match Bool with a deviation within 20% for p90
  - for large datasets (10M+ documents) where sub-queries return a large sub-set of documents (1M+ documents in a sub-query result), hybrid query should take no more than 2x the latency of the Bool query
  - multiple sub-queries may add overhead of no more than 20% of overall query time for p90
- reach the level of performance of the hybrid query released in 2.11
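To make the latency targets above concrete, here is a minimal sketch of how they could be checked against benchmark samples. It assumes "deviation within 20%" means the hybrid p90 is at most 1.2x the Bool p90, and it uses a nearest-rank percentile; the function names `p90` and `meets_latency_goal` are hypothetical and not part of any benchmark tooling.

```python
def p90(latencies_ms):
    """Nearest-rank 90th percentile of a non-empty list of latencies."""
    ordered = sorted(latencies_ms)
    # 1-based nearest rank, i.e. ceil(0.9 * n), computed with integers
    rank = (9 * len(ordered) + 9) // 10
    return ordered[rank - 1]


def meets_latency_goal(hybrid_ms, bool_ms, large_dataset):
    """Compare hybrid p90 latency against the Bool baseline.

    Assumed thresholds, taken from the goals in this issue:
    - small datasets: hybrid p90 <= 1.2x Bool p90
    - large datasets with large sub-query results: hybrid p90 <= 2x Bool p90
    Returns (goal_met, ratio).
    """
    limit = 2.0 if large_dataset else 1.2
    ratio = p90(hybrid_ms) / p90(bool_ms)
    return ratio <= limit, ratio
```

The same comparison could be repeated per sub-query count to check the 20% overhead goal for multiple sub-queries.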
There have been related GitHub issues in the past, e.g. https://github.com/opensearch-project/neural-search/issues/281. In addition, based on analysis of the source code and some profiling, I can think of the following items:
- don't execute the core TopDocsCollector, as it consumes compute and its results are ignored
- optimize plugin code for better performance: check for sub-optimal initializations, loops, type conversions, etc.
- when some sub-queries are rewritten to the same Lucene form, execute only one query and copy the scores
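The last item above (execute identical rewritten sub-queries only once) can be sketched roughly as follows. This is an illustration of the idea only, not the plugin's implementation: `rewrite` and `execute` are hypothetical callables standing in for the query rewrite and search steps, and the rewritten form is assumed to be hashable.

```python
def run_sub_queries(sub_queries, rewrite, execute):
    """Run each distinct rewritten sub-query once and reuse its scores.

    sub_queries: list of query objects of any type.
    rewrite:     hypothetical callable returning a hashable rewritten form.
    execute:     hypothetical callable taking the rewritten form and
                 returning a {doc_id: score} mapping.
    Returns one score map per original sub-query, in the original order.
    """
    cache = {}
    results = []
    for query in sub_queries:
        key = rewrite(query)
        if key not in cache:
            # First time this rewritten form is seen: actually run it.
            cache[key] = execute(key)
        # Copy the scores so duplicates do not share one mutable map.
        results.append(dict(cache[key]))
    return results
```

Each sub-query still gets its own result slot, so downstream score normalization and combination would see the same number of sub-query results as before.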
GitHub issues for each child item:
- https://github.com/opensearch-project/neural-search/issues/279 parallel execution of sub-queries
- https://github.com/opensearch-project/neural-search/issues/705 replace streaming API calls
- https://github.com/opensearch-project/OpenSearch/issues/13170 allow empty query collector context (skip TopDocsCollector)
- https://github.com/opensearch-project/neural-search/issues/729 enable empty query collector context for hybrid query scenario
- https://github.com/opensearch-project/neural-search/issues/745 optimize the way we iterate over results and collect scores of sub queries