Suggester queries are limited to shard-level metrics

Open kiranprakash154 opened this issue 3 years ago • 0 comments

Is your feature request related to a problem? Please describe. Suggester search queries use only shard-level metrics, not index-level metrics. The tf and df statistics are computed per shard, and no aggregation is done at the index level before the response is constructed.

Context: By default, a term suggester query (see the documentation, scroll down to the term suggester) should return suggestions only if the term is not found in the index.

In simple terms: if the user types a term that is not found in the index, it is probably a typo, so OpenSearch suggests the nearest words that are found in the index.
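For reference, a minimal sketch of what such a request body looks like, built as a Python dict. The field name `title` and the suggestion label `spellcheck` are placeholders chosen for illustration; `suggest_mode` is spelled out only to make the default (`missing`) visible.

```python
import json


def term_suggest_body(text, field="title", suggest_mode="missing"):
    """Build a term suggester body for the _search endpoint.

    "title" and "spellcheck" are hypothetical names, not taken from the issue.
    """
    return {
        "suggest": {
            "spellcheck": {
                "text": text,
                # "missing" asks for suggestions only when the term is not found.
                "term": {"field": field, "suggest_mode": suggest_mode},
            }
        }
    }


body = term_suggest_body("league")
print(json.dumps(body, indent=2))
```

The dict can be serialized and sent to `<index>/_search` with any HTTP client.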

Suggester queries are forced to QTF (Query Then Fetch) even when explicitly run with DFS_QTF. Code link: (https://github.com/opensearch-project/OpenSearch/blob/55eb86df0dc11dbfd11452fb837f89ab96b537be/server/src/main/java/org/opensearch/action/search/TransportSearchAction.java#L967-L976)
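The effect can be pictured with a toy model. This is a paraphrase of the behaviour described above, not the code in `TransportSearchAction`: the function name and the `has_suggester` flag are hypothetical, and only the two `search_type` values are real request parameters.

```python
QUERY_THEN_FETCH = "query_then_fetch"
DFS_QUERY_THEN_FETCH = "dfs_query_then_fetch"


def effective_search_type(requested, has_suggester):
    """Return the search type that ends up running, per the behaviour described."""
    if has_suggester and requested == DFS_QUERY_THEN_FETCH:
        # The DFS phase, which would gather index-wide term statistics
        # before querying, is dropped for suggester queries.
        return QUERY_THEN_FETCH
    return requested


print(effective_search_type(DFS_QUERY_THEN_FETCH, has_suggester=True))
```

In this model, passing `search_type=dfs_query_then_fetch` on a suggester request makes no difference to the result.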

More details on QTF vs DFS_QTF

There is an issue filed with ES about this (https://github.com/elastic/elasticsearch/issues/23838), but I did not find an explanation of why DFS_QTF is disabled.

I could not find a relevant PR in the commit history that explains the reasoning behind limiting such queries to the shard level.

Describe the solution you'd like Run the term suggester query against index-level metrics for a better UX. If not, we need to at least clearly document this behaviour.

Additional context Steps to reproduce the issue:

  1. Create an index with multiple shards (set number_of_shards explicitly; OS creates an index with 1 shard by default)
  2. Index a document - “league”
  3. Index another document - “leave”
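  4. Make sure the two documents are routed to different shards (for example, with custom routing values at index time)
  5. Run the most basic term suggester query from the documentation, searching for “league”
  6. The response contains the suggestion “leave”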

No suggestions should have been returned, since the word “league” is present in the document corpus.
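The reproduction can be simulated without a cluster. The sketch below is a simplified model, not OpenSearch's implementation: each shard is a plain document-frequency table, candidates are matched with ordinary Levenshtein distance and a limit of 2 edits, and all function names are made up for illustration. It contrasts deciding "is the term missing?" per shard with deciding it on merged index-level frequencies.

```python
from collections import Counter


def edit_distance(a, b):
    """Plain Levenshtein distance (a simplification of the real string distance)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def suggest_if_missing(term, doc_freq, max_edits=2):
    """Suggest nearby terms only when `term` has no documents in `doc_freq`."""
    if doc_freq.get(term, 0) > 0:
        return []
    return sorted(t for t in doc_freq if edit_distance(term, t) <= max_edits)


# Two shards, one document each, as in the steps above.
shards = [Counter({"league": 1}), Counter({"leave": 1})]

# Shard-level decision: each shard checks only its own df, then results are merged.
shard_level = sorted({s for shard in shards for s in suggest_if_missing("league", shard)})

# Index-level decision: df is aggregated across shards before checking.
index_df = sum(shards, Counter())
index_level = suggest_if_missing("league", index_df)

print(shard_level)  # ['leave']
print(index_level)  # []
```

The shard that holds only “leave” sees a document frequency of 0 for “league” and therefore treats it as a typo, which is where the unwanted suggestion comes from.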

kiranprakash154, Sep 15 '22 19:09