elasticsearch icon indicating copy to clipboard operation
elasticsearch copied to clipboard

Display error for text_expansion if the queried field does not have the right type

Open ioanatia opened this issue 2 years ago • 6 comments
trafficstars

Description

When text_expansion receives a field that does not have the rank_features type, it simply returns an empty list of results:

GET search-books/_search
{
   "query":{
      "text_expansion":{
         "wrong_inexistent_field":{
            "model_id":".elser_model_1",
            "model_text":"What wild animals live in Colorado?"
         }
      }
   }
}

response:

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 0,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  }
}

The text_expansion docs clearly state that the queried field must be of rank_features type, but this is not enforced at the API level.

Other Elasticsearch queries return an error message when the queried field is missing or it does not have a compatible type.

With rank_features fields and the ELSER pipeline that users can create in Kibana in the Enterprise Search plugin, it is very easy to miss exactly what field needs to be queried, because the rank_features fields are deeply nested under ml.inference. For example, this is a working query:

GET search-books/_search
{
   "query":{
      "text_expansion":{
         "ml.inference.description_expanded.predicted_value":{
            "model_id":".elser_model_1",
            "model_text":"What wild animals live in Colorado?"
         }
      }
   }
}

ioanatia avatar Aug 01 '23 12:08 ioanatia

Pinging @elastic/ml-core (Team:ML)

elasticsearchmachine avatar Aug 01 '23 13:08 elasticsearchmachine

It seems reasonable to go with the option of triggering an error. However, I think it might be a good idea to include support for ignore_unmapped similar to how it's implemented in other queries. This way, we can address situations where certain indices in the query might not have the specified field in their mapping?

jimczi avatar Aug 07 '23 08:08 jimczi

The Rank Feature query which can also be used on a rank_features field has better behaviour.

Querying on a field mapped to a different type returns an error. In this case the field is a keyword.

GET msmarco-passage-collection/_search
{
  "query": {
    "rank_feature": {
      "field": "keyword_field",
      "boost": 0.1
    }
  }
}

"failed to create query: [rank_feature] query only works on [rank_feature] fields and features of [rank_features] fields, not [keyword]"

Querying for a field that is not mapped has the same behaviour as text_expansion, it returns 0 hits and does not complain that the field is not mapped

GET msmarco-passage-collection/_search
{
  "query": {
    "rank_feature": {
      "field": "not_mapped",
      "boost": 0.1
    }
  }
}

# returns 0 hits

I think it might be a good idea to include support for ignore_unmapped

The only reference I find to ignore_unmapped is for nested queries. Query String has a lenient option (default false).

We should definitely fix text_expansion so that it errors if the field is mapped to the wrong type and look for precedents on handling unmapped fields in queries. The current behaviour is not useful.

davidkyle avatar Aug 10 '23 10:08 davidkyle

@ioanatia The text_expansion docs mentions that the queried field must be of a sparse vector or rank features field. So, we need to verify whether the queried field is either of them.

saikatsarkar056 avatar Feb 13 '24 22:02 saikatsarkar056

Correct @saikatsarkar056 this should work for both field types and error for others.

kderusso avatar Feb 14 '24 13:02 kderusso

Created a draft PR: https://github.com/elastic/elasticsearch/pull/105581

Things to do next:

  1. If the queried field does not exist, throw an exception
  2. Add tests
  3. Add a description to the PR
  4. Refactor the code
  5. Change the doc if required

saikatsarkar056 avatar Feb 15 '24 23:02 saikatsarkar056

The PR has been merged. We can close this issue since the PR has covered the required changes.

cc: @ioanatia, @kderusso

saikatsarkar056 avatar Feb 23 '24 17:02 saikatsarkar056

@saikatsarkar056 FYI - next time if you put "resolves " in the PR it will auto close the issue 🙂

kderusso avatar Feb 23 '24 18:02 kderusso

@saikatsarkar056 FYI - next time if you put "resolves " in the PR it will auto close the issue 🙂

Can we put "resolves " as comment in the PR? Can you please provide me with an example PR?

This is something new I learned about GitHub. 😃

saikatsarkar056 avatar Feb 23 '24 18:02 saikatsarkar056