OpenSearch
[BUG] default_search analyzer in index settings overrides the analyzer defined in mapping
Describe the bug
When an index defines a default_search analyzer in its settings and a field defines an analyzer in its mapping, the mapping analyzer is used at index time but the default_search analyzer is used at search time, so the search results are not as expected.
To Reproduce
- Create an index
PUT test
{
  "settings": {
    "analysis": {
      "filter": {
        "autocomplete_filter": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 20
        }
      },
      "analyzer": {
        "default_search": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "autocomplete_filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "text": {
        "type": "text",
        "analyzer": "whitespace"
      }
    }
  }
}
- Index a doc
POST test/_doc/1?refresh
{
  "text": "a-11"
}
- Search the index
POST test/_search
{
  "query": {
    "match": {
      "text": "a-11"
    }
  }
}
The search returns no hits.
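To illustrate why the query misses, here is a rough Python sketch of the two analysis chains. It is not OpenSearch code: the helper names are hypothetical, and the standard tokenizer is only approximated by splitting on non-alphanumeric characters, which I assume is close enough for the input "a-11".

```python
import re


def whitespace_analyzer(text):
    # Approximation of the built-in whitespace analyzer: split on whitespace only.
    return text.split()


def edge_ngrams(token, min_gram=1, max_gram=20):
    # Prefixes of the token, from min_gram up to max_gram characters long.
    return [token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)]


def default_search_analyzer(text):
    # Rough stand-in for: standard tokenizer -> lowercase -> edge_ngram(1, 20).
    # Assumption: splitting on non-alphanumeric characters mimics the
    # standard tokenizer well enough for this one input.
    tokens = [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]
    return [gram for t in tokens for gram in edge_ngrams(t)]


indexed_terms = set(whitespace_analyzer("a-11"))    # terms stored at index time
query_terms = set(default_search_analyzer("a-11"))  # terms produced at search time

print(sorted(indexed_terms))        # ['a-11']
print(sorted(query_terms))          # ['1', '11', 'a']
print(indexed_terms & query_terms)  # set()
```

Under these assumptions the index holds the single term a-11 while the query looks for a, 1 and 11, so no term overlaps and the match query finds nothing.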
Expected behavior
The analyzer defined in the mapping takes precedence over the default_search analyzer in settings.
Host/Environment (please complete the following information):
- Version: 2.9
@gaobinlong this is expected behaviour: the analyzer parameter is the indexing analyzer, so search_analyzer should be set in the mappings as well:
"mappings": {
  "properties": {
    "text": {
      "type": "text",
      "analyzer": "whitespace",
      "search_analyzer": "whitespace"
    }
  }
}
I think that if no search_analyzer is specified, analyzer is used at both indexing time and search time. In the case above, if no default_search is defined in settings, it works well:
PUT test
{
  "mappings": {
    "properties": {
      "text": {
        "type": "text",
        "analyzer": "whitespace"
      }
    }
  }
}
POST test/_doc/1
{
  "text": "a-11"
}
POST test/_search
{
  "query": {
    "match": {
      "text": "a-11"
    }
  }
}
POST _analyze
{
  "text": "a-11",
  "analyzer": "whitespace"
}
And if we change default_search to default in settings, it also works well: only the whitespace analyzer is used at indexing time and search time:
PUT test
{
  "settings": {
    "analysis": {
      "filter": {
        "autocomplete_filter": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 20
        }
      },
      "analyzer": {
        "default": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "autocomplete_filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "text": {
        "type": "text",
        "analyzer": "whitespace"
      }
    }
  }
}
@gaobinlong shamelessly quoting the Elasticsearch docs [1] (which we have inherited). At search time, Elasticsearch determines which analyzer to use by checking the following parameters in order:
- The analyzer parameter in the search query. See Specify the search analyzer for a query.
- The search_analyzer mapping parameter for the field. See Specify the search analyzer for a field.
- The analysis.analyzer.default_search index setting. See Specify the default search analyzer for an index.
- The analyzer mapping parameter for the field. See Specify the analyzer for a field.
[1] https://www.elastic.co/guide/en/elasticsearch/reference/current/specify-analyzer.html
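The lookup order quoted above can be sketched as a small first-match resolver. This is only an illustration of the documented order, not the actual OpenSearch implementation; the function name, its parameters and the label strings are hypothetical.

```python
def resolve_search_analyzer(query_analyzer=None,
                            field_search_analyzer=None,
                            index_default_search=None,
                            field_analyzer=None):
    """Return (source, analyzer_name) for the first candidate that is set."""
    # Candidates are listed in the order given by the quoted documentation.
    candidates = [
        ("analyzer parameter in the search query", query_analyzer),
        ("search_analyzer mapping parameter", field_search_analyzer),
        ("analysis.analyzer.default_search index setting", index_default_search),
        ("analyzer mapping parameter", field_analyzer),
    ]
    for source, name in candidates:
        if name is not None:
            return source, name
    # The quoted list does not cover the case where nothing is set.
    return None


# The index from this issue: default_search in settings, analyzer in the mapping.
reported = resolve_search_analyzer(index_default_search="default_search",
                                   field_analyzer="whitespace")
print(reported)
# ('analysis.analyzer.default_search index setting', 'default_search')

# The same index with search_analyzer added to the mapping, as suggested above.
suggested = resolve_search_analyzer(field_search_analyzer="whitespace",
                                    index_default_search="default_search",
                                    field_analyzer="whitespace")
print(suggested)
# ('search_analyzer mapping parameter', 'whitespace')
```

Read this way, the mapping's analyzer is only consulted last, which matches the behaviour reported in this issue.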
Yeah -- while (perhaps) confusing, it's longstanding behavior that search analyzers will take precedence over the index-time analyzers.
Maybe we could define a new field-level analyzer parameter (field_analyzer?) that implicitly sets both the index and search analyzer for the field, such that it would override the index-wide default search analyzer.
In my understanding, the analyzer defined in the field mapping is already field-level. I don't know why the default_search analyzer overrides the implicit search analyzer defined in the mapping. I also see this in the ES documentation: "Unless overridden with the search_analyzer mapping parameter, this analyzer is used for both index and search analysis."
Closing this issue as we didn't reach consensus; I will open a new one if users complain about it.