elasticsearch
[ES|QL] - Using a wildcard "*" on a non-existent index together with an existing index does not return an error from ES
Elasticsearch Version
8.13
Installed Plugins
No response
Java Version
bundled
OS Version
MacOS - latest
Problem Description
When I use FROM with an index that does not exist together with a wildcard: "*" and an index that does exist, then Elasticsearch does not come back with errors, but Kibana does.
Example:
from kibana_sample_data_logs, noneexistingindex* | limit 10
Adding a recording to showcase the error.
Steps to Reproduce
See the recording to reproduce.
Basically, run this query:
from kibana_sample_data_logs, noneexistingindex* | limit 10
https://github.com/elastic/elasticsearch/assets/108192783/d818d2f3-0c9c-4da8-931a-0268f3f053a2
Logs (if relevant)
No response
Pinging @elastic/es-analytical-engine (Team:Analytics)
The ask here, as I understand it, is for ES|QL to fail the entire query if an index wildcard matches no indices. That has the potential to break existing queries, so I've labeled this as a breaking change issue.
The behavior can be customized by allowing the user to specify the preferred indices options. (Thanks to @astefan for pointing this out.) We should look into exposing this on the request side and/or on the command itself.
Documenting here all the current behaviours:
- from kibana_sample_data_ecommerce, non-existent-index => :x: index_not_found_exception - no such index [non-existent-index]
- from kibana_sample_data_ecommerce, non-existent-index* => ✅ data coming back
- from non-existent-index*, kibana_sample_data_ecommerce => ✅ data coming back
- from non-existent-index => :x: Unknown index [non-existent-index]
- from non-existent-index* => :x: Unknown index [non-existent-index*]
- from non-existent-index*, other-non-existed-index, kibana_sample_data_ecommerce => :x: index_not_found_exception - no such index [other-non-existed-index]
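To make the inconsistency easier to see, here is a rough Python sketch that reproduces the six observations above. It is a toy model inferred only from those cases, not how Elasticsearch actually resolves indices; the function name `resolve` and the use of `LookupError` are assumptions for illustration.

```python
from fnmatch import fnmatch


def resolve(patterns, existing):
    """Toy model of the six observed cases; NOT Elasticsearch's real logic."""
    resolved = []
    for pattern in patterns:
        if "*" in pattern:
            # A wildcard that matches nothing is silently skipped.
            resolved.extend(n for n in sorted(existing) if fnmatch(n, pattern))
        elif pattern in existing:
            resolved.append(pattern)
        elif len(patterns) > 1:
            # A missing concrete name fails hard when listed with other names.
            raise LookupError(
                f"index_not_found_exception - no such index [{pattern}]"
            )
    if not resolved:
        # Nothing resolved at all: a different error message is reported.
        raise LookupError(f"Unknown index [{','.join(patterns)}]")
    return resolved


existing = {"kibana_sample_data_ecommerce"}
print(resolve(["kibana_sample_data_ecommerce", "non-existent-index*"], existing))
```

The two error paths (one for a missing concrete name next to other names, one for an empty result) are what produce the two different messages in the list.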
I do not mind whether the query fails or silently ignores the missing index as long as one index is available. The problem I see here is a lack of consistency.
One way would be to expose the indices option (a setting mode within FROM?) or align the behaviour, or maybe there's another idea here. As long as we find a single rule for them it would be great.
I agree, consistency makes sense here. I would think the default alignment of throwing an error due to a non-existent index makes sense in any combination regardless of expansion, but if that default is both expensive and inconsistent with defaults outside of ES|QL I'm interested in cost of the alternative(s).
Linking here the work in progress: https://github.com/elastic/elasticsearch/pull/106636
For the record, this would make a query that names a non-existent index (without a wildcard) no longer generate an error. For example: FROM employees, nonexistent OPTIONS "ignore_unavailable" = "true" | limit 3 will succeed and return rows from employees.
For a query like FROM nonexistent1, nonexistent2 OPTIONS "ignore_unavailable" = "true" | limit 3 we will still return "unknown index error message". I've created https://github.com/elastic/elasticsearch/issues/106805 to investigate the option of returning an empty response in this case.
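As a rough illustration of the semantics described in the two comments above, the following Python sketch models only the effect of ignore_unavailable on concrete index names. The function name `resolve_lenient` and the exception type are hypothetical; this is not the code from the linked PR.

```python
def resolve_lenient(names, existing, ignore_unavailable=False):
    """Sketch of the described ignore_unavailable behavior (concrete names only)."""
    found = [n for n in names if n in existing]
    missing = [n for n in names if n not in existing]
    if missing and not ignore_unavailable:
        # Default: any missing concrete index fails the query.
        raise LookupError(f"no such index [{missing[0]}]")
    if not found:
        # Even with ignore_unavailable, an empty result is still an error.
        raise LookupError(f"Unknown index [{','.join(names)}]")
    return found


# Succeeds and reads only from employees.
print(resolve_lenient(["employees", "nonexistent"], {"employees"}, ignore_unavailable=True))
```

With both names missing, the sketch still raises the "Unknown index" error, which is the case the follow-up issue proposes to revisit.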
Closing as fixed with https://github.com/elastic/elasticsearch/pull/106636.
This adds options to a from query so that users choose the desired behavior. Also, I don't think this is Breaking anymore, since it doesn't change any defaults.
Reopening after the feature was reverted in https://github.com/elastic/elasticsearch/pull/108692.
Have we thought about making ignore_unavailable and allow_no_indices parameters of the API rather than a language feature? For example a query parameter POST _query?allow_no_indices=true or part of the request body?
This has several advantages:
- Kibana can just set these parameters for the Discover requests and we mitigate the issue from the description - the user does not have to set them as part of the ES|QL query
- Since these are not part of the ES|QL language itself, if we were to add another source command the behaviour should be consistent. We do not have to worry how these options fit with the syntax of a new source command, but rather a new source command has to respect these inherited options.
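To make the proposal concrete, here is a small Python sketch of how a client such as Kibana might build the request if these were URL parameters. The allow_no_indices and ignore_unavailable query parameters on _query are the proposal from this comment, not a confirmed API; the helper name `build_esql_request` and the localhost URL are assumptions. The request is only constructed, not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_esql_request(base_url, esql, **index_options):
    """Build (but do not send) a POST _query request.

    index_options become URL query parameters; these parameters are
    hypothetical and only illustrate the proposal in this thread.
    """
    params = urlencode(
        {key: str(value).lower() for key, value in sorted(index_options.items())}
    )
    url = f"{base_url}/_query" + (f"?{params}" if params else "")
    body = json.dumps({"query": esql}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


req = build_esql_request(
    "http://localhost:9200",
    "from kibana_sample_data_logs, noneexistingindex* | limit 10",
    allow_no_indices=True,
    ignore_unavailable=True,
)
print(req.full_url)
```

The ES|QL text itself stays unchanged; only the request carries the options, which is the point of the second bullet.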
@astefan @bpintea wdyt? since we removed the OPTIONS feature, how do you think we should approach this?
@ioanatia, OPTIONS was removed not b/c of language considerations, but because of the underlying functionality it relied on. So the question is not how we expose these ignore_unavailable and allow_no_indices toggles (URL param or language feature), but what to use instead, which isn't yet defined. Tracked here.