kibana
[Metrics UI] Improve API error handling for ES queries
Currently if an API request fails this message is returned:

The above error occurred from /api/metrics/snapshot after a 503 ES request failed due to exceeding max_buckets.
This same error looks like this within Stack Monitoring, which shows the error returned by ES:
And like this in APM:
Should we improve our errors when an ES query fails so the user has more information? In the case above, for example, a more detailed error would let the user take meaningful action to resolve the issue on their own.
Pinging @elastic/logs-metrics-ui (Team:logs-metrics-ui)
Hey @neptunian - any idea how often this might happen/what might cause it?
Trying to get a sense of how much value fixing this would give (e.g. are 1% of users hitting this, or is it happening every day for most users?) and how simple a solution might be (i.e. can we provide a simple error message that explains what the user should do?).
I don't know how often it might happen. In this example, the user could set their max_buckets setting to something other than the default, which could cause max_buckets errors for queries like the ones we have in our UIs. In the SM case, if the user saw the max_buckets error they could connect the dots, but in inventory they might not understand because no details are given. We could probably do a quick improvement where we pass the ES error down to the toast and make it red instead of yellow. Looks like there are some linked issues wanting to address this holistically.
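As a rough illustration of the quick improvement suggested above, the sketch below pulls the reason out of a failed ES response and builds a red toast from it. It is only a sketch under assumptions: the `EsErrorBody` shape is a simplified guess at the ES error body, and `ErrorToast` and `buildErrorToast` are hypothetical names, not Kibana's actual toast API.

```typescript
// Assumed, simplified shape of an Elasticsearch error response body.
interface EsErrorCause {
  type?: string;
  reason?: string;
}

interface EsErrorBody {
  status?: number;
  error?: EsErrorCause & { root_cause?: EsErrorCause[] };
}

// Hypothetical toast description; not Kibana's real toast interface.
interface ErrorToast {
  title: string;
  text: string;
  color: "danger";
}

const FALLBACK_TEXT = "The request failed and no further details were returned.";

function buildErrorToast(endpoint: string, body: EsErrorBody): ErrorToast {
  // Prefer the first root cause, since the top-level error is often a generic wrapper.
  const cause = body.error?.root_cause?.[0] ?? body.error;

  let text = cause?.reason ?? FALLBACK_TEXT;
  if (cause?.type === "too_many_buckets_exception") {
    // Give the user something actionable for the case discussed in this issue.
    text += " Consider narrowing the query or reviewing the search.max_buckets cluster setting.";
  }

  return {
    title: `Error while fetching ${endpoint}`,
    text,
    color: "danger", // red rather than yellow, as suggested above
  };
}
```

Calling `buildErrorToast("/api/metrics/snapshot", responseBody)` with the parsed body of the failed request would give the UI enough detail to show the underlying ES reason instead of a generic message.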
Thanks for the context @neptunian - I tried to see if there is any telemetry for these messages, and there doesn't appear to be any explicit telemetry implemented for the error toasts.
I'll prioritise this low, but I think it's a good idea for error messaging to have telemetry deployed as standard so we can understand impact.
Question: Do you know the best way to implement telemetry?
e.g. Should we speak to the team who manage error handling (e.g. someone in platform) and ask them to deploy something we can use?
Pinging @elastic/obs-ux-infra_services-team (Team:obs-ux-infra_services)
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
We're not planning this at this time.