kibana icon indicating copy to clipboard operation
kibana copied to clipboard

[Metrics UI] /metadata should throw error if no features are found but data exists

Open miltonhultgren opened this issue 1 year ago • 4 comments

Based on a recent support escalation, we found a situation where a customer had been dropping a key field (event.dataset) which the Metrics UI uses to decide which metrics to show in the Node Details page, causing the UI to simply break but without any errors.

The flow looks like this:

  1. UI resolves source config, and checks if there is any data
  2. Assuming there is data, it proceeds to ask the /metadata endpoint for which features are "active" (for what are we collecting metrics data)
  3. The list of features is a terms aggregation over event.dataset to inform us of which metricsets are being collected and thus which metrics we should have available
  4. The list of features is used in the follow up request when asking for the specific metrics to be aggregated to be displayed on the page

In the case where event.dataset is absent, no features are reported (though in this case all the metric fields still existed). /metadata returns a 200 OK with an empty features list, which is translated to a request for no metrics, which also returns a 200 OK and the UI is simply empty.

I can see two ways to solve this, re-work the feature check to search for individual metric fields instead of the metricset. This is probably a lot of work, and might break some of the domain logic the Metrics UI tries to encapsulate. The other way is to treat no features as an error (assuming there is data), which should be a much smaller effort.

AC

  • /metadata throw an error if no features are found when there is metrics data present
  • UI reports this error, giving hints that maybe some key field is missing for this UI to work

miltonhultgren avatar Jul 06 '23 18:07 miltonhultgren

Pinging @elastic/infra-monitoring-ui (Team:Infra Monitoring UI)

elasticmachine avatar Jul 06 '23 18:07 elasticmachine

If Elastic were to formulate a policy around this my suggestion would be, "Elastic Observability does not under any circumstances support dropping the event.dataset field from events."

I added this to an internal epic where we're researching how best to respond to missing data, since we should do something other than displaying a blank screen.

The flexibility of Elasticsearch makes using it a hostile environment (by that I mean we can't rely on any field really being there, even if it's in the mapping, etc.) and we have to have defensive code that responds gracefully to unforeseen circumstances. A blank screen is not acceptable but "some of these documents have no event.dataset and this UI is not going to work" is fine.

smith avatar Jul 07 '23 19:07 smith

Pinging @elastic/obs-ux-infra_services-team (Team:obs-ux-infra_services)

elasticmachine avatar Nov 14 '23 00:11 elasticmachine

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

botelastic[bot] avatar May 12 '24 00:05 botelastic[bot]