
How to warn users if we reject a fleet configuration update.

Open Mpdreamz opened this issue 2 years ago • 3 comments

In #8220 we are discussing ignoring bad configuration updates (specifically invalid or empty sampling policies), which can cause APM Server to either go into a boot loop on cloud or effectively run in deny-all mode.

I'm opening this issue to make sure that, if we do ignore them, we don't do so silently:

  • The absolute minimum we should do is log this rejection.
  • We need a mechanism to either reject a configuration update or inform Fleet that we've only partially accepted a configuration update. (cc @elastic/fleet).
  • Explore emitting warnings from apm-server
    1. We could emit warnings in a new datastream that the @elastic/apm-ui folks could use to actively warn users of ongoing warnings.
    2. Create a dedicated warnings API in apm-server that @elastic/apm-ui could query and we can use in our diagnostics tooling.
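
To make the first two bullets concrete, here is a minimal sketch in Go of what "reject, log, and keep running on the last good configuration" could look like. Everything in it is an assumption for illustration: `SamplingPolicy`, `Config`, `Reloader`, and the validation rules are hypothetical names and do not reflect apm-server's actual types or its Fleet integration. The in-memory `warnings` slice merely stands in for whatever channel we end up using to surface rejections.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"sync"
)

// SamplingPolicy is a hypothetical, simplified sampling policy.
type SamplingPolicy struct {
	ServiceName string
	SampleRate  float64
}

// Config is a hypothetical subset of a Fleet-delivered configuration.
type Config struct {
	Policies []SamplingPolicy
}

// validate rejects the cases discussed above: empty or invalid policies.
func validate(c Config) error {
	if len(c.Policies) == 0 {
		return errors.New("sampling policies must not be empty")
	}
	for i, p := range c.Policies {
		if p.SampleRate < 0 || p.SampleRate > 1 {
			return fmt.Errorf("policy %d: sample rate %v is outside [0, 1]", i, p.SampleRate)
		}
	}
	return nil
}

// Reloader keeps the last known-good config and remembers rejections.
type Reloader struct {
	mu       sync.Mutex
	current  Config
	warnings []string
}

// Apply installs next if it is valid. Otherwise it logs the rejection,
// records a warning, and leaves the current config untouched.
func (r *Reloader) Apply(next Config) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if err := validate(next); err != nil {
		msg := fmt.Sprintf("rejected configuration update: %v", err)
		log.Print(msg)
		r.warnings = append(r.warnings, msg)
		return err
	}
	r.current = next
	return nil
}

// Current returns the active (last accepted) config.
func (r *Reloader) Current() Config {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.current
}

// Warnings returns a copy of the recorded rejection messages.
func (r *Reloader) Warnings() []string {
	r.mu.Lock()
	defer r.mu.Unlock()
	return append([]string(nil), r.warnings...)
}

func main() {
	r := &Reloader{}
	// A valid update is accepted.
	_ = r.Apply(Config{Policies: []SamplingPolicy{{ServiceName: "checkout", SampleRate: 0.1}}})
	// An empty update is rejected; the previous policies stay active.
	if err := r.Apply(Config{}); err != nil {
		fmt.Println("update rejected:", err)
	}
	fmt.Println("active policies:", len(r.Current().Policies))
}
```

The returned error is the hook for the second bullet: whoever calls `Apply` could pass it on to Fleet instead of dropping it, assuming such a reporting path exists.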

Mpdreamz, Jun 01 '22 11:06

> We could emit warnings in a new datastream that the https://github.com/orgs/elastic/teams/apm-ui folks could use to actively warn users of ongoing warnings

This makes sense to me, but I don't know where we'd display such information. We have discussed having a "troubleshooting" page where users can see whether everything is working as expected (are the index templates set up correctly, is the number of dropped spans too high, is the cardinality of span.name/transaction.name too high, etc.).

> Create a dedicated warnings API in apm-server that https://github.com/orgs/elastic/teams/apm-ui could query and we can use in our diagnostics tooling.

Kibana does not query APM Server today. I'm not sure if this is even possible.

sorenlouv, Jun 01 '22 12:06

> Kibana does not query APM Server today. I'm not sure if this is even possible.

It doesn't, total brainfart on my end :). Even if it were possible, we shouldn't.

Mpdreamz, Jun 01 '22 15:06

I think we should be able to leverage the work in https://github.com/elastic/beats/issues/21413 for this. That work also includes a UI component for displaying this information in the agent details page of the UI; see https://github.com/elastic/security-team/issues/3494

joshdover, Jun 14 '22 15:06