
Explore dynamic mapping for app logs datastream

Open lahsivjar opened this issue 3 years ago • 2 comments

The app_logs data stream is used for mapping OTEL logs and (after PR #9068) intake v2 logs. The data stream is currently configured with dynamic: false, which disables automatic detection and mapping of new fields.

We want to explore setting dynamic: runtime on the data stream's mapping so that new fields are searchable by default. One motivation for this is to allow greater flexibility in the logs intake support via APM-Server, as ECS logging libraries may add arbitrary key/value pairs to the document root.
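A minimal sketch of the mapping change under discussion, written as a Python dict that mirrors the JSON mapping body. The variable name `app_logs_mappings` and the two explicitly mapped fields are assumptions for illustration, not the actual apm-server template.

```python
import json

# Hypothetical mapping body for the app_logs data stream (illustrative only).
# With "dynamic": "runtime", fields not listed under "properties" are added
# as runtime fields: they are not indexed, but they can be queried from _source.
app_logs_mappings = {
    "dynamic": "runtime",  # currently "false" per this issue
    "properties": {
        # Assumed explicit fields; the real template defines many more.
        "@timestamp": {"type": "date"},
        "message": {"type": "text"},
    },
}

# Serialize to the JSON that would be sent as the mapping body.
mapping_json = json.dumps({"mappings": app_logs_mappings}, indent=2)
print(mapping_json)
```

Runtime fields trade query speed for flexibility, since values are resolved at search time instead of being indexed.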

Related issues

#8757 (relevant comment)

lahsivjar, Sep 12 '22 05:09

While we'd like to move forward with storing arbitrary key/value pairs added via structured logging or MDC, doing so carries risks that can lead to data loss. Therefore, this issue is blocked by the following prerequisites:

  • Use subobjects: false mapping to lessen the risk of services having object/scalar mapping conflicts. Depends on https://github.com/elastic/elasticsearch/issues/88934
  • Use ignore_malformed to lessen the impact of services defining conflicting mappings (ignore one field instead of rejecting the whole document). We've already added this as the default in the logs-*-* index template that's built into Elasticsearch: https://github.com/elastic/elasticsearch/pull/95329
  • Lessen the impact of hitting the field limit by not rejecting documents but ignoring additional fields (field budget instead of field limit). See also https://github.com/elastic/elasticsearch/pull/96235.
  • Probably not blocking but good to have as well: https://github.com/elastic/elasticsearch/issues/95534 to avoid losing data if there are conflicts that weren't gracefully handled by the mechanisms listed above and to make it easier to analyze the failures.
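To make the object/scalar conflict from the first bullet concrete, here is a small sketch in Python. The `flatten` helper and the sample documents are hypothetical; they only imitate how dotted field names behave as independent leaf fields under subobjects: false. The `template_sketch` dict is likewise an assumption that shows where the two settings named above would sit, not the actual apm-server template.

```python
from typing import Any, Dict


def flatten(doc: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Collapse nested objects into dotted leaf field names."""
    flat: Dict[str, Any] = {}
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


# Two services log "user" differently: one as a scalar, one as an object.
doc_a = {"user": "alice"}
doc_b = {"user": {"id": 42}}

# With nested object mappings, "user" cannot be both a scalar and an object.
# Treated as flat dotted names, the two documents use distinct fields.
fields_a = flatten(doc_a)
fields_b = flatten(doc_b)
print(sorted(set(fields_a) | set(fields_b)))

# Hypothetical template fragment combining the first two prerequisites.
template_sketch = {
    "settings": {"index.mapping.ignore_malformed": True},
    "mappings": {"subobjects": False, "dynamic": "runtime"},
}
```

The flattened view shows why the conflict goes away: `user` and `user.id` are separate leaf fields, so neither document forces a mapping the other violates.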

felixbarny, Jul 24 '23 09:07