
Move index templates and ingest pipelines from apm package to ES

Open simitt opened this issue 2 years ago • 6 comments

Add index templates and ingest pipelines, currently defined in the apm-package, to Elasticsearch itself.

See a proof of concept at https://github.com/elastic/elasticsearch/pull/97546. With this change, APM could by default index into a fresh Elasticsearch cluster without first checking that the templates and pipelines are set up.

simitt avatar Aug 30 '23 17:08 simitt

After https://github.com/elastic/elasticsearch/pull/97546 is merged, we'll need to follow up by extending the YAML REST tests (see https://github.com/elastic/elasticsearch/blob/d832ac65b3cd072263dc198fd211f7f87787ec9b/x-pack/plugin/apm-data/src/yamlRestTest/resources/rest-api-spec/test/10_apm.yml), and fix/enhance index templates and ingest pipelines accordingly. These tests should index some sample documents and assert that they're processed and mapped as expected.

axw avatar Oct 23 '23 06:10 axw

With https://github.com/elastic/apm-server/pull/12066, and the changes in https://github.com/elastic/elasticsearch/pull/103032, I'm able to get all APM Server system tests to pass. There will be some differences to approvals: there are additional .text subfields, and we're not removing fields from span documents like we do in the integration's ingest pipeline. I think both of those changes are desirable.

axw avatar Dec 06 '23 05:12 axw

Before we turn the plugin on by default, we'll need to update https://www.elastic.co/guide/en/elasticsearch/reference/master/snapshots-restore-snapshot.html#restore-entire-cluster to include disabling/re-enabling the apm-data registry.

axw avatar Jan 18 '24 03:01 axw

I did some more local testing with https://github.com/elastic/apm-server/pull/12066, and found that TestJaeger fails because labels.location is matching the ecs_geo_point dynamic template in ecs@mappings: https://github.com/elastic/elasticsearch/blob/029624000b039516f60c14228492e3164bce6c61/x-pack/plugin/core/template-resources/src/main/resources/ecs%40mappings.json#L186-L196

I think those path_match rules are a bit too wide. I'll see if we can narrow them to only match geo.location.
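To illustrate why a wide path_match rule catches labels.location, here is a minimal sketch. It assumes path_match is a simple `*` wildcard match against the full dotted field path, and approximates that with Python's `fnmatchcase`. The `WIDE` and `NARROW` pattern lists are hypothetical and are not copied from ecs@mappings.

```python
from fnmatch import fnmatchcase


def matches_any(field_path, patterns):
    """Return True if the full dotted field path matches any wildcard pattern."""
    return any(fnmatchcase(field_path, pattern) for pattern in patterns)


# Hypothetical patterns, chosen only to illustrate the problem;
# the real rules live in ecs@mappings and may differ.
WIDE = ["location", "*.location"]
NARROW = ["geo.location", "*.geo.location"]

for path in ["labels.location", "client.geo.location", "geo.location"]:
    # Print whether each path is caught by the wide and the narrow rule.
    print(path, matches_any(path, WIDE), matches_any(path, NARROW))
```

Under these assumed patterns, the wide rule matches any field whose last path segment is location, while the narrow rule only matches paths ending in geo.location.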

axw avatar May 07 '24 06:05 axw

The location rule was narrowed to geo.location in https://github.com/elastic/elasticsearch/pull/108349.

Before switching to apm-data by default, we may want to get https://github.com/elastic/kibana/pull/183250 in, as we found some issues with the service map (and other script aggs) when fields are missing from mappings.

axw avatar May 14 '24 07:05 axw

I've been working on confirming that index templates and pipelines are aligned between the integration and the plugin.

For ingest pipelines, I wrote up a document highlighting the differences. Work is in progress to address comments & align where necessary.

For index templates, I generated all final index templates through the _simulate API for both the integration and the plugin, and created a document listing the differences.
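A comparison like this can be sketched roughly as follows. This is not the script used for the documents above; it assumes the resolved templates have already been fetched and loaded as Python dicts. The helper names (`flatten`, `diff_templates`) and the sample data are made up for illustration.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts into a {dotted.path: leaf_value} mapping."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict) and value:
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def diff_templates(integration, plugin):
    """Report paths that exist on only one side, or whose values differ."""
    a, b = flatten(integration), flatten(plugin)
    return {
        "only_in_integration": sorted(a.keys() - b.keys()),
        "only_in_plugin": sorted(b.keys() - a.keys()),
        "changed": sorted(k for k in a.keys() & b.keys() if a[k] != b[k]),
    }


# Made-up sample data standing in for two resolved templates.
integration = {"mappings": {"properties": {"message": {"type": "keyword"}}}}
plugin = {
    "mappings": {
        "properties": {
            "message": {
                "type": "keyword",
                "fields": {"text": {"type": "match_only_text"}},
            }
        }
    }
}

for section, paths in diff_templates(integration, plugin).items():
    print(section, paths)
```

Flattening to dotted paths keeps the report readable when templates nest deeply, at the cost of treating lists as opaque leaf values.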

endorama avatar May 17 '24 16:05 endorama

@endorama I see that @axw has left some comments on the doc; what else do you need for moving forward with this?

simitt avatar May 21 '24 13:05 simitt

Other things we need to do:

  • [x] https://github.com/elastic/elasticsearch/pull/108860
  • [x] https://github.com/elastic/elasticsearch/pull/108885
  • [x] https://github.com/elastic/elasticsearch/pull/108862
  • [x] https://github.com/elastic/integrations/pull/9949
  • [x] perform some manual upgrade testing from 8.14.x to 8.15.0
  • [ ] ~update docs to remove requirement for installing the integration (e.g. at https://www.elastic.co/guide/en/observability/current/apm-getting-started-apm-server.html)~ moved to https://github.com/elastic/apm-server/issues/13237

axw avatar May 21 '24 13:05 axw

I created https://github.com/elastic/apm-server/issues/13237 to track updating the documentation, so we can track it separately from this task.

What's left before closing this is running the benchmark to verify ingestion performance, which I'm doing today and will report on later.

endorama avatar May 27 '24 12:05 endorama

I'm closing this issue as I'm tackling the benchmarks to confirm performance parity in https://github.com/elastic/apm-managed-service/issues/692

endorama avatar May 28 '24 12:05 endorama