Grok processor slows down APM ingest pipeline

Open felixbarny opened this issue 2 years ago • 10 comments

@joegallo investigated ingest pipeline performance in the wild and found out that this processor accounts for 25% of the ingest pipeline cost:

https://github.com/elastic/apm-server/blob/b75320b4220df7aec8309fa6d62c43c29404ece6/apmpackage/cmd/genpackage/pipelines.go#L69-L78

Is this still needed and if so, can we find an alternative that's not based on grok?

felixbarny avatar Apr 18 '23 17:04 felixbarny

It'd be worth testing this as a dissect processor -- not altogether that different in terms of syntax and use, but I'd expect it to have better performance.

joegallo avatar Apr 20 '23 16:04 joegallo

Given the grok --> fail --> remove pattern, though, I bet the very best performance would be a single fail with an annoyingly complex if.
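To illustrate why regex-free parsing should be cheaper, here is a minimal Go sketch of the kind of check a single condition would need to perform. It assumes the grok step exists only to split a `major.minor.patch` version string so that a later step can reject documents from a server newer than the installed integration; the thread implies this but does not spell it out. The names `parseVersion` and `isNewer` and the installed version `8.7.0` are hypothetical, and the real pipeline condition would be written in Painless, not Go.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion splits "major.minor.patch[-suffix]" using plain string
// operations instead of a regular expression (hypothetical helper).
func parseVersion(v string) (parts [3]int, err error) {
	// Drop any pre-release or build suffix such as "-SNAPSHOT" or "+build".
	if i := strings.IndexAny(v, "-+"); i >= 0 {
		v = v[:i]
	}
	fields := strings.Split(v, ".")
	if len(fields) != 3 {
		return parts, fmt.Errorf("unexpected version format %q", v)
	}
	for i, f := range fields {
		if parts[i], err = strconv.Atoi(f); err != nil {
			return parts, fmt.Errorf("invalid version component %q: %w", f, err)
		}
	}
	return parts, nil
}

// isNewer reports whether observed is strictly newer than installed,
// comparing major, then minor, then patch.
func isNewer(observed, installed [3]int) bool {
	for i := range observed {
		if observed[i] != installed[i] {
			return observed[i] > installed[i]
		}
	}
	return false
}

func main() {
	installed := [3]int{8, 7, 0} // assumed installed integration version
	for _, v := range []string{"8.6.2", "8.7.0", "8.8.0-SNAPSHOT"} {
		observed, err := parseVersion(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s newer than installed: %t\n", v, isNewer(observed, installed))
	}
}
```

The comparison itself is three integer checks; the cost in the grok-based version comes from the pattern matching and the temporary fields that must be written and then removed.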

joegallo avatar Apr 20 '23 16:04 joegallo

While looking at this, we should consider whether it's worth moving to a separate version field describing the data schema version (see https://github.com/elastic/apm-server/issues/10308#issuecomment-1437724753), rather than the server version. Maybe that way we could avoid string parsing altogether.

axw avatar Apr 24 '23 04:04 axw

Is this still relevant after https://github.com/elastic/elasticsearch/pull/97546/ ? Can we just remove the pipeline once that's merged?

kruskall avatar Aug 14 '23 01:08 kruskall

@kruskall yes, I think we would get rid of the grok processor when we move the templates to ES.

axw avatar Aug 14 '23 02:08 axw

@felixbarny

Hi Felix. Joe G. was recently on a call with a customer with whom you have had a couple of sessions about APM, and he felt this issue was related to the performance problems and other APM-related issues they have had. We have shared this case number with the customer. Could you let us know when this issue might be fixed, and whether there are any workarounds? Thanks in advance.

mohibrahmani avatar Sep 25 '23 22:09 mohibrahmani

We're currently working on setting up index templates directly in Elasticsearch rather than via Fleet for APM (https://github.com/elastic/elasticsearch/pull/97546). Once that is done, we will no longer use that expensive grok processor.

I don't think there are workarounds for the time being but we're actively working on this at the moment.

felixbarny avatar Sep 26 '23 06:09 felixbarny

@felixbarny Thank you for the great news. If possible, could you tell us which version we can expect the fix in? Thanks again.

mohibrahmani avatar Sep 26 '23 15:09 mohibrahmani

I'll need to defer that question to @simitt and @axw who are not available this week.

felixbarny avatar Sep 26 '23 16:09 felixbarny

We are aiming for the setup to be moved to ES in 8.12, but cannot make any promises on timelines at this point.

simitt avatar Sep 28 '23 14:09 simitt

The grok processor was removed in https://github.com/elastic/integrations/pull/9185

cc @simitt I believe we can close this

kruskall avatar Mar 21 '24 20:03 kruskall