aws-otel-lambda

Traces are not flushed to the server before the lambda terminates

Open sintexx opened this issue 7 months ago • 6 comments

Describe the bug The created traces are not flushed to the tracing provider before the lambda is terminated.

If a lambda is only executed a single time, the traces are not flushed to the tracing provider before the lambda is frozen again.

If the same lambda is executed multiple times within a few seconds, the traces of the first invocations are transferred. The same behavior can be observed if a sleep command is included in the lambda after the last span was closed.

This behavior leads me to assume that the lambda simply does not have enough time after the final span is closed to transfer the information to the server (Honeycomb in my case).
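For illustration, the sleep variant mentioned above could look roughly like the following sketch. It is not taken from the example project: `makeHandler`, the 2-second default delay, and the returned object are all hypothetical, and the delay only serves to show the symptom, not to fix it.

```typescript
// Hypothetical helper: resolves after the given number of milliseconds.
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical handler factory; delayMs is an arbitrary value chosen for illustration.
function makeHandler(delayMs = 2000) {
  return async (_event: unknown): Promise<{ statusCode: number }> => {
    // ... traced work happens here and the last span ends ...

    // Keep the invocation alive a little longer so that pending
    // telemetry has a chance to leave before the environment is frozen.
    await sleep(delayMs);
    return { statusCode: 200 };
  };
}

// In the Lambda entry file this would be exported as the handler:
// export const handler = makeHandler();
```

With the delay in place the trace shows up, without it the trace is missing, which is what points to a timing problem rather than a configuration problem.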

Steps to reproduce I created a minimal example project to reproduce the problem (https://github.com/sintexx/aws-otel-tracing-problem-example). The project can be started with 'npm run dev'. The single test lambda can be triggered by putting any message into the SQS queue.

The tracing provider used is Honeycomb (https://www.honeycomb.io/). To run the project with a different provider, modify the getMonitoringLayer function in sst.config.ts to replace the API token and endpoint.

What did you expect to see? The trace is uploaded even for a single lambda invocation.

What did you see instead? Only multiple invocations shortly after each other or a timeout at the end of the lambda execution result in a visible trace.

What version of collector/language SDK version did you use? aws-otel-nodejs-amd64-ver-1-17-1:1

What language layer did you use? Node.js

Additional context The project uses SST (https://sst.dev/) and not CDK directly

sintexx avatar Nov 29 '23 12:11 sintexx

I'm observing this behavior as well. There is a processor in the upstream opentelemetry-lambda repo called decoupleprocessor which provides behavior to perform exports out-of-band from a lambda invocation. However, ADOT Collector for Lambda cuts out the decoupleprocessor from its component registry, so it's unavailable for use. ADOT also cuts out the batch processor. Whatever the underlying collector is doing under the hood, it is not using a processor for this batching behavior it would seem, and thus it's not something we can control via the config.

On a related note, I think it would be wise to leave the upstream support for these components in place so users of ADOT Collector for Lambda have the option to use them.

jnicholls avatar Jan 23 '24 13:01 jnicholls

This issue is stale because it has been open 90 days with no activity. If you want to keep this issue open, please just leave a comment below and auto-close will be canceled

github-actions[bot] avatar Apr 28 '24 20:04 github-actions[bot]

Not stale, still relevant

Dr-Emann avatar May 01 '24 16:05 Dr-Emann

I believe if I'm reading this right, no processors are allowed?

Dr-Emann avatar May 01 '24 16:05 Dr-Emann

@Dr-Emann, I don't think that's the case. There are just no processors in the lambda layer at the moment. So I think the best solution to this problem would be to have the decoupleprocessor added to the lambda layer, but it seems that won't happen on short notice. There are two alternatives:

  1. Act upon the 'SIGTERM' signal in your lambda, examples can be found here: https://github.com/aws-samples/graceful-shutdown-with-aws-lambda/tree/main
  2. Build your own lambda-layer with support for the decouple processor
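A minimal sketch of the first alternative, assuming your tracer provider exposes a `forceFlush()` method (the OpenTelemetry JS SDK providers do). The `Flushable` interface, the `flushOnSigterm` name, and the 250 ms timeout are hypothetical choices made for this example; how you obtain the provider from the ADOT layer is not covered here.

```typescript
import { EventEmitter } from "events";

// Assumption: anything with forceFlush(), e.g. an OpenTelemetry SDK tracer provider.
interface Flushable {
  forceFlush(): Promise<void>;
}

// Registers a one-time SIGTERM listener that flushes pending telemetry and then exits.
// proc and exit are injectable so the function can be exercised without killing the process.
function flushOnSigterm(
  provider: Flushable,
  proc: EventEmitter = process,
  exit: (code: number) => void = (code) => process.exit(code),
  timeoutMs = 250, // arbitrary cap so a hanging export cannot block shutdown
): void {
  proc.once("SIGTERM", () => {
    let timer: ReturnType<typeof setTimeout> | undefined;
    const timeout = new Promise<void>((resolve) => {
      timer = setTimeout(resolve, timeoutMs);
    });

    // Whichever finishes first wins: the flush or the timeout.
    Promise.race([provider.forceFlush(), timeout])
      .catch((err: unknown) => console.error("flush on SIGTERM failed", err))
      .finally(() => {
        if (timer) clearTimeout(timer);
        exit(0);
      });
  });
}
```

You would call `flushOnSigterm(provider)` once during module initialization, outside the handler. Note that this only flushes what the SDK inside the function still holds; whether the collector extension in the layer forwards it in time is a separate question.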

meijeran avatar May 02 '24 19:05 meijeran

> On a related note, I think it would be wise to leave the upstream support for these components in place so users of ADOT Collector for Lambda have the option to use them.

+1 Yes please. It's been a run-around to find this discussion and why the batch/decouple processor is failing.

KimboTodd avatar May 21 '24 21:05 KimboTodd