Distributed traces across AWS services are not connected
Describe the bug We are using SQS and EventBridge to call services on AWS. However, when a service on Lambda sends a message to another service on Lambda using SQS, the spans are not connected. The trace on the receiving Lambda becomes the parent.
Steps to reproduce Run service 1 on lambda and send SQS message that is consumed by service 2.
What did you expect to see? Service 1 should show up in the trace.
What did you see instead? Only service 2 shows up
What version of collector/language SDK version did you use? Version: latest collector and otel layers on lambda.
What language layer did you use? Nodejs
Additional context OTEL_PROPAGATORS are set to tracecontext,baggage,xray. Backend is Tempo. Is there something I am missing? I can see traceheader in the message attributes
"attributes": {
  "ApproximateReceiveCount": "1",
  "AWSTraceHeader": "Root=1-6808be41-c7d04735710d987017e5ec88;Parent=371ec507158784ee;Sampled=0;Lineage=240:e7ff9e16:238",
  "SentTimestamp": "1745403457578",
  "SenderId": "id",
  "ApproximateFirstReceiveTimestamp": "1745403457579"
},
Is it because I need sqsExtractContextPropagationFromPayload enabled? How could I do that without creating my own custom instrumentation layer?
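For reference, the `AWSTraceHeader` value shown above uses the X-Ray header format: `Root` carries the trace id (`1-<8 hex>-<24 hex>`), `Parent` the span id, and `Sampled` the sampling decision. A minimal sketch of decoding it by hand, assuming plain Node.js with no OpenTelemetry packages (`parseAwsTraceHeader` is a hypothetical helper, not a library function):

```javascript
// Hypothetical helper for illustration; not part of any OpenTelemetry package.
// Decodes an X-Ray style header such as the AWSTraceHeader SQS system attribute.
function parseAwsTraceHeader(header) {
  const fields = {};
  for (const part of header.split(';')) {
    const idx = part.indexOf('=');
    if (idx === -1) continue; // skip malformed segments
    fields[part.slice(0, idx).trim()] = part.slice(idx + 1).trim();
  }
  // Root is "1-<8 hex epoch seconds>-<24 hex random>"; joined, the two hex
  // parts form a 32-character trace id.
  const root = /^1-([0-9a-f]{8})-([0-9a-f]{24})$/i.exec(fields.Root ?? '');
  const parentOk = /^[0-9a-f]{16}$/i.test(fields.Parent ?? '');
  if (!root || !parentOk) return undefined;
  return {
    traceId: (root[1] + root[2]).toLowerCase(),
    spanId: fields.Parent.toLowerCase(),
    sampled: fields.Sampled === '1',
  };
}

const ctx = parseAwsTraceHeader(
  'Root=1-6808be41-c7d04735710d987017e5ec88;Parent=371ec507158784ee;Sampled=0;Lineage=240:e7ff9e16:238'
);
console.log(ctx);
```

Applied to the header in this report, this yields trace id `6808be41c7d04735710d987017e5ec88`, span id `371ec507158784ee`, and `sampled: false`, since the header carries `Sampled=0`.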
Hi @gkaskonas that does indeed not sound right. I will try to reproduce your issue this weekend when I have some more time. Do you receive any other spans for service 1 and only the sqs ones are missing?
+1 Fairly certain we're seeing the same issue in a similar setup we have, trying to auto-instrument Java Lambdas using otel java agent layer. I'd reached out to the community in this discussion post - https://github.com/open-telemetry/opentelemetry-lambda/discussions/1792
I've started looking into this. One thing I can say immediately is that sqsExtractContextPropagationFromPayload should not make a difference. This just influences how context propagation is done, and using sqs message attributes should be just fine.
I have been able to reproduce this, I think. I have this small example with 2 lambda functions. 1 is invoked through an http request (through apigateway), then sends an sqs message. Then using event source mapping the other lambda is invoked whenever a message becomes available on the sqs queue (this one also makes a dynamodb call but that is not relevant).
Does this match your problem?
Here's an example from tempo where I have the 2 separate traces:
1st lambda:
2nd lambda after sqs message received:
I haven't looked deeply into the instrumentation code, but am currently thinking that maybe the propagation of the context doesn't work because we are using event source mapping. Meaning in the "receiving" lambda we don't explicitly use the sqs client to consume messages, which would otherwise be instrumented and allow propagating the context... Will try to investigate this further.
Yes, that's what I see
So reading this in the aws-lambda instrumentation readme: https://github.com/open-telemetry/opentelemetry-js-contrib/tree/main/plugins/node/opentelemetry-instrumentation-aws-lambda#context-propagation
It seems to me that it is actually impossible to link the spans together if we're not using X-Ray 🤔
Kind of defeats the purpose of distributed tracing in my opinion. Even with X-Ray set to PASSTHROUGH?
Agreed, going to do some more testing on this as the aws sdk instrumentation does seem to state the opposite: https://github.com/open-telemetry/opentelemetry-js-contrib/blob/main/plugins/node/opentelemetry-instrumentation-aws-sdk/doc/sqs.md#sendmessage--sendmessagebatch
OpenTelemetry trace context is injected as SQS MessageAttributes, so the service receiving the message can link cascading spans to the trace which created the message.
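To make the quoted sentence concrete, here is a rough sketch of what reading that injected context could look like on the receiving side of an event source mapping. It assumes the producer used the W3C `tracecontext` propagator, so the value arrives as a String message attribute named `traceparent`, and it only handles the version `00` format. `extractTraceparent` is a hypothetical helper, and the ids in the sample record are illustrative values reused from elsewhere in this thread.

```javascript
// Hypothetical helper for illustration; not part of any OpenTelemetry package.
// Version "00" traceparent: 00-<32 hex trace id>-<16 hex span id>-<2 hex flags>
const TRACEPARENT = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function extractTraceparent(record) {
  // In a Lambda SQS event, user message attributes live under
  // `messageAttributes`, each with a `stringValue` field.
  const value = record?.messageAttributes?.traceparent?.stringValue;
  const match = TRACEPARENT.exec(value ?? '');
  if (!match) return undefined;
  const [, traceId, spanId, flags] = match;
  // All-zero ids are invalid in W3C Trace Context.
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return undefined;
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

// Illustrative record shape, as delivered by an SQS event source mapping.
const sampleRecord = {
  messageAttributes: {
    traceparent: {
      stringValue: '00-0fe95b12c2203673e0bc9c5a294dce17-4067c1e2921f82ee-01',
      dataType: 'String',
    },
  },
};
const upstream = extractTraceparent(sampleRecord);
console.log(upstream);
```

Whether anything in the receiving Lambda actually performs this step automatically is exactly the open question in this issue; the sketch only shows that the information is present in the event.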
Thanks for your work on this @wpessers
OpenTelemetry trace context is injected as SQS MessageAttributes, so the service receiving the message can link cascading spans to the trace which created the message.
A couple of things on this part of the doc:
- to me, this could be interpreted as, "the service receiving the message can add their own code to manually extract the trace/span values and update the context in that service". If this was the meaning, then I think we had been expecting the otel auto-instrumentation to do this trace passing automatically.
- In the Java agent lambda layer at least, trace values are not transported across SQS MessageAttributes unless the experimental-use-propagator-for-messaging option is set (see this doc on that). When we tested setting this option we did see trace id passed in MessageAttributes but not in the way we expected and it still was not automatically propagated to the context in the receiving service (more details on this in the 'What We've Tried' section of my discussion post on this repo)
I haven't looked deeply into the instrumentation code, but am currently thinking that maybe the propagation of the context doesn't work because we are using event source mapping. Meaning in the "receiving" lambda we don't explicitly use the sqs client to consume messages, which would otherwise be instrumented and allow propagating the context... Will try to investigate this further.
this crossed our mind too, perhaps the auto-instrumentation isn't covering the event driven SQS lambda handler
Any progress? We are about to roll out otel across multiple teams, which is tied together using EventBridge and SQS mostly, so it would be great to get to the bottom of this.
Hi @gkaskonas I'm sorry for the late reply, have been quite busy but finally getting some more free time on my hands to look into this. Thanks @nic-littlepay for your findings as well.
I have already discussed this with @serkan-ozal (who is a maintainer on the project) as well and he proposed trying to link the spans together.
The big issue here is that with how sqs batch processing works, we can never really be sure to which "parent" a lambda invocation span belongs. We can probably try to do something with span-linking, so that your receive span is not actually added to that same trace. But it will be linked, so depending on the capabilities of your observability backend you will be able to still correlate the spans.
Hi @gkaskonas @nic-littlepay,
As @wpessers mentioned, the real challenge here is more than propagating trace context over SQS messages. Let me share the example I used while discussing this with Warre:
Assume that we have the following flow: "Lambda-A => SQS queue => Lambda-B"
And we have two different invocations and traces from "Lambda-A" :
- trace-1: Lambda-A-invocation-span-1 => SQS-client-span-1 (sent message-1 to the SQS queue)
- trace-2: Lambda-A-invocation-span-2 => SQS-client-span-2 (sent message-2 to the SQS queue)
Then, "Lambda-B" is invoked/triggered with the batch of SQS messages
including both "message-1" and "message-2" in the same invocation.
And assume that we are able to extract trace context from both
"message-1" and "message-2" as "trace-1" and "trace-2" accordingly.
We also have a "Lambda-B" invocation span as the root span here.
The question then is which of "trace-1" and "trace-2" should be used
as the parent trace context of the Lambda invocation, as there can only be one parent trace context.
If the Lambda could only be triggered with a single SQS message,
the problem would be easier to solve, as we could just extract the trace context
from the SQS message attributes and use it as the parent trace context of the Lambda invocation span.
But, in a single invocation, Lambda function might be triggered by multiple SQS messages from different traces.
So, I think, for the cases where there are multiple upstream traces, it makes sense to
- extract all the parent trace contexts (e.g. from all SQS messages in batch)
- start a new trace with the Lambda invocation span
- add all the extracted parent trace contexts to the current trace as span links
When there are multiple traces connected to each other over span links, showing all spans in a single trace map will depend on the capabilities of the tracing backend you use.
In addition to this behavior, as an optimization (not sure though), we might also have the following logic to cover some cases: If all the parent trace contexts belong to the same trace (e.g. Lambda is triggered with a single SQS message or all the SQS messages belong to the same trace), the new Lambda invocation span will inherit the trace id.
In summary, we have two items here:
- How to propagate (inject and extract) trace context over AWS SQS?
- What to do with batch of SQS messages coming from the same or different traces?
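The batch handling proposed above can be sketched as a small decision function. This is only an illustration of the proposal, not how any instrumentation currently behaves: `planInvocationSpan` is a hypothetical name, its inputs are assumed to be upstream contexts already extracted from each message, and picking the first context as parent when all messages share a trace is an arbitrary choice made for the sketch.

```javascript
// Hypothetical planner for illustration; not an OpenTelemetry API.
// Input: one already-extracted upstream context ({ traceId, spanId }) per
// SQS message, or undefined where a message carried no context.
function planInvocationSpan(upstreamContexts) {
  // Every extracted context becomes a span link on the invocation span.
  const links = upstreamContexts.filter(Boolean);
  const traceIds = new Set(links.map((ctx) => ctx.traceId));
  // Optimization from the proposal: inherit the trace only when every message
  // points at the same upstream trace. Otherwise start a new trace (no parent).
  // Assumption: with several spans in that one trace, the first is used as parent.
  const parent = traceIds.size === 1 ? links[0] : undefined;
  return { parent, links };
}

// message-1 and message-2 come from different traces: new trace, two links.
const mixed = planInvocationSpan([
  { traceId: 'a'.repeat(32), spanId: '1'.repeat(16) },
  { traceId: 'b'.repeat(32), spanId: '2'.repeat(16) },
]);

// A single message (e.g. BatchSize: 1): the invocation span can join its trace.
const single = planInvocationSpan([
  { traceId: 'a'.repeat(32), spanId: '1'.repeat(16) },
]);

console.log(mixed.parent, mixed.links.length, single.parent.traceId);
```

Keeping the links even when a parent is chosen means a backend that renders span links loses nothing, while a backend that ignores them still shows the connected trace in the single-trace case.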
I understand that batch can be difficult, but I am not using batch. It would be nice to support non-batch at least. Also, what about EventBridge? It seems that traces get lost after that call.
I actually don't think aws sdk instrumentation currently supports EventBridge... I suppose that would be an issue to raise in the js-contrib repo. See: https://github.com/open-telemetry/opentelemetry-js-contrib/tree/main/plugins/node/opentelemetry-instrumentation-aws-sdk
I see Eventbridge events
{
  "traceId": "0fe95b12c2203673e0bc9c5a294dce17",
  "spanId": "4067c1e2921f82ee",
  "parentSpanId": "ca5ff6e3e9d8f38e",
  "traceState": "",
  "name": "EventBridge.PutEvents",
  "kind": "SPAN_KIND_CLIENT",
  "startTimeUnixNano": 1747836752034000000,
  "endTimeUnixNano": 1747836752047908400,
  "attributes": [
    {
      "key": "rpc.system",
      "value": {
        "stringValue": "aws-api"
      }
    },
    {
      "key": "rpc.method",
      "value": {
        "stringValue": "PutEvents"
      }
    },
    {
      "key": "rpc.service",
      "value": {
        "stringValue": "EventBridge"
      }
    },
    {
      "key": "aws.region",
      "value": {
        "stringValue": "us-east-1"
      }
    },
    {
      "key": "aws.request.id",
      "value": {
        "stringValue": "d68f13dd-f92b-4c46-aa75-a5c8c4622ec1"
      }
    },
    {
      "key": "http.status_code",
      "value": {
        "intValue": 200
      }
    }
  ],
  "droppedAttributesCount": 0,
  "droppedEventsCount": 0,
  "droppedLinksCount": 0,
  "status": {
    "code": 0,
    "message": ""
  }
}
Is this not enough for it to carry across different services?
Yes it should be, I was a little too quick to judge there but this seems fine. We've discussed this with other maintainers of the project and will be following up soon as we will have to make some changes to support tracing across SQS / SNS / EB.
Thanks for the clear explanation @serkan-ozal 👍 . Totally understand how traceId propagation in batched SQS message handling is not straightforward. I've seen that some of our SQS handler Lambdas are using batches and some are setting BatchSize: 1. Also it's unclear in practice, for those with BatchSize > 1, whether there is ever actually more than one message being polled at a time - for example if those services aren't handling enough volume to write messages to the queue that fast 🤷 .
I think you're alluding to this above, but just for clarity, I feel that in finding a solution to this Github issue, we should assess the problem as two separate use cases:
- BatchSize = 1
- BatchSize > 1
We'd love to see a solution for case #1 as I think this would allow the auto-instrumentation Lambda layer to work its magic across a large percentage of our systems. The case #2 is more difficult and we can accept that tracing for these cases will not always be as complete.
Thanks to all who are contributing on solving this one 👍
In the meantime...we're using the Java Agent Lambda layer to instrument our Lambda. To support our deployments where we know batch size is 1, if we wanted to roll our own context extraction logic in the SQS message receiving Lambda, do you have any pointers for how we could best use the otel libraries to do that? (no worries if not, I appreciate this is straying slightly off issue 😄)
@nic-littlepay I may be missing something in all the different options you've tried. But have you tried using the wrapper layer and then using the corresponding wrapper script through setting the AWS_LAMBDA_EXEC_WRAPPER env var as described here: https://github.com/open-telemetry/opentelemetry-lambda/blob/main/java/README.md#wrapper ?
If yes, what did that result in?
But have you tried using the wrapper layer... If yes, what did that result in?
I've tested this, and the dropdown below contains a reminder of how this was set up. The result was that trace id values were written to log messages in Lambda 1, but in Lambda 2 no trace id or span id values appeared in logs. This is different from when Lambda 2 is configured to use the javaagent Layer instead; for that configuration we see trace id values in Lambda 2 logs, although, as we're discussing, these have not been propagated from the Lambda 1 invocation.
Configuration of infra involved in the test
Both Java Lambdas use log4j2 with this configuration
...
<PatternLayout pattern="%d{yyyy-MM-dd HH:mm:ss} [%t] %-5level %logger{36} - trace_id=%X{trace_id} trace_parent=%X{trace_parent} traceparent=%X{traceparent} span_id=%X{span_id} %msg%n" />
...
Lambda 1 template - this lambda is invoked over HTTP and places a message on the queue
...
Runtime: java17
Tracing: Active
Layers:
  - !Sub "arn:aws:lambda:${AWS::Region}:184161586896:layer:opentelemetry-javaagent-0_12_0:1"
Environment:
  Variables:
    AWS_LAMBDA_EXEC_WRAPPER: /opt/otel-handler
    SQS_QUEUE_URL: !Ref SqsQueue
...
SQS Queue template - short and sweet
SqsQueue:
  Type: AWS::SQS::Queue
  Properties:
    QueueName: java-lambda-poc-queue
Lambda 2 template - SQS Event triggered Lambda that reads SQS messages
...
Runtime: java17
Tracing: Active
Layers:
  - !Sub "arn:aws:lambda:${AWS::Region}:184161586896:layer:opentelemetry-javawrapper-0_12_0:1"
Environment:
  Variables:
    AWS_LAMBDA_EXEC_WRAPPER: /opt/otel-sqs-handler
Events:
  SQSEvent:
    Type: SQS
    Properties:
      Queue: !GetAtt SqsQueue.Arn
      BatchSize: 1
...
Revisiting otel documentation
I keep re-reading otel documentation to see if there's something I'm missing. The below snippet seems relevant to our setup and reads like we should be expecting our current setup to work. I'm assuming that the instructions like "the name of the span MUST be < event source > process" are things that are the responsibility of the javaagent library and not something we need to be manually coding? Furthermore, the documentation seems to read as if the library does support batched SQS messages from multiple sources, handled by a Lambda that's triggered by an SQS Event. And finally, noting the Tracing: Active part of our Lambda configuration above, is this correct? As I feel this configuration is unclear from the otel docs.
(below taken from https://opentelemetry.io/docs/specs/semconv/faas/aws-lambda/#sqs-event)
For the SQS event span, if all the messages in the event have the same event source, the name of the span MUST be < event source > process. If there are multiple sources in the batch, the name MUST be multiple_sources process. The parent SHOULD be the SERVER span corresponding to the function invocation.
For every message in the event, the message system attributes (not message attributes, which are provided by the user) SHOULD be checked for the key AWSTraceHeader. If it is present, an OpenTelemetry Context SHOULD be parsed from the value of the attribute using the AWS X-Ray Propagator and added as a link to the span. This means the span may have as many links as messages in the batch.
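As a reading aid for the two spec paragraphs above, here is a rough sketch of the naming and link-collection rules. It is not taken from any instrumentation library: `describeSqsEventSpan` is a hypothetical name, and it assumes that "event source" refers to each record's `eventSource` field and that the event contains at least one record.

```javascript
// Hypothetical helper for illustration; not part of any OpenTelemetry package.
// Assumes a Lambda SQS event with at least one record.
function describeSqsEventSpan(event) {
  const records = event.Records ?? [];
  // Same source for every message: "<event source> process".
  // Mixed sources: "multiple_sources process".
  const sources = new Set(records.map((r) => r.eventSource));
  const source = sources.size === 1 ? [...sources][0] : 'multiple_sources';
  // Links come from the message *system* attributes (`attributes`), not from
  // the user-provided `messageAttributes`.
  const linkHeaders = records
    .map((r) => r.attributes?.AWSTraceHeader)
    .filter((h) => typeof h === 'string');
  return { name: `${source} process`, linkHeaders };
}

const described = describeSqsEventSpan({
  Records: [
    {
      eventSource: 'aws:sqs',
      attributes: {
        AWSTraceHeader:
          'Root=1-6808be41-c7d04735710d987017e5ec88;Parent=371ec507158784ee;Sampled=0',
      },
    },
    { eventSource: 'aws:sqs', attributes: {} },
  ],
});
console.log(described.name, described.linkHeaders.length);
```

Note that under these rules the upstream contexts become links on a span whose parent is the function invocation span, so the spec text by itself describes linking to the producer trace rather than joining it.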
So are there any steps forward for a simple SQS workflow? We have a service that sends a message to a FIFO queue and another one that processes it. At the moment it's not connected.
Hi, progress on this has been slow I know. But I've been looking into the actual sqs instrumentation code and came to the conclusion that that already works well. It is in fact trivial to use the trace context that is put into the message attributes to parse the context at your receiving end and then link back the span to the original producer span. As I said, the sqs instrumentation fully supports this. I even ran a test where my lambda does not use an event source mapping, but rather explicitly calls receive message on an sqs queue and then processes the messages in that batch. Here's my dummy implementation for that:
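// Runs inside an async Lambda handler; polls the queue explicitly instead of
// relying on an event source mapping.
const { SQSClient, ReceiveMessageCommand } = require('@aws-sdk/client-sqs');
const sqsClient = new SQSClient({});
const response = await sqsClient.send(new ReceiveMessageCommand({
  QueueUrl: process.env['LAUNCH_QUEUE_URL'],
  // Request all attributes so the injected trace context is returned.
  MessageAttributeNames: ['All'],
  MessageSystemAttributeNames: ['All']
}));
// Messages is undefined when no message was received.
(response.Messages ?? []).forEach((msg) => {
  console.log('processing');
});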
And the span link gets added correctly: https://github.com/user-attachments/assets/12757f3c-720d-4c85-8a90-46ed5cf6ec78
So my earlier assumption that this issue is related to the usage of event source mapping seems to be correct: https://github.com/open-telemetry/opentelemetry-lambda/issues/1787#issuecomment-2870092672
Next steps: I'm currently thinking about contributing something to the aws-lambda instrumentation package, so that we can do the context extraction there as well when working with event source mapping. But I'm not entirely sure how complex that would be.
Seems like aws-lambda instrumentation should process the attributes.
Yep I think so too. I am discussing this with the js maintainers and will link issue as soon as we have one.
From a Node.js perspective:
I approached this from the 'other direction' in that I assembled the various OTel pieces until it worked rather than starting with the ADOT layer. In the process I learned, after much pain, that the 'suppress internal instrumentation' setting of the SDK instrumentation will prevent header context propagation (X-Amzn-Trace-Id) to other services. Unfortunately this is enabled by default as it stands. I was just looking through the code to see if I could now swap in ADOT when I spotted this and thought I'd mention.
Having this enabled combined with manual creation of spans on the consumer side—unavoidable in many cases because you may be processing batches of interleaved messages from multiple producer traces—gave me the properly linked response I was after (like so).
Unfortunately the rendering in the console is not particularly pretty for reasons, but it does at least have the desired effect.
Interesting, will consider this approach as well. Let me find out if we can work with this, and if yes if we can make the suppression of internal instrumentation more easily configurable! It would still be a nice QoL addition to the instrumentation if we could let it handle span creation for us. Otherwise you will indeed have to do that manually. As you mention, we can never know if messages in a batch will have the same producer or not.
Hey @wpessers,
from your earlier comment,
It is in fact trivial to use the trace context that is put into the message attributes to parse the context at your receiving end and then link back the span to the original producer span.
I would actually love some guidance on how to do this in Java using the Otel Java agent. I feel like I must be missing something obvious because the code I came up with to parse and populate context on the receiving end was clunky. If you had an example that would be so helpful.
Thanks again for all your time on this 👍
Any progress on the issue?
I know I'm late on the response here but @nic-littlepay I haven't had time to deep-dive into your problem as I'm a little bit more experienced with how this works in nodejs. Have you looked into: https://github.com/open-telemetry/opentelemetry-java-instrumentation/tree/main/instrumentation/aws-lambda/aws-lambda-events-2.2/library#sqs-handler
@gkaskonas for nodejs, I opened the following PR in js-contrib: https://github.com/open-telemetry/opentelemetry-js-contrib/pull/2981 It still needs some minor changes and testing, but I believe we should be able to get this merged soon.