opentelemetry-js-contrib
opentelemetry-js-contrib copied to clipboard
feat: adding hook for adding span links to a lambda instrumentation span
Which problem is this PR solving?
- With the aws lambda instrumentation, when wrapping a lambda handler adds the ability to add span links. This is useful for batch processing where a message can come from different parent spans.
Short description of the changes
- Allow links to be added to wrapped lambda spans
The committers listed above are authorized under a signed CLA.
- :white_check_mark: login: peterabbott / name: Peter Abbott (b5b2d65b8f6e673bceec3797058304706c849188, 5a9f3cdca89cdd662586fc32e5c80c0be8f17c6d, c7c211b88079865f56f30e2a39928b807d8a9e6e)
@dyladan ping as I can see you did the last release of this module. Is there way to get eyes on a PR? Any reckons / comments here regarding adding links to an instrumented lambda root span?
Codecov Report
Attention: Patch coverage is 50.00000% with 1 line in your changes missing coverage. Please review.
Project coverage is 91.01%. Comparing base (
cf034c8) to head (c7c211b). Report is 243 commits behind head on main.
| Files | Patch % | Lines |
|---|---|---|
| ...-instrumentation-aws-lambda/src/instrumentation.ts | 50.00% | 1 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## main #1922 +/- ##
==========================================
- Coverage 91.02% 91.01% -0.02%
==========================================
Files 146 146
Lines 7478 7480 +2
Branches 1497 1499 +2
==========================================
+ Hits 6807 6808 +1
- Misses 671 672 +1
| Files | Coverage Δ | |
|---|---|---|
| ...-instrumentation-aws-lambda/src/instrumentation.ts | 93.18% <50.00%> (-0.50%) |
:arrow_down: |
Sorry for the delay @peterabbott ! We are catching up on PRs. Generally this seems like a reasonable change, though we’re wondering if this is something generic that doesn't have to be configured multiple times by people with a callback, and instead can be implemented into the instrumentation for everyone to benefit? It may help if you provide the specific usage of this as an example of how it will be used. Also are there tests that can be added to verify this behavior?
As a note, we are also reaching out to get some input from AWS folks and unsure about their response to this as codeowners.
@JamieDanielson thanks for the update.
The scenario we have is where a lambda uses an SQS event source mapping where the batch size > 1. The messages on the queue may have come from different parent traces. The current instrumentation looks at the Message for a parent trace id. The parent ID is saved on each message in the batch so when there is more than one unique parent trace id we lose the ability to link to a parent ID with respect to the current trace happening for the lambda invocation.
Using links was going to be our way to associate a message to a parent transaciton in these cases. Using the instrumentation approach is nice as it allows us to separate the code from tracing. Been a few issues with bugs in some of the contributed instrumentatino classes, with the isolated approach of Opentelemetry means it is easy to switch on/off.
On thought is to move the trace code to the handling of a individual message, which in the long run may be better, but we do miss a little bit of execution from when the message is handled and the splitting it out. There are some scenarios where we pre-processing message batches before the main execution so in those scenarios links become more useful.
Hope that makes sense?
I'll try to write some tests for what I am trying to describe
@peterabbott can you please share the implementation of the hook that you plan to use in your lambda? is it something specific to your setup or is it a generic logic that is always expected to work?
Since this behavior is part of semantic conventions, I wonder if we should always populate the links in the instrumentation for all users for SQS like aws-sdk instrumentation does, instead of asking to implement hooks as opt-in option.
Using the AwsLambdaInstrumentation and the extractor hook we attempt to extract the propogated trace id from the SQS Records in the batch. This is fixed to SQS and wont apply to many calls. The snippets of relevant code
const sqsEventHandler = (records: SQSRecord[], awsEventContext: AwsContext) => {
const parsedTraces = records
.map((r: SQSRecord): Context | undefined => {
return propagation.extract(ROOT_CONTEXT, r.messageAttributes, getter)
})
.reduce<Record<string, Context>>(
(acc, ctx: Context | undefined) => {
if (!ctx) {
return acc
}
const spanCtx = trace.getSpanContext(ctx)
if (!spanCtx) {
return acc
}
// grouping by parent span trace id
return {...acc, [spanCtx.traceId]: ctx}
},
{}
)
const parentTraces = Object.values(parsedTraces)
// if only one event then we can do proper distributed tracing, otherwise a way to do links
if (parentTraces?.length === 1) {
diag.debug('single parent trace matched', {
parentTraces
})
return parentTraces[0]
}
diag.debug('no or multiple parent traces', {parentTraces})
// TODO: if > 1 add parent span links or using span link extractor suggestion from PR to add the link regardless
return ROOT_CONTEXT
}
///// --------
new AwsLambdaInstrumentation({
eventContextExtractor: (event, awsEventContext): Context => {
// Opinianted to SQS event handler
if (Array.isArray(event?.Records)) {
return sqsEventHandler(event.Records, awsEventContext)
}
// just return the context
return ROOT_CONTEXT
}
}
})
This PR is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days.
@blumamir wondering if there is any movement on this?