powertools-lambda-typescript

Feature request: correlation ID propagation

Open saragerion opened this issue 3 years ago • 13 comments

Description of the feature request

Problem statement

Correlation IDs can be extremely helpful for debugging: they help developers understand the lifecycle of user transactions as those transactions are handled by different microservices within a platform. It would be good to help developers understand how to use correlation IDs effectively and in line with best practices, potentially allowing them to bring their own custom correlation IDs and, most importantly, to propagate correlation IDs through the different utilities to the external dependencies of a service. This could be achieved through a new dedicated utility, new features within all utilities, examples, and/or documentation.

For the scope of this issue, we can identify two types of correlation IDs:

  1. Unique transaction IDs that are set by AWS (for example, the X-Ray Trace ID or the AWS Request ID)
  2. User-defined correlation IDs that can be stored and passed along between microservices:
  • SNS - Message attributes https://github.com/awslabs/aws-lambda-powertools-typescript/blob/main/tests/resources/events/aws/sns-notification.json#L18
  • CloudFront - Request headers https://github.com/awslabs/aws-lambda-powertools-typescript/blob/main/tests/resources/events/aws/cloudfront-modify-querystring.json#L13
  • API Gateway - Headers https://github.com/awslabs/aws-lambda-powertools-typescript/blob/main/tests/resources/events/aws/apigateway-aws-proxy.json#L21
  • Others...
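To illustrate where user-defined IDs live in the event sources listed above, here is a minimal sketch that reads a correlation ID from API Gateway headers and from SNS message attributes. The key name `x-correlation-id` and the helper names are assumptions made for this example; the issue does not prescribe them, and only the event fields the sketch reads are modelled.

```typescript
// Minimal event shapes: only the fields this sketch reads.
interface SnsRecord {
  Sns: { MessageAttributes?: Record<string, { Type: string; Value: string }> };
}
type HeaderMap = Record<string, string | undefined>;

// Hypothetical header/attribute name, chosen for illustration only.
const CORRELATION_KEY = 'x-correlation-id';

// HTTP header names are case-insensitive, so compare in lower case.
function fromHeaders(headers?: HeaderMap | null): string | undefined {
  if (!headers) return undefined;
  for (const name of Object.keys(headers)) {
    if (name.toLowerCase() === CORRELATION_KEY) return headers[name];
  }
  return undefined;
}

// SNS carries user-defined values in Records[].Sns.MessageAttributes.
function fromSnsRecord(record: SnsRecord): string | undefined {
  return record.Sns.MessageAttributes?.[CORRELATION_KEY]?.Value;
}
```

Each trigger stores the value in a different place, which is why a shared utility would need one small extractor per event source.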

Summary of the feature

All utilities would be able to fetch, out of the box, the correlation IDs coming from each AWS service (Lambda trigger) and propagate them to logs, metrics, traces, and so on. This functionality should not necessarily be enabled by default, but developers should be able to turn it on if they need it. It should also be possible to define custom correlation IDs and to propagate and use them in the different utilities.

Implement this logic in all utilities. Research is needed to find the best implementation strategy and to avoid code repetition.

Code examples

TBC. Happy to receive suggestions on this one.

Benefits for you and the wider AWS community

As discussed in a past meeting with @gsingh1 @loujaybee and @simontabor, fetching and propagating correlation IDs, including custom user-defined ones, can be especially relevant for developers who operate at scale within big organisations, where a high number of teams are responsible for microservices that communicate with each other.

Describe alternatives you've considered

None that come to mind, apart from writing the code yourself.

Additional context

See here for a brief definition of a correlation ID.

Related issues, RFCs

Not at the moment.

saragerion avatar Jul 23 '21 12:07 saragerion

In the case of tracing, should this ID be in the name of the segment or an annotation on the segment?

dreamorosi avatar Jul 23 '21 13:07 dreamorosi

Thanks for asking @dreamorosi, I envision it/them as an annotation(s) of the segment

saragerion avatar Jul 26 '21 08:07 saragerion

My thoughts on the problem statement:

how they can use correlation ID's effectively and following the best practices

I don't think I'm familiar with the best practices yet. Will collect more info on the way. If you know any good links, please link them in this issue :)

potentially allowing them to bring their own custom correlation ID's

This sounds like we can start with an MVP where we generate our own correlation ID, and then expand to allow a custom ID.

achieved through the implementation of a new dedicated utilities

This sounds like we would have a package like powertools/correlation. When reading this issue I first thought about some relation to the logger utility, as that's where I'd expect the correlation ID to be printed so that customers can use it. But tracing also makes sense when you want to follow the path of a request. Metrics as well, since you want to know whether a particular request caused, e.g., latency spikes. Looks like I'm back to agreeing that this requires a cross-utility approach.

bahrmichael avatar Jul 28 '21 12:07 bahrmichael

Updated the comment above, as it accidentally showed my comment as part of the quote.

bahrmichael avatar Aug 03 '21 14:08 bahrmichael

For metrics I think outputting the correlation ID into metadata makes sense.

For logs I'd follow the correlation example in additional-keys.ts.

bahrmichael avatar Aug 13 '21 10:08 bahrmichael

See here: https://github.com/getndazn/dazn-lambda-powertools#did-you-consider-monkey-patching-the-clients-instead https://github.com/getndazn/chaos-squirrel/blob/master/packages/attack-http-requests/src/index.ts#L25

saragerion avatar Aug 20 '21 13:08 saragerion

In our sync Lou raised the idea of monkey patching:

  • https://github.com/getndazn/dazn-lambda-powertools#did-you-consider-monkey-patching-the-clients-instead
  • https://github.com/getndazn/chaos-squirrel/blob/master/packages/attack-http-requests/src/index.ts#L25
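
As a rough illustration of the monkey-patching idea, the sketch below wraps a client's `request` method so that every outgoing call carries the current correlation IDs as headers. `HttpClientLike`, `RequestOptions`, and `patchClient` are hypothetical names for this example and do not correspond to the AWS SDK or to the linked projects.

```typescript
// Hypothetical client shape, for illustration only.
interface RequestOptions {
  headers?: Record<string, string>;
}
interface HttpClientLike {
  request(url: string, options?: RequestOptions): unknown;
}

// Replaces client.request with a wrapper that merges correlation IDs into
// the headers. Returns a function that restores the original method.
function patchClient(
  client: HttpClientLike,
  getIds: () => Record<string, string>
): () => void {
  const original = client.request;
  client.request = (url: string, options: RequestOptions = {}): unknown =>
    original.call(client, url, {
      ...options,
      // Headers set explicitly by the caller win over correlation IDs.
      headers: { ...getIds(), ...options.headers },
    });
  return () => {
    client.request = original;
  };
}
```

Returning a restore function keeps the patch reversible, which matters in tests and when the feature is opt-in.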

bahrmichael avatar Aug 20 '21 13:08 bahrmichael

Thought some more about a good approach, and here is my current understanding of a well-rounded one. This might repeat some of the initial post from @saragerion.

Opt In

Correlation IDs should require opt in. As a customer I don't want the utility to just forward headers to other places. Instead I want to explicitly name correlation IDs, or allow a default set of correlation ID names.

Examples:

  • With a middleware setup, I can enable the default AWS correlation IDs: .use(enableCorrelationIds({awsDefaults: true}))
  • With a middleware setup, I can also pick my own: .use(enableCorrelationIds({customIds: ['X_CORRELATE_ID'] }))

As a result we would need a configurable function/constructor, which accepts awsDefaults: boolean and customIds: string[].
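
A minimal sketch of how such a configuration could be resolved into the list of ID names to track. The option names `awsDefaults` and `customIds` come from the examples above; the `resolveCorrelationIdNames` helper and the two default names are placeholders invented for this sketch.

```typescript
interface CorrelationIdOptions {
  awsDefaults?: boolean;
  customIds?: string[];
}

// Placeholder names for the AWS-provided IDs; the real set is undecided.
const AWS_DEFAULT_ID_NAMES: readonly string[] = ['awsRequestId', 'xRayTraceId'];

// Opt-in by design: with no options, nothing is tracked.
function resolveCorrelationIdNames(options: CorrelationIdOptions = {}): string[] {
  const names: string[] = [];
  if (options.awsDefaults === true) {
    names.push(...AWS_DEFAULT_ID_NAMES);
  }
  for (const id of options.customIds ?? []) {
    // Skip duplicates so each ID is tracked once.
    if (names.indexOf(id) === -1) names.push(id);
  }
  return names;
}
```

A middleware or decorator could call this once at setup time and then only read or forward the names it returns.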

ID Population

The function code should be able to add correlation IDs at any time during the function's request handling. Correlation IDs change from request to request, so I don't think they are available during function initialization.

Examples:

  • Another service calls mine, with a X_CORRELATE_ID header which my function should forward.
  • My service initiates a request chain (e.g. after being called from a cron), and generates the first X_CORRELATE_ID which it should then pass to any other services.

To achieve this I think we need some in-memory storage that lives outside of the function calls. From Java I know the Mapped Diagnostic Context (MDC), and I'm not sure whether something similar exists in Node.

The npm package correlation-id uses AsyncLocalStorage from the Node.js core async_hooks module.

This class is used to create asynchronous state within callbacks and promise chains. It allows storing data throughout the lifetime of a web request or any other asynchronous duration. It is similar to thread-local storage in other languages.

This seems to be exactly what I'm looking for.

To let anyone populate correlation IDs, the utility should expose methods to manage them. That way the middleware can add incoming correlation headers, logging can print correlation IDs, and customers can clear correlation IDs if they wish. I will look into the approaches of the Logging and Metrics utilities, and try to follow their existing management approaches. Maybe there will also be some synergies.
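
A minimal sketch of such management methods on top of Node's `AsyncLocalStorage`. The function names (`runWithCorrelationIds`, `setCorrelationId`, `getCorrelationId`, `clearCorrelationIds`) are hypothetical and not part of any existing utility.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

// One Map of correlation IDs per request scope.
const storage = new AsyncLocalStorage<Map<string, string>>();

// Runs fn with a fresh, empty set of correlation IDs. A middleware could
// wrap each invocation of the handler in this call.
function runWithCorrelationIds<T>(fn: () => T): T {
  return storage.run(new Map<string, string>(), fn);
}

// Outside a scope there is no store, so these calls are no-ops.
function setCorrelationId(name: string, value: string): void {
  storage.getStore()?.set(name, value);
}

function getCorrelationId(name: string): string | undefined {
  return storage.getStore()?.get(name);
}

function clearCorrelationIds(): void {
  storage.getStore()?.clear();
}
```

Because the store is bound to the asynchronous scope, code deep in a promise chain can read the IDs without them being passed as arguments.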

Why outside the function calls?

Correlation IDs are not relevant to a function invocation, but are just passed along on the side as helpful diagnostics information. They usually don't influence functions.

Injecting the Correlation ID

At any point in a request we should be able to use the correlation IDs, e.g. for printing logs or forwarding them to other services.

Therefore the correlation ID utility should provide a way to retrieve all available correlation IDs, based on the initial config during middleware- or annotation-based setup.

There could be a method with the following signature, which allows retrieving all correlation IDs, or a subset:

function listCorrelationIds(names?: string[]): { [key: string]: string }[]

We can then use this function in logging, metrics, and monkey-patching to add more information.
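
One possible reading of that signature, sketched with a plain in-memory record standing in for the real storage. The `currentIds` record and its contents are assumptions for the example, as is the interpretation that each array element is a single-key object.

```typescript
// Stand-in for the real per-request storage; values are sample data.
const currentIds: Record<string, string> = {
  X_CORRELATE_ID: 'abc-123',
  awsRequestId: 'req-456',
};

// Returns every known ID, or only the requested subset, as single-key objects.
function listCorrelationIds(names?: string[]): { [key: string]: string }[] {
  const known = Object.keys(currentIds);
  const selected =
    names === undefined
      ? known
      : known.filter((name) => names.indexOf(name) !== -1);
  return selected.map((name) => ({ [name]: currentIds[name] ?? '' }));
}
```

Names that were requested but never populated are simply left out of the result rather than returned as empty values.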

bahrmichael avatar Sep 16 '21 15:09 bahrmichael

@saragerion can we implement something like this for TypeScript too? This is similar to how the Python version allows logging the event along with a correlation ID extracted from a JSON path.

import { Logger, CorrelationPaths } from "@aws-lambda-powertools/logger";
import { LambdaInterface } from '@aws-lambda-powertools/commons';

const logger = new Logger();

class Lambda implements LambdaInterface {
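    // Proposed usage: options are passed as an object, since TypeScript has no keyword arguments
    @logger.injectLambdaContext({ correlationIdPath: CorrelationPaths.API_GATEWAY_REST, logEvent: true })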
    public async handler(_event: any, _context: any): Promise<void> {
        logger.info("This is an INFO log with some context");
    }
}

export const func = new Lambda();
export const handler = func.handler.bind(func); // bind so that `this` is preserved inside the decorated method
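
To show what a path-based lookup could do under the hood, here is a minimal sketch that resolves a dotted path against the incoming event. The Python version uses JMESPath expressions; this sketch handles plain dotted keys only, and `resolvePath` and this `CorrelationPaths` object are hypothetical stand-ins, with the path value mirroring the Python constant of the same name.

```typescript
// Stand-in for the proposed CorrelationPaths constants.
const CorrelationPaths = {
  API_GATEWAY_REST: 'requestContext.requestId',
} as const;

// Walks the event one key at a time; returns undefined if any step is
// missing or the final value is not a string.
function resolvePath(event: unknown, path: string): string | undefined {
  let current: unknown = event;
  for (const key of path.split('.')) {
    if (typeof current !== 'object' || current === null) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return typeof current === 'string' ? current : undefined;
}
```

A decorator or middleware could call `resolvePath(event, CorrelationPaths.API_GATEWAY_REST)` once per invocation and attach the result to every log line.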

michaelbrewer avatar Jan 16 '22 04:01 michaelbrewer

This issue has not received a response in 2 weeks. If you still think there is a problem, please leave a comment to avoid the issue from automatically closing.

github-actions[bot] avatar Mar 25 '23 00:03 github-actions[bot]

Hi everyone, if you arrive on this issue because you are interested in this feature or because you are looking to migrate from dazn-lambda-powertools we would like to hear from you!

We are still considering this feature but we need more examples and to gather requirements to build an RFC.

If you are interested, please leave a comment describing briefly your use case and how the correlation ID feature would work in terms of experience. It's fine also to point towards existing resources.

If uncomfortable with sharing your use case publicly, you can also reach out to us privately at [email protected].

dreamorosi avatar Aug 02 '23 15:08 dreamorosi

@dreamorosi my use case is functions that run under AWS EventBridge Scheduler. It would be great if I could set a custom or user-defined correlation ID and pass it along through DynamoDB.

However, if you have any ideas for tracking transactions across functions that are triggered by EventBridge, I would welcome them. Thanks :)

udomsak avatar Aug 27 '23 04:08 udomsak

Have you considered using the Baggage standard (or something close to it) for this implementation? https://www.w3.org/TR/baggage/
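
For reference, a minimal sketch of carrying correlation IDs in a Baggage-style header value: comma-separated `key=value` members with percent-encoded values. It is a simplified reading of the format, not a complete implementation: it assumes keys are already valid tokens, drops any `;`-separated member properties when parsing, and does not guard against malformed percent-encoding. The function names are hypothetical.

```typescript
// Builds a header value such as "x-correlation-id=abc%20123,tenant=acme".
function serializeBaggage(ids: Record<string, string>): string {
  return Object.keys(ids)
    .map((key) => `${key}=${encodeURIComponent(ids[key] ?? '')}`)
    .join(',');
}

// Parses a header value back into key/value pairs.
function parseBaggage(header: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const member of header.split(',')) {
    // Everything after the first ';' is member properties; ignored here.
    const pair = member.split(';')[0] ?? '';
    const eq = pair.indexOf('=');
    if (eq === -1) continue;
    const key = pair.slice(0, eq).trim();
    const value = pair.slice(eq + 1).trim();
    if (key !== '') result[key] = decodeURIComponent(value);
  }
  return result;
}
```

A single well-known header like this would let several IDs travel together instead of requiring one custom header per ID.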

alfaproject avatar Sep 29 '23 15:09 alfaproject