powertools-lambda-python

Feature request: Add support for logging the idempotency key

Open ChrisHills463 opened this issue 1 year ago • 10 comments

Use case

Sometimes there may be a bug in an application that calls an API, and to resolve the bug it is necessary to delete the cached response. At present, there is no indication of the key, so the record is difficult to find when there are many records.

Solution/User Experience

I would like a hook that is called when a new idempotency record is created.

Alternative solutions

Right now I have to manually go through the records to find the right one.

Acknowledgment

ChrisHills463 avatar Nov 22 '24 08:11 ChrisHills463

I would be interested in this feature as well. The idempotency utility has another use where you can use the idempotency store as a cache. If you have an "expensive" function call, storing the output in the idempotency store is a really nice way to cache the response but there is no way to evict the item before the TTL.

An alternative to a hook on storing the item (at least for my use case) would be allowing a completely custom key so we can evict items from the idempotency store from outside of the lambda if we need.

TonySherman avatar Nov 25 '24 17:11 TonySherman

Hi there - I worked on the original code for the idempotency hook. I feel you have everything you need already, but do correct me if I am missing something or if what exists does not meet your requirements.

Looking at the implementation, the hook gets called after the idempotent record has been loaded and processed. This is a "hook" and not a "wrapper", so the behaviour is implemented as follows:

        if data_record.status == STATUS_CONSTANTS["EXPIRED"]:
            raise IdempotencyInconsistentStateError("save_inprogress and get_record return inconsistent results.")
        if data_record.status == STATUS_CONSTANTS["INPROGRESS"]:
            if data_record.in_progress_expiry_timestamp is not None and data_record.in_progress_expiry_timestamp < int(
                datetime.datetime.now().timestamp() * 1000,
            ):
                raise IdempotencyInconsistentStateError(
                    "item should have been expired in-progress because it already time-outed.",
                )
            raise IdempotencyAlreadyInProgressError(
                f"Execution already in progress with idempotency key: "
                f"{self.persistence_store.event_key_jmespath}={data_record.idempotency_key}",
            )

        response_dict = data_record.response_json_as_dict()
        serialized_response = self.output_serializer.from_dict(response_dict) if response_dict else None

        if self.config.response_hook:
            logger.debug("Response hook configured, invoking function")
            return self.config.response_hook(serialized_response, data_record)

        return serialized_response

So the response hook will only be called under the following circumstances:

data_record.status === STATUS_CONSTANTS["INPROGRESS"] - When a new idempotent record is created
data_record.status === STATUS_CONSTANTS["COMPLETE"] - When a cached result is returned

Since this is a hook the exception handling in handle_for_status() takes precedence and the hook will be skipped if exceptions are raised. The hook will then run only for valid status values as outlined above.

The IdempotentDataRecord:

        Parameters
        ----------
        idempotency_key: str
            hashed representation of the idempotent data
        status: str, optional
            status of the idempotent record
        expiry_timestamp: int, optional
            time before the record should expire, in seconds
        in_progress_expiry_timestamp: int, optional
            time before the record should expire while in the INPROGRESS state, in seconds
        payload_hash: str, optional
            hashed representation of payload
        response_data: str, optional
            response data from previous executions using the record
        """
        self.idempotency_key = idempotency_key
        self.payload_hash = payload_hash
        self.expiry_timestamp = expiry_timestamp
        self.in_progress_expiry_timestamp = in_progress_expiry_timestamp
        self._status = status
        self.response_data = response_data

walmsles avatar Nov 27 '24 00:11 walmsles

you can use the idempotency store as a cache. If you have an "expensive" function call, storing the output in the idempotency store is a really nice way to cache the response but there is no way to evict the item before the TTL.

An alternative to a hook on storing the item (at least for my use case) would be allowing a completely custom key so we can evict items from the idempotency store from outside of the lambda if we need.

I am wondering whether idempotency is the right solution for this use case. It's more of a caching use case, and enabling key "munging" is dangerous because idempotency needs to be handled differently for the function use case and the Lambda invoke use case. I feel these use cases and responsibilities are not something to group together.

You have the option to inherit your own IdempotencyHandler from the Powertools for AWS Lambda one and use it for a cache use case. Now that I have typed this, I feel it would be useful as an RFC for a caching-style utility, which could be the core of idempotency, with idempotency as an inherited special case of caching. Then we would have something really clean and useful for all use cases without munging up responsibilities. What do you think, @TonySherman?

walmsles avatar Nov 27 '24 00:11 walmsles

@walmsles I think you're absolutely right. Using idempotency as a cache was a quick fix, but it might be a square-peg-in-a-round-hole situation. I was also chatting with @leandrodamascena in a little more detail about my use case. I think a cache utility based on the current idempotency core could be very powerful. Let me know if I can make a feature request or provide any more information that would help!

TonySherman avatar Nov 27 '24 16:11 TonySherman

I think a cache utility based on the current idempotency core could be very powerful. Let me know if I can make a feature request or provide any more information that would help!

@TonySherman let's make this an RFC and see what other customers think so we can meet all the use cases

walmsles avatar Nov 27 '24 19:11 walmsles

Putting this "on hold" until we make some progress.

leandrodamascena avatar Jan 20 '25 12:01 leandrodamascena

I must apologise: what I wrote above about the hook is not correct, and @ChrisHills463's issue still stands. The idempotency response hook is only called when an idempotent response is being sent. It is triggered only when returning a cached response, and that status will always be "COMPLETE"; any other value is a potential error case or an indication that the transaction is still being processed.
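
For the cached-response case, where the hook does fire, the key can already be logged from the hook. The following is a minimal sketch only: it uses a stand-in record class so it runs on its own, and the hook signature mirrors the `self.config.response_hook(serialized_response, data_record)` call quoted earlier in this thread. `FakeRecord` and `log_idempotency_key` are hypothetical names, not part of Powertools.

```python
import logging
from dataclasses import dataclass
from typing import Any

logger = logging.getLogger("idempotency-demo")


@dataclass
class FakeRecord:
    """Stand-in for the library's data record (hypothetical, illustration only)."""

    idempotency_key: str


def log_idempotency_key(response: Any, record: FakeRecord) -> Any:
    """Log the key of the cached record and return the response unchanged."""
    logger.info("Returning cached response for idempotency key %s", record.idempotency_key)
    return response


# Simulate the handler invoking the hook after it finds a completed record.
cached = {"order_id": 123}
result = log_idempotency_key(cached, FakeRecord(idempotency_key="orders#abc123"))
```

In the library itself, a function of this shape would be supplied as the `response_hook` of the idempotency config. It would still not run when a new record is created, which is the gap this issue describes.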

We need to consider whether a function hook is needed for the create case, or whether Powertools could add idempotency attributes to either the logger's additional keys or the request context, describing the idempotent nature of the request.

walmsles avatar Feb 17 '25 05:02 walmsles

I had some discussion around this functionality with @leandrodamascena and we discussed a potential solution to this problem.

It's quite clear there are some cases where users need control over idempotency records. I don't know how this would work with a Redis store, but for DynamoDB there could be an additional configuration option to store the idempotency key:

@idempotent_function(
    data_keyword_argument="my_idempotency_key",
    config=config,
    persistence_store=persistence_layer,
    store_idempotency_key=True,
)

Then the original key can be stored in DynamoDB (this would require a GSI on that attribute). Having the key stored in plaintext would allow the record to be deleted manually as needed.

Hopefully I am explaining this right but I believe the flow would be something like this:

  1. request comes in and is stored in dynamo (with the key stored as well)
  2. another action requires the stored data to be deleted
  3. delete the stored data by the idempotency key
  4. next request will come in and there is no stored response, execution flows as normal
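
The four steps above can be sketched with an in-memory stand-in for the table. This is an illustration under stated assumptions, not the Powertools persistence layer: `InMemoryIdempotencyStore`, `run_idempotent`, and the `original_key` attribute are hypothetical, and the linear scan stands in for what would be a GSI query in DynamoDB.

```python
import hashlib
from typing import Callable, Dict, Optional


class InMemoryIdempotencyStore:
    """Toy stand-in for the DynamoDB table (hypothetical, illustration only)."""

    def __init__(self) -> None:
        # hashed key -> {"original_key": ..., "response": ...}
        self._items: Dict[str, dict] = {}

    @staticmethod
    def _hash(key: str) -> str:
        return hashlib.sha256(key.encode()).hexdigest()

    def get(self, key: str) -> Optional[str]:
        item = self._items.get(self._hash(key))
        return item["response"] if item else None

    def put(self, key: str, response: str) -> None:
        # Store the plaintext key next to the hashed one so it can be found later.
        self._items[self._hash(key)] = {"original_key": key, "response": response}

    def delete_by_original_key(self, key: str) -> int:
        """Delete records by plaintext key and return how many were removed."""
        hashes = [h for h, item in self._items.items() if item["original_key"] == key]
        for h in hashes:
            del self._items[h]
        return len(hashes)


def run_idempotent(store: InMemoryIdempotencyStore, key: str, func: Callable[[], str]) -> str:
    """Return the stored response if present, otherwise execute and store it."""
    cached = store.get(key)
    if cached is not None:
        return cached
    response = func()
    store.put(key, response)
    return response


store = InMemoryIdempotencyStore()
calls = []


def expensive() -> str:
    calls.append(1)
    return f"result-{len(calls)}"


first = run_idempotent(store, "customer-42", expensive)   # step 1: executed and stored
second = run_idempotent(store, "customer-42", expensive)  # served from the store
deleted = store.delete_by_original_key("customer-42")     # steps 2-3: manual eviction
third = run_idempotent(store, "customer-42", expensive)   # step 4: executes again
```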

TonySherman avatar Jul 29 '25 18:07 TonySherman

That sounds good!

ChrisHills463 avatar Jul 30 '25 08:07 ChrisHills463

This idea will work and provide the outcome you need; however, I feel a more elegant solution is possible by providing the following two things:

  1. A real Cache utility that is actually for caching and works along similar lines to the existing persistence layer of the idempotency utility (see #7069)
  2. A new capability in the Powertools library to publish selected internal library events, so that customers can subscribe their own hook functions and add custom behaviour when internal actions happen deep within the Powertools library (see #7071)

I believe item 1 will meet the requirements outlined in this feature request without necessitating customer additions or extensions to the existing Idempotency storage infrastructure and isolate the functionality to a class dedicated to caching and not idempotency.
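
To make the distinction concrete, here is a rough sketch of the behaviour a dedicated cache would offer that idempotency does not: a caller-chosen key and explicit eviction before the TTL. It is not the design proposed in #7069; `TTLCache`, `get_or_compute`, and `evict` are hypothetical names, and the store is a plain in-memory dict.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """In-memory cache with caller-chosen keys, a TTL, and explicit eviction."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (expiry time, cached value)
        self._items: Dict[str, Tuple[float, Any]] = {}

    def get_or_compute(self, key: str, compute: Callable[[], Any]) -> Any:
        """Return the cached value if still fresh, otherwise compute and store it."""
        now = self._clock()
        entry = self._items.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = compute()
        self._items[key] = (now + self._ttl, value)
        return value

    def evict(self, key: str) -> bool:
        """Remove an entry before its TTL; return True if something was removed."""
        return self._items.pop(key, None) is not None


cache = TTLCache(ttl_seconds=300)
hits = []


def load_report() -> dict:
    hits.append(1)
    return {"rows": len(hits)}


a = cache.get_or_compute("report:2024", load_report)  # computed
b = cache.get_or_compute("report:2024", load_report)  # served from cache
evicted = cache.evict("report:2024")                  # explicit eviction
c = cache.get_or_compute("report:2024", load_report)  # computed again
```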

Caching !== Idempotency (just saying).

If you feel these will better meet your requirements, go over and add a 👍 to the issues to encourage me to get on with development.

walmsles avatar Jul 30 '25 12:07 walmsles