powertools-lambda-python

Feature request: On batch processing, fill in processor result even if BatchProcessingError is raised

Open nico00 opened this issue 1 year ago • 5 comments

Use case

According to the documentation (https://docs.powertools.aws.dev/lambda/python/latest/utilities/batch/#partial-failure-mechanics), BatchProcessingError is raised when all records fail to be processed. In that case, the processor response appears empty, as if all records had been successfully processed. Having the processor response filled with the list of failed records would help in reprocessing them.

Solution/User Experience

I suggest that the processor response be compiled before raising BatchProcessingError (class BasePartialBatchProcessor). This would give the programmer the freedom to decide what to do according to various business cases.

Current approach:

    def _clean(self):
        """
        Report messages to be deleted in case of partial failure.
        """

        if not self._has_messages_to_report():
            return

        if self._entire_batch_failed():
            raise BatchProcessingError(
                msg=f"All records failed processing. {len(self.exceptions)} individual errors logged "
                f"separately below.",
                child_exceptions=self.exceptions,
            )

        messages = self._get_messages_to_report()
        self.batch_response = {"batchItemFailures": messages}

Proposed solution:

    def _clean(self):
        """
        Report messages to be deleted in case of partial failure.
        """

        if not self._has_messages_to_report():
            return

        messages = self._get_messages_to_report()
        self.batch_response = {"batchItemFailures": messages}

        if self._entire_batch_failed():
            raise BatchProcessingError(
                msg=f"All records failed processing. {len(self.exceptions)} individual errors logged "
                f"separately below.",
                child_exceptions=self.exceptions,
            )
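
To illustrate the difference from the caller's side, here is a minimal, self-contained sketch. `SketchProcessor`, its attributes, and the `BatchProcessingError` class defined here are simplified stand-ins written for this example, not the Powertools classes; only the ordering inside `_clean` mirrors the proposal above.

```python
from typing import Callable, Dict, List


class BatchProcessingError(Exception):
    """Simplified stand-in for the Powertools exception (hypothetical)."""

    def __init__(self, msg: str, child_exceptions: List[Exception]):
        super().__init__(msg)
        self.child_exceptions = child_exceptions


class SketchProcessor:
    """Hypothetical, stripped-down processor; not the library implementation."""

    def __init__(self) -> None:
        self.batch_response: Dict[str, List[Dict[str, str]]] = {"batchItemFailures": []}
        self.exceptions: List[Exception] = []
        self.failed_ids: List[str] = []
        self.total = 0

    def process(self, records: List[dict], handler: Callable[[dict], None]) -> None:
        self.total = len(records)
        for record in records:
            try:
                handler(record)
            except Exception as exc:  # collect every failure instead of stopping
                self.exceptions.append(exc)
                self.failed_ids.append(record["messageId"])
        self._clean()

    def _clean(self) -> None:
        if not self.failed_ids:
            return
        # Proposed ordering: fill in the response first...
        self.batch_response = {
            "batchItemFailures": [{"itemIdentifier": i} for i in self.failed_ids]
        }
        # ...and only then raise if every record failed.
        if len(self.failed_ids) == self.total:
            raise BatchProcessingError(
                msg=f"All records failed processing. {len(self.exceptions)} individual errors.",
                child_exceptions=self.exceptions,
            )


def always_fails(record: dict) -> None:
    raise ValueError(f"cannot process {record['messageId']}")


processor = SketchProcessor()
records = [{"messageId": "a"}, {"messageId": "b"}]
failed: List[str] = []
try:
    processor.process(records, always_fails)
except BatchProcessingError as err:
    # With the proposed ordering the response is still available here.
    failed = [item["itemIdentifier"] for item in processor.batch_response["batchItemFailures"]]
    print(failed, len(err.child_exceptions))
```

With the current ordering, the same `except` block would see an empty `batchItemFailures` list, which is the situation described in the use case.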

Alternative solutions

No response

Acknowledgment

nico00 avatar Feb 06 '24 14:02 nico00

Thanks for opening your first issue here! We'll come back to you as soon as we can. In the meantime, check out the #python channel on our Powertools for AWS Lambda Discord: Invite link

boring-cyborg[bot] avatar Feb 06 '24 14:02 boring-cyborg[bot]

Thanks for raising this @nico00.

I guess it depends on what you're trying to do with them – Lambda will auto redrive to reprocess these messages at a service level.

sthulb avatar Feb 07 '24 10:02 sthulb

That's correct, but in that case Lambda is limited to two retries, while a DynamoDB stream allows up to 10,000 retries. On the other hand, I see no downside to filling in the processor result just before raising BatchProcessingError.

nico00 avatar Feb 09 '24 15:02 nico00

hey @nico00, please allow me to ask some clarifying questions

> BatchProcessingError is raised when all records failed to be processed. In such case, processor response appears empty, as all records have been successfully processed.

It's technically a Lambda invocation failure, as recommended by the Lambda team. The Lambda Poller picks up the error and considers the entire batch a failure; there is no empty response in this case.

Are you experiencing an empty response instead of a BatchProcessingError? If so, it'd be a bug/regression on our side.

> This would give the programmer the freedom to decide what to do according to various business cases.

Would you be able to expand with one or more examples to help us picture this better?

I'm trying to understand whether you want to intercept a BatchProcessingError - like you can with the context manager today - or something else entirely?

Thanks a lot!

heitorlessa avatar Feb 20 '24 09:02 heitorlessa

Also, before I forget, thank you for creating a feature request :) We always appreciate hearing from customers and learning what additional use cases can be unblocked (or made easier!) for everyone

heitorlessa avatar Feb 20 '24 09:02 heitorlessa

This feature request was added in the v2.41.0 release.

Docs: https://docs.powertools.aws.dev/lambda/python/latest/utilities/batch/#working-with-full-batch-failures

Closing as completed.
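
For readers landing here: the linked docs section appears to describe an opt-out flag on the processor, `raise_on_entire_batch_failure`. Treat that parameter name as an assumption and confirm it against the docs. The sketch below is a self-contained model of that behaviour, not Powertools code; `process_batch` and `always_fails` are hypothetical helpers made up for illustration.

```python
from typing import Callable, Dict, List


class BatchProcessingError(Exception):
    """Simplified stand-in for the Powertools exception (hypothetical)."""


def process_batch(
    records: List[dict],
    handler: Callable[[dict], None],
    raise_on_entire_batch_failure: bool = True,  # assumed flag name, see note above
) -> Dict[str, List[Dict[str, str]]]:
    """Run handler on each record and build a partial-failure response."""
    failures: List[Dict[str, str]] = []
    for record in records:
        try:
            handler(record)
        except Exception:
            failures.append({"itemIdentifier": record["messageId"]})

    entire_batch_failed = bool(records) and len(failures) == len(records)
    if entire_batch_failed and raise_on_entire_batch_failure:
        raise BatchProcessingError(f"All {len(failures)} records failed processing.")

    # With the flag disabled, a fully failed batch is reported like a partial one.
    return {"batchItemFailures": failures}


def always_fails(record: dict) -> None:
    raise RuntimeError("boom")


records = [{"messageId": "1"}, {"messageId": "2"}]
response = process_batch(records, always_fails, raise_on_entire_batch_failure=False)
print(response)
```

Leaving the flag at its default keeps the behaviour discussed earlier in this thread: a fully failed batch raises instead of returning a response.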

leandrodamascena avatar Aug 11 '24 22:08 leandrodamascena
