Feature request: Support multi-process / multi-thread in Batch Processors

Open HQarroum opened this issue 1 year ago • 5 comments

Use case

Use the SQS Batch Processor to spread record handlers across multiple threads/processes, avoiding main-thread bottlenecks when the record handler performs CPU-bound tasks. The context is Lambda functions (1 GB+ memory) doing document processing that involves CPU-intensive tasks such as OCR. In Lambdas that support batching windows and partial item failures, spreading tasks across multiple CPU cores would significantly speed up processing, and we'd prefer to use Lambda Powertools for that.

Solution/User Experience

A dedicated Batch Processor implementation that provides an executor to orchestrate the execution of record handlers across multiple threads/processes (similar to the AsyncBatchProcessor implementation). TypeScript support for batch processors and for this feature would of course be much appreciated.
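To make the request concrete, here is a minimal sketch of the idea using only the Python standard library. It is not the Powertools API: `process_batch_in_parallel` and `example_handler` are hypothetical names made up for illustration. The only parts taken from Lambda itself are the SQS record field `messageId` and the partial batch response shape (`batchItemFailures` / `itemIdentifier`).

```python
from concurrent.futures import Executor, ThreadPoolExecutor
from typing import Any, Callable, Dict, List


def process_batch_in_parallel(
    records: List[Dict[str, Any]],
    record_handler: Callable[[Dict[str, Any]], Any],
    executor: Executor,
) -> Dict[str, List[Dict[str, str]]]:
    """Hypothetical helper: run record_handler on every record via the executor
    and report the records whose handler raised as partial batch failures."""
    # Submit everything up front so the executor can run handlers concurrently.
    futures = [
        (record["messageId"], executor.submit(record_handler, record))
        for record in records
    ]

    failures = []
    for message_id, future in futures:
        try:
            future.result()  # re-raises any exception from the handler
        except Exception:
            failures.append({"itemIdentifier": message_id})

    # Shape of the SQS partial batch response expected by Lambda.
    return {"batchItemFailures": failures}


def example_handler(record: Dict[str, Any]) -> str:
    """Hypothetical record handler standing in for a CPU-heavy task such as OCR."""
    if record["body"] == "bad":
        raise ValueError("cannot process record")
    return record["body"].upper()


if __name__ == "__main__":
    sample = [
        {"messageId": "1", "body": "ok"},
        {"messageId": "2", "body": "bad"},
    ]
    with ThreadPoolExecutor(max_workers=2) as pool:
        print(process_batch_in_parallel(sample, example_handler, pool))
```

The helper accepts any `concurrent.futures.Executor`, so a thread pool and a process pool are interchangeable at the call site. Threads only help CPU-bound work when the handler releases the GIL (as many native libraries do). A process pool avoids the GIL but requires the handler and records to be picklable, and whether it works in a given Lambda runtime is worth verifying before relying on it.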

Alternative solutions

No response

HQarroum avatar Jun 19 '23 11:06 HQarroum

Hi @HQarroum! Thank you for opening this issue.

This is a very interesting case for improving the BatchProcessor utility. We already have the AsyncBatchProcessor to process multiple records concurrently, but a multi-threaded processor would be a real game changer.

We need a few days to think about it and see whether there are any blockers to implementing this. Once we have an update, I'll come back here and update this issue, okay?

Thanks

leandrodamascena avatar Jun 20 '23 16:06 leandrodamascena

Sure! Many thanks @leandrodamascena for your awesome support! 👍

HQarroum avatar Jun 24 '23 00:06 HQarroum

Hi @HQarroum, I wanted to mention that batch processing has been released in beta for TypeScript! You can see the release notes here.

It does not have a multi-threaded processor as you suggested here, but it supports most of the same features as the Python version, including the AsyncBatchProcessor.

erikayao93 avatar Aug 07 '23 22:08 erikayao93

Thanks @erikayao93, that's awesome 👍 ! I've implemented my own batch processor in TypeScript so far, and I'm going to replace it with the new AsyncBatchProcessor. I'll let you know if there are any issues.

HQarroum avatar Aug 08 '23 17:08 HQarroum

Great to hear that! Feel free to open issues on the TypeScript repo for bug reports or if you have ideas for new features.

erikayao93 avatar Aug 08 '23 17:08 erikayao93