powertools-lambda-python
Feature request: Support multi-process / multi-thread in Batch Processors
Use case
Use the SQS Batch Processor to spread record handlers across multiple threads/processes, to avoid bottlenecking on the main thread when the record handler performs CPU-bound work. The context is Lambda functions (1 GB+ memory) doing document processing that involves CPU-intensive tasks such as OCR. Spreading record handling across multiple CPU cores, in Lambda functions that use batching windows and partial item failures, would significantly speed up processing, and we'd prefer to use Powertools for AWS Lambda for that.
Solution/User Experience
A way to use a specific implementation of a Batch Processor which provides an executor to orchestrate the execution of the record handlers across multiple threads/processes (similar to the `AsyncBatchProcessor` implementation). TypeScript support for batch processors and this feature would of course be much appreciated.
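To make the request concrete, here is a minimal sketch of the idea using only the standard library. It is not the Powertools API: `process_batch_in_pool`, `record_handler`, and the `executor` / `max_workers` parameters are hypothetical names chosen for illustration. It assumes the standard SQS event shape (`Records` with `messageId` and `body`) and returns the `batchItemFailures` structure that Lambda expects for partial batch responses.

```python
from concurrent.futures import Executor, ThreadPoolExecutor, as_completed
from typing import Any, Callable, Dict, List, Optional

Record = Dict[str, Any]


def process_batch_in_pool(
    records: List[Record],
    record_handler: Callable[[Record], Any],
    executor: Optional[Executor] = None,  # hypothetical: caller may inject any Executor
    max_workers: Optional[int] = None,
) -> Dict[str, List[Dict[str, str]]]:
    """Run record_handler for every record in a pool and report failed records."""
    owns_executor = executor is None
    pool = ThreadPoolExecutor(max_workers=max_workers) if owns_executor else executor
    failures: List[Dict[str, str]] = []
    try:
        # Map each future back to the SQS message ID it belongs to.
        futures = {
            pool.submit(record_handler, record): record["messageId"]
            for record in records
        }
        for future in as_completed(futures):
            # A handler that raised marks only its own record as failed.
            if future.exception() is not None:
                failures.append({"itemIdentifier": futures[future]})
    finally:
        # Only shut down a pool that this function created.
        if owns_executor:
            pool.shutdown(wait=True)
    return {"batchItemFailures": failures}


def record_handler(record: Record) -> str:
    # Stand-in for CPU-heavy work such as OCR.
    body = record["body"]
    if body == "bad":
        raise ValueError("cannot process record")
    return body.upper()


def lambda_handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    return process_batch_in_pool(event["Records"], record_handler)
```

With threads, the speed-up only materialises when the handler releases the GIL (for example, inside a native OCR library). Passing a process-based `Executor` instead would require a picklable, module-level handler, and whether process pools work in the Lambda runtime would need to be verified.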
Alternative solutions
No response
Acknowledgment
- [X] This feature request meets Powertools for AWS Lambda (Python) Tenets
- [X] Should this be considered in other Powertools for AWS Lambda languages? i.e. Java, TypeScript, and .NET
Hi @HQarroum! Thank you for opening this issue.
This is a very interesting case for improving the `BatchProcessor` utility. We already have the `AsyncBatchProcessor` to process multiple records concurrently, but having a multi-threaded processor would really be a game changer.
We need a few days to think about it and see if there are any blockers to implementing this. Once we have an update, I'll come back here and update this issue, okay?
Thanks
Sure! Many thanks @leandrodamascena for your awesome support! 👍
Hi @HQarroum, wanted to mention that batch processing has been released in beta for TypeScript! You can see the release notes here.
It does not have a multi-threaded processor as you suggested here, but it supports most of the same features as the Python version, including the `AsyncBatchProcessor`.
Thanks @erikayao93, that's awesome 👍! I've implemented my own batch processor in TypeScript so far, and I'm going to replace it with the new `AsyncBatchProcessor`. I'll let you know if there are any issues.
Great to hear that! Feel free to open issues on the TypeScript repo for bug reports or if you have ideas for new features.