powertools-lambda-typescript
Feature request: sequential async processing
Use case
Sometimes I have records that I want processed one at a time, but my processor function happens to be async.
Solution/User Experience
It would be nice to request sequential processing of records with async handlers.
Alternative solutions
No response
Acknowledgment
- [X] This feature request meets Powertools for AWS Lambda (TypeScript) Tenets
- [ ] Should this be considered in other Powertools for AWS Lambda languages? i.e. Python, Java, and .NET
Future readers
Please react with 👍 and your use case to help us understand customer demand.
Hi @revmischa thank you for taking the time to open this issue.
As we discussed in Discord, I think the request is valid and I remember hearing it from other customers in the past few weeks.
I'm adding this to the backlog so that it can be picked up.
If anyone is interested in contributing, please leave a comment so we can discuss an implementation.
I also have a use case where I have an async handler but need to process FIFO events sequentially.
I think it's a safe assumption that most every handler anyone will ever write will be async. How else would you do I/O? And what is the use of a handler that can't do I/O?
That's very fair, we are focused on releasing v2 this & next week.
After that we'll be able to reprise working on new features for the existing utilities. This is one of the issues I'd like to pick up relatively soon.
I can work on this next. I can see there is a section in the doc about Async processing,
If your function is async returning a Promise, use BatchProcessor and processPartialResponse
If your function is not async, use BatchProcessorSync and processPartialResponseSync
So, based on the PR description, do we now want to have the option to use BatchProcessorSync and processPartialResponseSync in an async function? It would be helpful if I could have some more context. @dreamorosi, whenever you are free.
Also just curious, what is the use case for sync processing? You can't really do I/O without async right? So what use is a SQS processing function that can't do any I/O?
Hi @arnabrahman - thank you for reviving the conversation on this feature request.
When we initially ported the Batch Processing utility from the Python version of Powertools for AWS Lambda, we did so mirroring their preferred patterns: meaning we made the synchronous & sequential processor the default, and the asynchronous & parallel one the alternative one.
In hindsight, this was a mistake because - as @revmischa points out - in modern Node.js working with async/await and promises is the de facto standard when dealing with I/O.
In the next release, before the utility was considered generally available, we corrected this: we made the BatchProcessor asynchronous by default and made the sync one the secondary option (BatchProcessorSync).
Currently the async processing only supports processing the items in the batch in parallel (implementation is here).
For example, today you can do this, which will call the recordHandler on each item in the batch in parallel:
import {
  BatchProcessor,
  EventType,
  processPartialResponse,
} from '@aws-lambda-powertools/batch';
import type { SQSRecord, SQSHandler } from 'aws-lambda';

const processor = new BatchProcessor(EventType.SQS);

const recordHandler = async (record: SQSRecord): Promise<void> => {
  // ... do your async processing
};

export const handler: SQSHandler = async (event, context) =>
  processPartialResponse(event, recordHandler, processor, {
    context,
  });
As part of this feature request we should allow customers to also use sequential processing, with a flag similar to this:
import {
  BatchProcessor,
  EventType,
  processPartialResponse,
} from '@aws-lambda-powertools/batch';
import type { SQSRecord, SQSHandler } from 'aws-lambda';

const processor = new BatchProcessor(EventType.SQS);

const recordHandler = async (record: SQSRecord): Promise<void> => {
  // ... do your async processing
};

export const handler: SQSHandler = async (event, context) =>
  processPartialResponse(event, recordHandler, processor, {
    context,
    processInParallel: false, // new flag, name to be confirmed
  });
I'm not 100% sold on the name of the option being processInParallel, but with it I think we should convey the following:
- by default, when using processPartialResponse & BatchProcessor, items are processed in parallel - this is to maintain backwards-compatibility (we might decide to change the default to sequential in the next major version, but that's a separate discussion)
- by using this new option I'm opting out of the default behavior and instead choosing to have the utility call my async record handler sequentially, following the order of the items as they appear in the batch.
Consequently, regardless of the name we choose for the option, we will have to modify the process() method in the BatchProcessor class to check the value of the new option and, when opted out, call & await each promise sequentially.
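To make the branching concrete, here is a rough, self-contained sketch of what "check the option, then either fan out or await one by one" could look like. It is not the library's actual code: processBatch, processRecord, RecordResult and ProcessOptions are hypothetical names made up for illustration, and processInParallel is only the placeholder option name from this thread.

```typescript
// Hypothetical sketch: none of these names are the library's real internals.
type RecordHandler<T> = (record: T) => Promise<unknown>;

type RecordResult<T> =
  | { status: 'success'; record: T; result: unknown }
  | { status: 'fail'; record: T; error: unknown };

interface ProcessOptions {
  // Placeholder name taken from the discussion; not a confirmed option.
  processInParallel?: boolean;
}

// Runs the handler for one record and turns a rejection into a 'fail'
// entry, so one bad record does not abort the rest of the batch.
const processRecord = async <T>(
  record: T,
  handler: RecordHandler<T>
): Promise<RecordResult<T>> => {
  try {
    const result = await handler(record);
    return { status: 'success', record, result };
  } catch (error) {
    return { status: 'fail', record, error };
  }
};

const processBatch = async <T>(
  records: T[],
  handler: RecordHandler<T>,
  options: ProcessOptions = {}
): Promise<RecordResult<T>[]> => {
  // Parallel stays the default to keep backwards compatibility.
  const { processInParallel = true } = options;

  if (processInParallel) {
    // Start every handler immediately, then wait for all of them.
    return Promise.all(
      records.map((record) => processRecord(record, handler))
    );
  }

  // Opted out: wait for each record to settle before starting the next,
  // following the order of the items in the batch.
  const results: RecordResult<T>[] = [];
  for (const record of records) {
    results.push(await processRecord(record, handler));
  }
  return results;
};
```

Note that in this sketch the sequential branch keeps going after a failed record. Whether a sequential mode should instead stop at the first failure (which may matter for FIFO queues) is a separate design question that the sketch does not try to answer.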
Regarding the BatchProcessorSync and processPartialResponseSync, I don't think there will have to be any changes for this new feature to be added.