Zappa
Delayed asynchronous invocation using SFN
Summary
I'd like to open a discussion on the feature of delayed asynchronous task invocation as in the following example:
@task(delay_seconds=1800)
def make_pie():
    """ This task is invoked asynchronously 30 minutes after it is initially run. """
History
I initially created a PR on the old repo with this functionality using SQS as a task queue. See:
- https://github.com/Miserlou/Zappa/pull/1648
- https://github.com/Miserlou/Zappa/issues/1647
- https://github.com/zappa/Zappa/issues/649
- https://github.com/zappa/Zappa/issues/648
Since then we've had the code from the original PR running smoothly in a production environment. We are happy with the solution, but delaying tasks too far ahead in the future (> 1 hour), although technically possible, has a couple of drawbacks:
- Costs increase linearly with the time a task is delayed, as for each delayed invocation, an additional lambda invocation is performed every 15 minutes.
- With sufficient concurrent delayed tasks, this results in lots of concurrent lambda invocations that have no purpose other than rescheduling the task for another 15 minutes.
- When the lambda function experiences downtime, or is being throttled, tasks can accumulate in the queue, resulting in a burst of invocations when the lambda function is back online. This increases stress on the system, possibly bringing it back down.
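To make the first drawback concrete, here is a rough sketch of how many Lambda invocations exist only to re-queue a task. It assumes the SQS approach re-queues the message with the maximum SQS delay (900 seconds) until the target time is reached; `reschedule_invocations` is a hypothetical helper, not part of Zappa or the original PR.

```python
import math

# SQS caps a message's DelaySeconds at 900 seconds (15 minutes).
SQS_MAX_DELAY_SECONDS = 900


def reschedule_invocations(delay_seconds, max_hop=SQS_MAX_DELAY_SECONDS):
    """Count the Lambda invocations that only re-queue the task.

    Assumption: every hop waits at most `max_hop` seconds, and the
    final hop runs the task itself, so it is not counted.
    """
    if delay_seconds <= 0:
        return 0
    hops = math.ceil(delay_seconds / max_hop)
    return hops - 1


print(reschedule_invocations(900))           # 0: fits in a single SQS delay
print(reschedule_invocations(1800))          # 1: one re-queue, then the real run
print(reschedule_invocations(60 * 60 * 24))  # 95: a 24-hour delay
```

Under that assumption the overhead grows linearly with the delay, which matches the cost behaviour described above.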
Proposal
Because of these drawbacks I looked for an alternative to the original SQS-based task delaying and found one in AWS Step Functions (SFN). This service can delay execution using a Wait state for up to 1 year with no cost or performance drawbacks. The only drawback is that the fixed cost per task is lower with SQS than with SFN. This means that tasks delayed for < 15 minutes are cheaper using SQS than using SFN, but for all tasks delayed > 15 minutes, SFN outperforms SQS on all fronts.
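As an illustration of the Wait-state idea, a minimal state machine definition in Amazon States Language might look like the sketch below. The state names, the `delay_seconds` input field, and the example ARN are assumptions made for illustration; this is not the implementation running for the client mentioned in this issue.

```python
import json


def build_delay_state_machine(lambda_arn):
    """Build a two-state definition: wait, then invoke the task Lambda.

    The Wait state reads the delay from the execution input via
    SecondsPath, so one state machine can serve every delay length.
    """
    return {
        "Comment": "Wait for the requested delay, then invoke the task.",
        "StartAt": "Delay",
        "States": {
            "Delay": {
                "Type": "Wait",
                "SecondsPath": "$.delay_seconds",  # hypothetical input field
                "Next": "InvokeTask",
            },
            "InvokeTask": {
                "Type": "Task",
                "Resource": lambda_arn,  # the execution input becomes the payload
                "End": True,
            },
        },
    }


# Placeholder ARN, for illustration only.
definition = build_delay_state_machine(
    "arn:aws:lambda:us-east-1:123456789012:function:example-task"
)
print(json.dumps(definition, indent=2))
```

The resulting JSON string is the kind of value that the `definition` argument of boto3's Step Functions `create_state_machine` call expects. Each delayed task would then be one execution started with an input such as `{"delay_seconds": 1800, ...}`.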
I've currently implemented the basic functionality outside of Zappa for a client organization of mine to test and evaluate the solution and it currently performs admirably, without any notable drawbacks. I'm willing to perform the work to integrate it into Zappa if there is a need for this feature and if there is support from the maintainers to get it merged into the master branch.
However there are some decisions to be made before the feature can be implemented and I'd like some input from the community on this.
Should async_source be more customizable?
Currently, Zappa allows setting a known async_source in the settings. This is by default set to lambda to use direct invocation, but can also be set to sns to use that service as an intermediary. However, in most cases the to-be-introduced sfn async_source is not a good default for all async invocations, only for delayed invocations. The ideal source would be smart, choosing to invoke either lambda or sfn based on the delay. But would we then introduce a sfn_and_lambda async_source? You can see where this will end.
My proposal is to allow setting the async_source setting to an import path, which allows us to add smart implementations. This has the added benefit of users being able to bring their own implementation. The original lambda and sns would be deprecated, but of course still supported for backwards compatibility reasons.
{
    "dev": {
        ..
        "async_source": "zappa.async.LambdaAsync",
        ..
    }
}
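A minimal sketch of how such an import-path setting could be resolved, keeping the legacy short names working. The alias table and both helper names are hypothetical; the dotted paths are the ones used as examples in this issue, not existing Zappa classes.

```python
import importlib

# Hypothetical mapping that keeps the legacy short names working.
LEGACY_ALIASES = {
    "lambda": "zappa.async.LambdaAsync",
    "sns": "zappa.async.SnsAsync",
}


def resolve_async_source(setting):
    """Expand a legacy short name; pass dotted paths through unchanged."""
    return LEGACY_ALIASES.get(setting, setting)


def load_class(dotted_path):
    """Import 'package.module.ClassName' and return the class object."""
    module_path, _, class_name = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError(
            f"Expected 'package.module.ClassName', got {dotted_path!r}"
        )
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


# Demonstrated with a standard-library class, since the proposed
# zappa.async.* classes do not exist yet.
print(resolve_async_source("lambda"))
print(load_class("json.JSONDecoder"))
```

One thing to keep in mind when picking the final module name: `async` is a reserved word from Python 3.7 on, so a statement like `import zappa.async` would not parse even though `importlib.import_module` accepts the string.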
Should async_source also manage the infrastructure?
All the infrastructure that is managed by Zappa for the async_source is currently managed in the Zappa cli schedule and unschedule functions. I'd like to move the bulk of this functionality to the same class that is pointed to by async_source. Again with the added benefit that it is then user customizable.
class LambdaAsync:
    # ...

    def schedule_infrastructure(self):
        # Schedule or update the infrastructure.
        raise NotImplementedError

    def unschedule_infrastructure(self):
        # Unschedule the infrastructure.
        raise NotImplementedError
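To show how the lifecycle could hang together, here is a sketch of a possible base class plus an in-memory implementation. `AsyncSource`, `InMemoryAsync`, and the `send` signature are assumptions for illustration; only the two `*_infrastructure` method names come from the proposal above.

```python
from abc import ABC, abstractmethod


class AsyncSource(ABC):
    """Hypothetical contract for a class referenced by async_source."""

    @abstractmethod
    def send(self, task_path, args, kwargs, delay_seconds=0):
        """Hand a task over for (possibly delayed) invocation."""

    @abstractmethod
    def schedule_infrastructure(self):
        """Create or update whatever the source needs (topics, queues, ...)."""

    @abstractmethod
    def unschedule_infrastructure(self):
        """Tear the infrastructure down again."""


class InMemoryAsync(AsyncSource):
    """Toy implementation that stores tasks in a list, for local testing."""

    def __init__(self):
        self.provisioned = False
        self.messages = []

    def schedule_infrastructure(self):
        self.provisioned = True  # safe to call repeatedly

    def unschedule_infrastructure(self):
        self.provisioned = False
        self.messages.clear()

    def send(self, task_path, args, kwargs, delay_seconds=0):
        if not self.provisioned:
            raise RuntimeError("call schedule_infrastructure() first")
        self.messages.append((task_path, args, kwargs, delay_seconds))


source = InMemoryAsync()
source.schedule_infrastructure()
source.send("myproject.tasks.make_pie", (), {}, delay_seconds=1800)
print(len(source.messages))  # 1
source.unschedule_infrastructure()
```

With a contract like this, the cli schedule and unschedule commands would only need to load the configured class and call the two lifecycle methods.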
Conclusion
I think delayed asynchronous invocation is a welcome feature and I am willing to put in the work to create a PR if there is support from the community and maintainers.
I propose to change the current implementation and work with a more customizable setup, pointing to an implementation class in the settings and allowing that class to manage the entire lifecycle of the feature: scheduling/unscheduling the infra and being smarter about the use of async services.
So please comment and answer:
- Should we have this feature in Zappa?
- Should async_source be more customizable?
- Should async_source also manage the infrastructure?
Tags
Participants from the previous PR
@paulclarkaranz @jeffreybrowning @jneves @lu911 @M0dM @ironslob @vmiguellima
@oliviersels I can help with this feature, once you open the PR I can work on it.
Not sure what the use cases for delayed asynchronous tasks are, but one of the most important things, I think, is having the ability to unload jobs/tasks into a queue where they can wait to be picked up.
I like your approach to handle delayed async, I would make a little suggestion:
{
    "dev": {
        ..
        "async_source": "zappa.async.LambdaAsync",
        "async_policy": "cost-effective",
        ..
    }
}
async_policy allowed values:
- cost-effective: decides based on the delay whether to use sqs (< 15 min delay) or AWS Step Functions (> 15 min delay)
- sqs: always use sqs
- step: always use Step Functions
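A small sketch of how these three policy values could be interpreted. `choose_backend` is a hypothetical helper, and sending a delay of exactly 15 minutes to sqs is an assumption, since the suggestion leaves that boundary open.

```python
# SQS accepts a per-message delay of at most 900 seconds.
SQS_MAX_DELAY_SECONDS = 900


def choose_backend(delay_seconds, policy="cost-effective"):
    """Return 'sqs' or 'step' for a task, following the async_policy value."""
    if policy == "sqs":
        return "sqs"
    if policy == "step":
        return "step"
    if policy == "cost-effective":
        # Assumption: a delay of exactly 15 minutes still goes to sqs.
        return "sqs" if delay_seconds <= SQS_MAX_DELAY_SECONDS else "step"
    raise ValueError(f"Unknown async_policy: {policy!r}")


print(choose_backend(60))                     # sqs
print(choose_backend(60 * 60 * 24))           # step
print(choose_backend(60 * 60, policy="sqs"))  # sqs
```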
Not sure what the use cases for delayed asynchronous tasks are
We use it extensively in our projects for sending reminders to customers. For example:
@task(delay_seconds=60*60*24)  # Delay 24 hours
def send_reminder(user):
    """Check to see if user performed action, otherwise send reminder"""
one of the most important things, I think, is having the ability to unload jobs/tasks into a queue where they can wait to be picked up.
In many cases, SNS already supports this queuing and fanout functionality, including retries, filtering and dead-letter queue handling. I no longer suggest adding SQS as an async source, or maybe only in a different PR, after this one.
I like your approach to handle delayed async, I would make a little suggestion:
{
    "dev": {
        ..
        "async_source": "zappa.async.LambdaAsync",
        "async_policy": "cost-effective",
        ..
    }
}
async_policy allowed values:
- cost-effective: decides based on the delay whether to use sqs (< 15 min delay) or AWS Step Functions (> 15 min delay)
- sqs: always use sqs
- step: always use Step Functions
I'm reluctant to add configuration settings for the async_source as I don't see the need for it if we can contain it in a single class.
{
    "dev": {
        ..
        "async_source": "zappa.async.SfnOrLambdaAsync",
        or
        "async_source": "zappa.async.CostEffectiveAsync",
        or
        "async_source": "zappa.async.SnsAsync",
        or
        "async_source": "myproject.AwesomeAsync",
        ..
    }
}
In many cases, SNS already supports this queuing and fanout functionality, including retries, filtering and dead-letter queue handling. I no longer suggest adding SQS as an async source, or maybe only in a different PR, after this one.
I have no use case for delayed invocations, but I'd like to give my vote for adding SQS as an async source: I'm not an AWS expert but in my mind:
- SQS is for queues where a given message is only consumed by a single consumer
- SNS is for pub-sub
Meaning SQS is a better fit for a background task queue. Although SQS and SNS have lots of overlap, it just seems to me that in the long run SQS is going to be the option that people want to use for background tasks.
Hi there! Unfortunately, this Issue has not seen any activity for at least 90 days. If the Issue is still relevant to the latest version of Zappa, please comment within the next 10 days if you wish to keep it open. Otherwise, it will be automatically closed.
Hi there! Unfortunately, this Issue was automatically closed as it had not seen any activity in at least 100 days. If the Issue is still relevant to the latest version of Zappa, please open a new Issue.