
Add an actor decorator argument to compute backoff time using a function

Open arseniiarsenii opened this issue 1 year ago • 3 comments

What version of Dramatiq are you using?

1.14.2

Description

Right now, the per-actor retry policy can be customized with a predicate passed as the actor decorator's retry_when argument. However, there is no comparable option for overriding the backoff times used when retrying tasks.

The case: I am using an actor to send a webhook to an external service. We have agreed that a webhook should be retried three times if the first attempt fails: the second attempt after 1 minute, the third after 5 minutes, and the fourth after an hour. This logic is not easy to implement in dramatiq at the moment.

Having spent quite some time reading docs, issues, and sources I've come up with this workaround using CurrentMessage middleware:

import dramatiq


@dramatiq.actor(queue_name="send_partner_webhook", max_retries=3)
def send_partner_webhook_actor(webhook: PartnerWebhook) -> None:
    try:
        send_partner_webhook(webhook)
    except PartnerWebhookRequestFailedError as e:
        # Requires the CurrentMessage middleware to be enabled on the broker.
        message: dramatiq.Message = dramatiq.middleware.CurrentMessage.get_current_message()
        retries_so_far: int = message.options.get("retries", 0)
        # Delays in milliseconds, keyed by the number of retries so far.
        backoff = {
            0: 60_000,  # backoff before first retry - 1 min
            1: 300_000,  # backoff before second retry - 5 min
            2: 3_600_000,  # backoff before third retry - 1 hour
        }
        if retries_so_far not in backoff:
            raise  # schedule exhausted: re-raise the original error
        raise dramatiq.errors.Retry(str(e), delay=backoff[retries_so_far]) from e

I think it should be easier to achieve this behavior using an argument similar to retry_when:

def backoff_factory(retries_so_far: int) -> int:
    backoff = {
        0: 60_000,  # backoff before first retry - 1 min
        1: 300_000,  # backoff before second retry - 5 min
        2: 3_600_000,  # backoff before third retry - 1 hour
    }
    return backoff.get(retries_so_far, 3_600_000)


@dramatiq.actor(queue_name="send_partner_webhook", max_retries=3, backoff_factory=backoff_factory)
def send_partner_webhook_actor(webhook: PartnerWebhook) -> None:
    send_partner_webhook(webhook)

Thank you for your work, I hope that this feature request makes it into a future release!

arseniiarsenii avatar Sep 25 '23 15:09 arseniiarsenii

I think it also depends on the exception type. Passing (retries, exception) to backoff_factory would also be consistent with retry_when, which already receives these parameters.
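
A minimal sketch of what such a two-argument callback might look like, written as plain Python so it runs without dramatiq. Note that backoff_factory is only the option proposed in this issue, not an existing dramatiq argument; PartnerRateLimitedError and SCHEDULE_MS are hypothetical names introduced purely for illustration, and PartnerWebhookRequestFailedError is a stand-in for the error type from the original post.

```python
class PartnerWebhookRequestFailedError(Exception):
    """Stand-in for the error type used in the original post."""


class PartnerRateLimitedError(PartnerWebhookRequestFailedError):
    """Hypothetical subclass, introduced only for this illustration."""


# Delays in milliseconds: 1 minute, 5 minutes, 1 hour (the schedule from the issue).
SCHEDULE_MS = (60_000, 300_000, 3_600_000)


def backoff_factory(retries_so_far: int, exception: Exception) -> int:
    """Return the delay in ms before the next retry.

    Takes the same (retries, exception) pair that retry_when receives.
    """
    # Assumed policy: back off for the longest delay when rate limited.
    if isinstance(exception, PartnerRateLimitedError):
        return SCHEDULE_MS[-1]
    # Otherwise walk the schedule, clamping to the last entry.
    index = min(retries_so_far, len(SCHEDULE_MS) - 1)
    return SCHEDULE_MS[index]
```

With this shape, the delay can vary by both attempt number and failure type, while the decision of whether to retry at all stays with max_retries or retry_when.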

spumer avatar Sep 25 '23 16:09 spumer

I agree

arseniiarsenii avatar Sep 25 '23 16:09 arseniiarsenii

Hey @arseniiarsenii, I have a somewhat similar problem (https://github.com/Bogdanp/dramatiq/issues/605) that the implementation you suggested could resolve partially or fully.

Have you noticed in your tests that retries are performed more times than the specified limit?

bvidovic1 avatar Jan 25 '24 11:01 bvidovic1