Polly Best practices for using Polly with Azure durable functions to rate limit HTTP calls

I've been tasked with building an Azure durable function app that is aware of rate limits on HTTP endpoints. The code from #666 appears to be the solution when coupled with typed HttpClients, but not having a lot of experience with either durable functions or Polly, I'm wondering how best to implement this (or if it's even a supported scenario).

Essentially I'm going to have a timer trigger that fires an orchestrator, which fires off an activity that gets a list of data. Using that list, the orchestrator will then fire off a number of calls to activities, each of which will perform a single HTTP request.

The catch here is that the HTTP endpoints are rate-limited, but don't support status code 429 or the Retry-After header, so I need to be the one limiting my requests to them. Hence the (potential) need for the code from the aforementioned PR.

The naïve solution would be to fire off list.Count durable functions activities from the orchestrator and let Polly handle rate-limiting of the actual calls. Then, each activity attempts to make its HTTP request and if the request fails due to exceeding the rate limit, the activity catches that and returns a "please requeue me" status to the orchestrator. The problem with this solution, of course, is that most of those activities will initially fail and need to be re-queued, which is a waste of compute power and therefore money, so I don't want to do that.
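To put a number on that waste, here is a minimal simulation of the naive fan-out. It is a sketch only: `SlidingWindowLimiter` and `fan_out` are hypothetical names, the limiter is a plain sliding-window counter and not the Polly code from #666, and the limit of 10 requests per second is an assumed figure.

```python
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical limiter: at most `limit` accepted calls in any `window` seconds."""

    def __init__(self, limit: int, window: float):
        self.limit = limit
        self.window = window
        self._stamps = deque()  # timestamps of accepted calls

    def try_acquire(self, now: float) -> bool:
        # Forget accepted calls that have already left the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) < self.limit:
            self._stamps.append(now)
            return True
        return False


def fan_out(item_count: int, limiter: SlidingWindowLimiter, now: float = 0.0):
    """Simulates every activity attempting its request at the same instant."""
    accepted = sum(1 for _ in range(item_count) if limiter.try_acquire(now))
    return accepted, item_count - accepted


# Assumed numbers: 100 activities against a limit of 10 requests per second.
accepted, requeued = fan_out(100, SlidingWindowLimiter(limit=10, window=1.0))
print(accepted, requeued)  # 10 90
```

With these assumed numbers, 90 of the 100 activities would start, be rejected, and have to be requeued on the first pass.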

A better option is to batch the activities into sets of N items, where N is the maximum number of requests the endpoint allows, and run each batch. After a batch has completed, review the total time it took to complete said batch:

- If that duration is less than the rate limit duration, have the orchestrator sleep for the remainder of the rate limit duration, then run the next batch.
- Otherwise, run the next batch immediately.
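The batching steps above can be sketched roughly as follows. This is an illustration rather than orchestrator code: `run_in_batches` and `FakeClock` are hypothetical names, the clock and sleep functions are injected so the example runs instantly, and the figures (25 items, 10 requests per 1-second window, 0.25 seconds per batch) are assumptions. A real orchestrator would presumably need to use the durable timer and the orchestration context's clock instead of `time.sleep` and a wall clock, since orchestrator code has to be deterministic.

```python
import time
from typing import Callable, List, Sequence


def run_in_batches(
    items: Sequence,
    run_batch: Callable[[Sequence], None],
    limit: int,
    window: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> List[float]:
    """Runs items in batches of `limit`; returns the pause taken after each batch."""
    pauses = []
    for start in range(0, len(items), limit):
        batch = items[start:start + limit]
        began = clock()
        run_batch(batch)
        # Time left in the rate limit window after the batch finished.
        remaining = window - (clock() - began)
        more_to_do = start + limit < len(items)
        pause = remaining if remaining > 0 and more_to_do else 0.0
        if pause:
            sleep(pause)
        pauses.append(pause)
    return pauses


class FakeClock:
    """Hypothetical test clock so the example does not really wait."""

    def __init__(self) -> None:
        self.now = 0.0

    def time(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


fake = FakeClock()
pauses = run_in_batches(
    items=list(range(25)),
    run_batch=lambda batch: fake.advance(0.25),  # pretend each batch takes 0.25 s
    limit=10,
    window=1.0,
    clock=fake.time,
    sleep=fake.advance,
)
print(pauses, fake.now)  # [0.75, 0.75, 0.0] 2.25
```

One possible pitfall with this scheme: the window is measured from the start of each batch, so if the requests in one batch go out late and the next batch's requests go out early, the endpoint could still see more than N requests within a single window of its own.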

With the above solution, I technically don't even need Polly, but it feels a little... coarse, and I'm worried there are pitfalls I'm not aware of. Am I overthinking this, or is there a simpler/more effective solution that I'm missing?

Aug 09 '21 12:08 IanKemp

If you're limiting yourself at the client (rather than the server doing it), would maybe a bulkhead policy work for you?

With an appropriate limit on the number of actions that are allowed to wait to proceed when the bulkhead is "full", it would limit itself.
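As a rough illustration of that behaviour, the sketch below caps the number of actions running at once and the number allowed to wait, and rejects the rest. It is not Polly's implementation: `Bulkhead`, `BulkheadRejected` and `fake_request` are hypothetical names, the limits of 2 parallel and 3 queued are assumed, and it assumes all callers share one process, which may not hold when activities are spread across several function instances.

```python
import asyncio


class BulkheadRejected(Exception):
    """Raised when both the execution slots and the queue are full."""


class Bulkhead:
    """Hypothetical bulkhead: `max_parallel` running, `max_queued` waiting."""

    def __init__(self, max_parallel: int, max_queued: int):
        self._slots = asyncio.Semaphore(max_parallel)
        self._capacity = max_parallel + max_queued
        self._admitted = 0  # actions currently running or queued

    async def execute(self, action):
        if self._admitted >= self._capacity:
            raise BulkheadRejected("bulkhead and queue are full")
        self._admitted += 1
        try:
            async with self._slots:  # waits here while all slots are busy
                return await action()
        finally:
            self._admitted -= 1


async def demo():
    bulkhead = Bulkhead(max_parallel=2, max_queued=3)
    running = 0
    peak = 0

    async def fake_request():
        # Stand-in for a single HTTP call; tracks peak concurrency.
        nonlocal running, peak
        running += 1
        peak = max(peak, running)
        await asyncio.sleep(0.01)
        running -= 1
        return "ok"

    results = await asyncio.gather(
        *(bulkhead.execute(fake_request) for _ in range(8)),
        return_exceptions=True,
    )
    completed = sum(1 for r in results if r == "ok")
    rejected = sum(1 for r in results if isinstance(r, BulkheadRejected))
    return completed, rejected, peak


completed, rejected, peak = asyncio.run(demo())
print(completed, rejected, peak)  # 5 3 2
```

Note that this bounds concurrency only. It says nothing about how many requests go out per unit of time, so fast requests could still exceed a requests-per-duration limit.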

But yeah, otherwise, if that doesn't work, and without the server co-operating by giving you some sort of feedback about when to re-submit (like 429 and Retry-After), maybe Polly isn't the right off-the-shelf solution for you?

Aug 09 '21 12:08 martincostello

@martincostello thanks for the response! Bulkhead could be useful, except for the fact that it doesn't do (requests/duration) limiting, but perhaps I could combine it with the rate-limiting code from #666 using PolicyWrap?

As for the "right solution", I'm not sure what that is at the moment, hence this question (which may be better directed at the durable functions guys). All I know is that I'd much rather use something like Polly, that's been written by experts in the field and is backed by extensive testing (both unit tests and in the field), than hand-roll something kludgy.

Ideally this question would be asked on Stack Overflow, except that getting a decent answer to a technical question there nowadays is generally a lost cause.

BTW, I've read https://github.com/App-vNext/Polly/issues/535#issuecomment-441369080 and my issue with Dylan's comment there around bulkheading is that the durable functions runtime doesn't provide (AFAIK) a programmatic way to set those limits; they're configuration values (https://docs.microsoft.com/en-us/azure/azure-functions/durable/durable-functions-perf-and-scale#concurrency-throttles), which effectively forces the end-user (i.e. me) to do the math around slicing and dicing request numbers, as opposed to programmatically doing it, which for me somewhat defeats the purpose...

Aug 09 '21 12:08 IanKemp

> perhaps I could combine it with the rate-limiting code from #666 using PolicyWrap?

Certainly worth a try to see if you can get any mileage out of it.

I've not used Azure Durable Functions myself, so people who have experience with it might be able to give you better guidance that is more specific to your use case.

Aug 09 '21 13:08 martincostello

@IanKemp any luck with this?

Oct 18 '22 11:10 martinlarosa
