Rate-Limit / Concurrency APIs
Hi Restate team 👋
After trialling Restate alongside Inngest, we’re missing two things that Inngest already surfaces:
- A declarative concurrency/rateLimit knob (e.g. “max 10 parallel Stripe calls” or “100 GPT calls / min”).
- A UI/CLI view showing how many invocations are queued, running, retrying, etc., per limit key.
Could you share:
- Road-map—are built-in concurrency or rate-limit annotations on the horizon?
- Best practice—until such features land, how would you recommend implementing a clean limiter with Virtual Objects?
Thanks for the amazing work!
I'm also encountering a similar issue. Although Inngest includes all the features I need, their self-hosting solution doesn't meet production requirements. Currently, I have to implement it myself on the app side using Redis.
Hi @michael-dm, thanks for the feedback!
> A declarative concurrency / rateLimit knob (e.g. “max 10 parallel Stripe calls” or “100 GPT calls / min”).
This is on our roadmap, and we plan to work on it soon. The details are yet to be defined, but most likely it will be an in-code annotation or configuration option on the handlers that defines the maximum concurrency. When we have more details, I'll be happy to follow up here!
> A UI/CLI view showing how many invocations are queued, running, retrying, etc., per limit key.
Once we have the declarative concurrency knob, we'll certainly expand our UI to show the state of the queues. In the meantime, when using virtual objects, you can already find out which invocations exist for a given key, in a given state, by using the search bar at the bottom of the invocations page.
Related to this, we also plan to add another page to the UI that shows stats per virtual object key, which would be closer to what you describe, I think.
Regarding rate limiting, we currently have an example that shows how to implement it using existing Restate functionality. Check it out: https://github.com/restatedev/examples/blob/main/typescript/patterns-use-cases/README.md#rate-limiting
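For readers who want a feel for the core logic before opening the link, here is a minimal token-bucket sketch in plain TypeScript. It is not the code from the linked example and does not use the Restate SDK; the names `BucketState`, `BucketConfig`, and `tryAcquire` are hypothetical. The assumption is that the bucket state would be stored per limit key (for instance in a Virtual Object's state), with the handler loading it, calling `tryAcquire`, and writing the result back.

```typescript
// Hypothetical token-bucket state, one instance per limit key.
interface BucketState {
  tokens: number;       // tokens currently available
  lastRefillMs: number; // time of the last refill, in milliseconds
}

// Hypothetical limiter configuration.
interface BucketConfig {
  ratePerSec: number; // tokens added per second
  burst: number;      // maximum number of tokens the bucket can hold
}

interface AcquireResult {
  ok: boolean;          // true if the call may proceed now
  retryAfterMs: number; // suggested wait when ok is false, otherwise 0
  state: BucketState;   // state to persist for the next call
}

// Pure function: refill the bucket for the elapsed time, then try to take `cost` tokens.
function tryAcquire(
  state: BucketState,
  cfg: BucketConfig,
  nowMs: number,
  cost: number = 1
): AcquireResult {
  const elapsedSec = Math.max(0, nowMs - state.lastRefillMs) / 1000;
  const tokens = Math.min(cfg.burst, state.tokens + elapsedSec * cfg.ratePerSec);

  if (tokens >= cost) {
    return {
      ok: true,
      retryAfterMs: 0,
      state: { tokens: tokens - cost, lastRefillMs: nowMs },
    };
  }

  // Not enough tokens: report how long until the missing tokens are refilled.
  const retryAfterMs = Math.ceil(((cost - tokens) / cfg.ratePerSec) * 1000);
  return { ok: false, retryAfterMs, state: { tokens, lastRefillMs: nowMs } };
}

// "100 GPT calls / min" would roughly map to this configuration.
const gptLimit: BucketConfig = { ratePerSec: 100 / 60, burst: 100 };
```

Because the function is pure, it stays easy to test. A handler that runs one invocation at a time per key could apply it without extra locking, and on a rejected call it could sleep for `retryAfterMs` before retrying. The timestamp passed as `nowMs` would need to come from a source that stays consistent under replay.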
Really looking forward to seeing this. In my system, I have a "lower-level" service that serves other workflows, and I would like it to be rate-limited.