temporal
temporal copied to clipboard
Provide priority task queues
Is your feature request related to a problem? Please describe. Currently there is no way to assign priority to a task and also ensure fairness in workers executing tasks.
Quote from sjansen on community.temporal.io
Imagine, for example, a parent workflow that starts an expensive child workflow for each user in the account. If multiple accounts start the parent workflow near the same time, an account with a much larger number of users could monopolize all available workers by simply being first to queue up a large number of activities associated with each user. When there’s no contention, it’s desirable for an account to be able to use 100% of available workers but as soon as there’s contention between accounts it’s desirable to attempt fair scheduling in order to keep latency proportional to account or workflow size.
Obviously it would be possible to partition capacity by creating separate queues for each account, but the result is either potentially significant idle capacity or latency waiting for workers to scale out.
Describe the solution you'd like Be able to assign priority to a task, which ensures that the task queue is always ordered based on the highest priority. Fair scheduling of tasks is important too, so that high priority work doesn't consume all workers leaving lower priority work queued.
Describe alternatives you've considered Multiple task queues which don't necessarily solve this problem because it allows for idle workers or latency in scaling up workers.
Additional context This feature request stems from a question & discussion on the community website: https://community.temporal.io/t/rate-limiting-based-on-metadata/385/
BTW, first the there must be consensus on the design
the feature request will require the following:
SDK: allow workflow logic providing priority of workflow / activity
Server:
- propagate the above priority within history service logic
- pass the above priority to matching service for dispatch
- new priority queue impl
- refactoring matching service for the new priority queue impl
- refactoring matching service's DB schema & impl adding support of priority
- forward / backward compatibility
Are there any plans to support priority taskqueue or workflow?
Roughly how much work would this be to implement? We're pretty interested in having this sort of functionality and would be happy to help out with implementation here.
Also super interested. We would like to replace a Celery setup (esprecially Canvas workflows) with Temporal, but we make heavy use of RabbitMQ Message Priorities which is not available in Temporal.
I'm exploring a similar concept, something I'm curious about is how being able to assign task priority would unlock fairness in execution?
Priorities and fairness are definitely features we're looking at as we evolve the matching portion of Temporal. I think it's fair to say that it will be a lot of work to develop and productionize, and it's going to involve the internals in the matching service, so it might be difficult for outside developers to work on. At this point, the most helpful thing might be to describe your requirements in some more detail, so we can keep them in mind as we do the design (since "priority" and "fairness" sometimes mean slightly different things to different people.) E.g. how do you want to specify priorities, with what granularity, do you need reserved capacity for different priorities?
A few more notes:
- From my point of view, priority doesn't "unlock" fairness, but they are two features that affect the order that tasks are dispatched in, and they are often desired to be used together. So it makes some sense to work on both at once, or at least design them together.
- One thing that we're unlikely to ever offer is guaranteed priorities or ordering, because of task queue partitions that need to work independently for scalability.
In our case, we are looking for a replacement for Celery which itself uses RabbitMQ queue priorities. If there is for example only one worker it will always take the task with the highest priority (even if that task is later in the queue than others with lower priority).
Is there any work in progress?
Is it completed? Where can I find more information?
Sorry, the close/open was a github issues mishap. This is still planned but no ETA yet.
Any update on this? Lack of prioritization is not infrequently a reason for abandoning Temporal in my practice...
Dealing with large work spikes from individual clients is common in our systems. This feature would greatly simplify management and help make a stronger case for adopting Temporal.
Dealing with large work spikes from individual clients is common in our systems. This feature would greatly simplify management and help make a stronger case for adopting Temporal.
Can't you run multiple task queues, and manage it by swapping which ones you want to handle?
this would be very useful for us too :) primarily the priority queue idea
Having priorities and fairness would definitely help to increase Temporal adoption in our company.
Our use case for priority is that we have the same operations used for real-time use cases, such as when user click a button in UI and expect immediate answer, and reused for asynchronous batch processing, but with much larger scale (can make all workers fully utilized for minutes-hours per each batch). Ideally we'd want to reuse all the same infrastructure, avoid over-provisioning & idle resources, and just have Temporal workers execute items with higher priority before going to lower priority ones (ie have worker activity queue be sorted by priority). Currently we work-around this by slightly over-provisioning our infra and reserving some capacity for real-time use cases while limiting the activity tasks concurrency for workers. Sadly it is not always enough to process sudden influx of real-time requests, but also when there aren't any real-time requests that leaves the reserved capacity idle, leading to inefficient utilization.
Our use case for fairness is to allocate available service capacity equally for each group (user/tenant/other grouping dimension), but at the same time also we are OK with allowing even a single group to consume all of available capacity if nobody else is using the service at the moment. So far we haven't found a way to achieve it with Temporal, we're still using our bespoke external scheduler component with centralized database that schedules items 1by1 up to pre-configured amount of concurrently running jobs, I don't really like it much because it's so hard to reuse across services. Would absolutely love if Temporal offered a similar feature out of the box, maybe it could be in form of job scheduler configuration where we could define dynamic priority formula with ability to lookup # of workflow/activities with same workflow/activity attributes already being executed (at the time of scheduling).
Thanks @darkms that's really good input.
The team is actively looking into this feature so we welcome feedback on use cases.
I'm using Temporal at my job to process large batches of data that can take multiple days to complete. This creates hundreds or thousands of workflows in Temporal that all "start" at the same time. This works fine, but one big issue is that it can take hours before the first workflow finishes, even though a single workflow only takes a couple of minutes to finish. This is because Temporal spreads the limited workers around to progress on all workflows concurrently. I want Temporal to prioritize finishing workflows higher than progressing other workflows.
In my case, there is almost no value in having 1000 workflows with the first 2 activities completed, it would be far more valuable to have 200 workflows completed and 800 unstarted. This could be achieved by prioritizing later activities in a workflow higher than earlier activities, which is why I'm very interested in this feature.
The team is actively looking into this feature so we welcome feedback on use cases.
Is there any guidance on how far in the future priority queues are? Asking to know how much time/effort/energy we should invest in crafting our own solution that is robust vs. waiting for one to be provided.
Thank you everyone for your patience! Yes, we (@dnr, @stephanos, myself, several others) are actively working on features for task queue level priority and fairness. At a high-level, here's where we're heading directionally.
We plan to ship features in the order described here starting from "least complex" to the "most complex". Timing-wise, we will start to roll this out mid-year 2025 and incrementally add more features as we go.
Key features
Simple priority: This is an integer in the range [1, 5], with default 3. Lower numbers are a higher priority. Tasks are dispatched strictly in order i.e. all “1” tasks come before all “4” tasks in a matching backlog.
The intended semantics of priority are to differentiate e.g. “interactive”/”real-time” vs “batch”/”idle” work. Also, it can be used as an “emergency priority” mechanism for a user to say “run this right now, ignoring everything else”.
These fixed priorities are mostly exact (exact within one task queue partition). You will also be able to define rate limits per priority-level.
Fairness by key and weight: Fairness is controlled by two values: a short string key (up to 64 bytes) and a positive float weight in [1e-6, 1e6]. The default is an empty string for the key and 1.0 for the weight.
Fairness is intended for a few different purposes (note that this is a non-exhaustive list and can be used in conjunction):
- In multi-tenant situations, giving each tenant an approximately equal (or weighted unequal) share of resources.
- Defining priority tiers with weighted capacity, e.g. “high”/”medium”/”low” with weights 70/20/10, to allow spending more resources on more important work without starving less important. If one class is not present, the other classes can expand to use all capacity.
- Activity type could be used as a fairness key to avoid starving specific activities on a shared task queue.
You will be able to define rate limits per fairness-level.
Tasks are dispatched such that the ratio of dispatch rates between two keys is approximately equal to the ratio of their weights. i.e. for two keys with the same weight, their tasks should get dispatched at approximately the same rate, no matter how many tasks each of them has in the backlog. This prevents “heavy” keys from disrupting “light” keys.
There is no limit on cardinality of fairness keys, but if more than a certain number are used, the accuracy may degrade.
Fairness is approximate across both time and space: a) it may take some time for the fairness mechanism to react to a change in distribution of fairness keys, and b) not all decisions would be made the same as if we could use infinite resources for tracking past and future tasks.
Ordering i.e. arbitrary priorities: The ordering key is given by an arbitrary positive integer (64-bit) with no limit on cardinality with default of 1. We attempt to dispatch tasks in order of ordering key (small to large), after taking fairness into account.
Ordering is intended for situations like:
- Using a continuous value like workflow start time, to order activities of older workflows before activities of newer workflows. This reduces the negative effects of starting many workflows that compete for resources, which could otherwise starve the older ones.
- Similarly, using original activity start time to prioritize activity retries over new activities, or deadline to act as a deadline scheduler.
- Users who want precise control over activity execution order and want to supply an arbitrary scheduling ordering.
Levels apply in order from priority to fairness to ordering: Priority first, then fairness applies within a priority level, then ordering applies within a fairness key. Priority and ordering are exact (at least within partitions), while fairness may be approximate.
Please note that these features are in active development, so everything described here is subject to change as we build and iterate. To stay in touch with the latest developments or to provide feedback / ask questions, please join the channel #priority-fairness in Temporal's community Slack.
I want Temporal to prioritize finishing workflows higher than progressing other workflows.
That's exactly our use case!
We don't even need to "start" all workflow at the same time. By starting batches of workflows within a timeframe, we want the first ones to finish earlier than the last ones without much contention. Imagine the batches keep coming, eventually we want the later batches to not even start (not the workflow, not the activities) until earlier batches completely finish.
Eagerly waiting for this feature. We have Celery and RabbitMQ implementation which we want to migrate to temporal. Priorities are a must for us otherwise some workflows will never complete. E.g. if activity A within a workflow has returned, then next activity B needs to be finished within 5 mins, otherwise the results of activity A are invalid and activity A itself needs to be retried.