
Priority functionality to procrastinate queues

Open ulfet opened this issue 4 years ago • 1 comments

Procrastinate provides queues for tasks, which I extensively use in my coding. Kudos for that!

Those queues are regular queues. With priority queues as a feature, task selection from queues would become much more flexible.

I sifted through the procrastinate code to find out whether it provides such functionality but, if my memory and skills serve me right, I did not find it.

In-depth

Assume there is a queue q, and the following tasks in it:

id  task  args
1   task  some_args_1
2   task  some_args_2
3   task  some_args_3

The task with id=1 goes first, then id=2, and finally id=3.

The cost of the tasks is high and, in my case, they need to be picked in a certain order. The arrival of tasks is not known beforehand. While the priority of each task is known at enqueueing time, that information currently cannot be reflected in procrastinate's DB records or its task selection routine.

If procrastinate picked tasks from a queue based on priority, it would allow for different workflows:

id  task  args         priority
1   task  some_args_1  5
2   task  some_args_2  10
3   task  some_args_3  15

Then, the first would have been id=3, and then id=2, and finally id=1.
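To make the intended ordering concrete, here is a minimal Python sketch. It is not procrastinate code: the `Job` class and `pick_order` function are hypothetical names used only for illustration, and the tie-break on the lowest id is an assumption that keeps FIFO behaviour among tasks of equal priority.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """Hypothetical stand-in for a queued task row."""
    id: int
    args: str
    priority: int = 0


def pick_order(jobs):
    """Return job ids in the order a priority-aware worker would pick them."""
    # Highest priority first; ties fall back to the lowest id (FIFO).
    return [job.id for job in sorted(jobs, key=lambda j: (-j.priority, j.id))]


jobs = [
    Job(1, "some_args_1", priority=5),
    Job(2, "some_args_2", priority=10),
    Job(3, "some_args_3", priority=15),
]
print(pick_order(jobs))  # [3, 2, 1]
```

With all priorities left at the default of 0, the same function returns the ids in ascending order, which matches the current behaviour.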

In interactive environments, many enqueue operations may be happening, which makes prioritizing tasks worthwhile: a task that was enqueued very late yet has high priority can be given preference by the task selection algorithm.

schedule_in seems to be the closest feature to prioritization, in that it allows delaying a task so that it is picked up later. However, giving weights to tasks at enqueueing time is a much more understandable abstraction than making up schedule_in arguments.

A small change in SQL configuration might allow it to work:

# Current usage:
task = procrastinate.task(lambda _: _, queue=queue_name, name=task_name, queueing_lock=str(entity['id']))
await task.defer_async(entity=entity)

# Proposed usage, with a priority given at enqueueing time:
task = procrastinate.task(lambda _: _, queue=queue_name, name=task_name, queueing_lock=some_string)
await task.defer_async(entity=args, priority=procrastinate.priority.HIGH)  # enum or similar

Questions

  1. is such a functionality already available in procrastinate?
  2. if not, is such a functionality planned for the future?

ulfet, Feb 09 '21 17:02

> Procrastinate provides queues for tasks, which I extensively use in my coding. Kudos for that!

Thanks for using the project!

> Is such a functionality already available in procrastinate?

No. As of today, the things impacting the order of task consumption are twofold: filtering rules and ordering rules.

Filtering rules:

  • Tasks are not considered for execution if another task with the same lock value is running
  • Tasks are not considered for execution if their scheduled_at is a date in the future

Ordering rules:

  • Between two otherwise identical tasks, the one with the lowest ID will be executed first.

This gives a (more or less) FIFO order. Given enough time, all tasks will be executed, except in cases where a worker is killed while running a locked task.
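The filtering and ordering rules above can be sketched in plain Python. This is only an illustration of the rules as described here, not procrastinate's actual implementation (which runs in the database); the `Job` class, the `next_job` function and the `running_locks` parameter are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional, Set


@dataclass
class Job:
    """Hypothetical stand-in for a queued task row."""
    id: int
    lock: Optional[str] = None
    scheduled_at: Optional[datetime] = None


def next_job(jobs: Iterable[Job], running_locks: Set[str], now: datetime) -> Optional[Job]:
    """Pick the next job to run, or None if nothing is eligible."""
    eligible = [
        job
        for job in jobs
        # Filtering rule 1: skip jobs whose lock is held by a running job.
        if (job.lock is None or job.lock not in running_locks)
        # Filtering rule 2: skip jobs scheduled for a date in the future.
        and (job.scheduled_at is None or job.scheduled_at <= now)
    ]
    # Ordering rule: the lowest id wins.
    return min(eligible, key=lambda job: job.id, default=None)


now = datetime(2021, 2, 10, 8, 0, tzinfo=timezone.utc)
jobs = [
    Job(1, lock="a"),                                # lock "a" is currently held
    Job(2, scheduled_at=now + timedelta(hours=1)),   # not due yet
    Job(3),
]
picked = next_job(jobs, running_locks={"a"}, now=now)
print(picked.id)  # 3
```

A priority feature would only change the last step, for example by ordering on priority first and id second, while leaving both filtering rules untouched.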

> If not, is such a functionality planned for the future?

That could be interesting. It would add a new ordering rule, but I believe it would not completely change how procrastinate works.

We have to keep in mind that it could be a dangerous tool. If you have a setup in which there are always some tasks queuing and workers are never idle, and you have a task with priority -1 where all other tasks are at 0, then that task could take months to execute. I think such a setup is not frequent, as it would require your average task deferring rate to be exactly equal to your task consumption rate, but if you have an autoscaling feature or something, it could be the case.

As I said, I'm not sure this is a standard setup. Still, making sure your workers are at least sometimes idle would be a new requirement if we implement priority. It was not the case before. As long as we make this clear in the documentation, I'm all in :)
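The starvation risk can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: a single worker that runs exactly one task per tick, and exactly one priority-0 task arriving per tick, so the worker is never idle. `simulate` is a hypothetical helper, not part of procrastinate.

```python
def simulate(arrival_ticks):
    """Return the tick at which the priority -1 task finally runs.

    One priority-0 task arrives on each of the first `arrival_ticks` ticks,
    and the worker consumes one task per tick.
    """
    queue = [(-1, 0)]  # (priority, id): the low-priority task is enqueued first
    next_id = 1
    tick = 0
    while True:
        if tick < arrival_ticks:
            queue.append((0, next_id))
            next_id += 1
        # Selection: highest priority first, then lowest id.
        job = min(queue, key=lambda j: (-j[0], j[1]))
        queue.remove(job)
        if job[0] == -1:
            return tick
        tick += 1


print(simulate(0))     # 0: with an idle worker, the task runs immediately
print(simulate(1000))  # 1000: it waits for as long as arrivals keep the worker busy
```

Although the priority -1 task has the lowest id, it only runs once the stream of priority-0 tasks stops, which is why workers being idle at least sometimes becomes a requirement.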

Do you want to work on a PR or something? Happy to provide help if that's the case!

ewjoachim, Feb 10 '21 08:02