feat(worker): Parallel execution
Currently it seems like all scheduled jobs are executed sequentially. Would it be possible to leverage the messenger system somehow so that each job (or a defined number of jobs) can run in parallel?
Hi @checkphi 👋
Yes, all tasks are executed sequentially. To be honest, Messenger cannot natively consume messages in parallel, so IMHO there's no benefit in building something on top of Messenger.
I have plans to work on something that can execute tasks in parallel thanks to Fibers in PHP 8.1, but as it cannot be implemented in 8.0, it forces me to enable it only for >=8.1.
What do you have in mind to bring parallel execution?
Maybe I misunderstood the documentation and the concept of Messenger, but isn't it possible to run multiple messenger:consume workers at the same time? Then whichever worker is "free" will pick up the work.
In other words, if we start 3 additional "scheduler workers" then we could process 3 scheduled jobs in parallel?
Thanks to Supervisord, you can run multiple workers (the same concept applies to this bundle); each worker will consume the messages available at the moment.
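As an illustration, a minimal Supervisord configuration along these lines would start three workers side by side (the program name and paths below are placeholders, not something the bundle ships):

```ini
; /etc/supervisor/conf.d/scheduler-worker.conf — hypothetical example
[program:scheduler-worker]
command=php /var/www/app/bin/console scheduler:consume --wait
; numprocs requires a process_name pattern containing process_num
process_name=%(program_name)s_%(process_num)02d
numprocs=3
autostart=true
autorestart=true
```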
> In other words, if we start 3 additional "scheduler workers" then we could process 3 scheduled jobs in parallel?
In theory, yes. Keep in mind that a lock is applied to every task before executing it, so it may depend on the transport being used (Doctrine uses a resource lock; the same applies to Redis).
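To illustrate the resource-lock idea, here is a hedged sketch using symfony/lock directly (this is not the bundle's actual wiring; the DSN, credentials, and lock name are placeholders):

```php
use Symfony\Component\Lock\LockFactory;
use Symfony\Component\Lock\Store\PdoStore;

// Sketch only: the bundle configures this for you. With a shared store
// (a database here; RedisStore behaves the same way), every worker
// process contends for the same resource lock.
$store = new PdoStore('mysql:host=localhost;dbname=app', [
    'db_username' => 'user',      // placeholder credentials
    'db_password' => 'password',
]);
$factory = new LockFactory($store);

$lock = $factory->createLock('scheduler_foo_task'); // hypothetical lock name
if ($lock->acquire()) {           // non-blocking: false if another worker holds it
    try {
        // execute the task…
    } finally {
        $lock->release();
    }
}
```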
So you're saying that I should simply start multiple scheduler:consume --wait workers, which would then almost result in the desired behaviour?
Yes, for now, I don't have another solution, sadly.
If you have one, feel free to contribute and we'll see if it can be integrated.
I'm thinking about multiple solutions which I would be happy to discuss. Can I contact you somehow? I think the discussions would exceed the scope of this issue.
Hi @checkphi 👋
Consider using discussions: https://github.com/Guikingone/SchedulerBundle/discussions
This way, we can track the discussion and open related issues if required.
https://github.com/Guikingone/SchedulerBundle/discussions/65
Hi @Guikingone, I'm using Doctrine for the transport and the store, so I would like to consume multiple tasks by calling multiple scheduler:consume processes at the same time, but the first process locks all the tasks, so the next scheduler:consume process can't consume anything. How can I configure the scheduler to lock only the consumed task?
Hi @grimgit 👋
Actually, the call to scheduler:consume locks all the tasks that can be locked: https://github.com/Guikingone/SchedulerBundle/blob/main/src/Worker/AbstractWorker.php#L200.
> How can I configure the scheduler to lock only the consumed task?
What's the idea behind this approach? If I'm right, locking the consumed tasks isn't useful, as they're "consumed" and not retrieved until the next minute.
Suppose we have a very long task, for example a 1-hour task: all the other tasks are locked and will be delayed by 1 hour, even if I'm running multiple consumers. Instead, if the consumer were able to lock only the running task, other consumers could lock and consume the other available tasks.
> Suppose we have a very long task, for example a 1-hour task: all the other tasks are locked and will be delayed by 1 hour, even if I'm running multiple consumers.
Actually, tasks are locked "per process", depending on the store that you're using.
Here's an explanation of what happens internally:
- Process `A` calls `scheduler:consume`; it locks all the tasks that can be locked at the moment.
- Process `B` calls `scheduler:consume`; if tasks are available (or "due", for the correct wording), it consumes them and stops (you can wait for tasks using the `--wait` option).
If process B detects that tasks are locked, it cannot unlock them and waits until they're released by A. So, by definition, we can say that B consumes tasks one by one and that A locks them one by one; I agree that we lock tasks one by one, but all at the same time.
> Instead, if the consumer were able to lock only the running task, other consumers could lock and consume the other available tasks.
I think I get the point you're making (stop me if I'm wrong):
- Process `A` calls `scheduler:consume`; it locks the tasks one by one if it can acquire them.
- Process `B` calls `scheduler:consume`; if tasks are available (or "due", for the correct wording), it consumes them one by one by locking them, and stops (you can wait for tasks using the `--wait` option).
Am I right?
If I am, yes, it could be improved. Feel free to submit a PR to improve it, and we can discuss the implementation / improvements.
That's correct: process A should lock only the first available task before starting to consume it, so that process B can acquire the lock of another task.
Suppose we have two tasks: Task 1, a time-consuming task that can run for an hour, and Task 2, a high-frequency task, such as a mail queue processor that should run every minute. With the current lock design, Task 2 can't be consumed by another process while Task 1 is running, because the task is locked by the first process that is consuming Task 1, so all emails will be delayed.
So the consumer process should:
- While there is an available task:
  - Acquire the task lock
  - Consume the task
  - Release the task lock
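The loop above could be sketched roughly like this (a hedged illustration, not the bundle's actual worker code: `$scheduler`, `getDueTasks()`, and `$worker` stand in for whatever services the real implementation exposes):

```php
use Symfony\Component\Lock\LockFactory;

/** @var LockFactory $lockFactory — assumed to be injected */

// Per-task locking: one lock per task instead of one lock for the
// whole batch, so several workers can run due tasks concurrently.
foreach ($scheduler->getDueTasks() as $task) {
    $lock = $lockFactory->createLock(sprintf('task_%s', $task->getName()));

    // Non-blocking acquire: if another worker already owns this task,
    // skip it and move on to the next due task.
    if (!$lock->acquire()) {
        continue;
    }

    try {
        $worker->execute($task);
    } finally {
        // Always release, even if the task throws.
        $lock->release();
    }
}
```

With this design, a 1-hour Task 1 only holds its own lock, so a second worker can still acquire and run the every-minute Task 2.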