SchedulerBundle

feat(worker): Parallel execution

Open checkphi opened this issue 5 years ago • 13 comments

Currently it seems like all scheduled jobs are executed sequentially. Would it be possible to leverage the messenger system somehow so that each job (or a defined number of jobs) can run in parallel?

checkphi avatar Apr 15 '21 12:04 checkphi

Hi @checkphi 👋🏻

Yes, all tasks are executed sequentially. To be honest, Messenger cannot natively consume messages in parallel, so IMHO there's no benefit in building something on top of Messenger.

I have plans to work on something that can execute tasks in parallel thanks to Fibers in PHP 8.1, but as Fibers don't exist in 8.0, that forces me to enable it only for >=8.1 🙁
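For context, here's a minimal sketch of what Fiber-based task execution could look like in plain PHP >= 8.1 (no bundle code, task names are illustrative). Note that Fibers give cooperative concurrency, i.e. tasks interleave on one thread; they do not provide true parallelism on their own:

```php
<?php
// Sketch: interleaving two tasks with PHP 8.1 Fibers.
// Each fiber runs only while it is resumed, then yields control back.

$makeTask = static function (string $name, int $steps): Fiber {
    return new Fiber(static function () use ($name, $steps): void {
        for ($i = 1; $i <= $steps; $i++) {
            echo "$name step $i\n";
            Fiber::suspend(); // hand control back to the scheduler loop
        }
    });
};

$fibers = [$makeTask('task-a', 2), $makeTask('task-b', 2)];

// Naive round-robin scheduler: resume each fiber until all terminate.
while ($fibers !== []) {
    foreach ($fibers as $key => $fiber) {
        $fiber->isStarted() ? $fiber->resume() : $fiber->start();
        if ($fiber->isTerminated()) {
            unset($fibers[$key]);
        }
    }
}
```

The output interleaves `task-a` and `task-b` steps, which is the benefit over purely sequential execution even without multiple processes.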

What do you have in mind to bring parallel execution?

Guikingone avatar Apr 15 '21 14:04 Guikingone

Maybe I understood the documentation and concept of Messenger wrong, but is it not possible to run multiple messenger:consume workers at the same time? Then whichever worker is "free" picks up the next message.

In other words, if we start 3 additional "scheduler workers" then we could process 3 scheduled jobs in parallel?

checkphi avatar Apr 15 '21 14:04 checkphi

Thanks to Supervisord, you can trigger multiple workers (the same concept applies to this bundle); each worker will consume the messages available at that moment.
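For reference, a Supervisor program entry along these lines could look like the following (the program name, paths, and `numprocs` value are illustrative assumptions, not taken from the bundle's docs):

```ini
; Illustrative: run three scheduler workers under Supervisor.
[program:scheduler-worker]
command=php /srv/app/bin/console scheduler:consume --wait
numprocs=3
process_name=%(program_name)s_%(process_num)02d
autostart=true
autorestart=true
```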

In other words, if we start 3 additional "scheduler workers" then we could process 3 scheduled jobs in parallel?

In theory, yes. Keep in mind that a lock is applied to every task before executing it, so it may depend on the transport in use (Doctrine uses a resource lock, and the same applies to Redis).

Guikingone avatar Apr 15 '21 15:04 Guikingone

So you're saying that I should simply start multiple scheduler:consume --wait workers, which would then almost result in the desired behaviour?

checkphi avatar Apr 15 '21 16:04 checkphi

Yes, for now I don't have another solution, sadly 🙁

If you have one, feel free to contribute and we'll see if it can be integrated.
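Concretely, the workaround discussed above amounts to starting several workers by hand (the console path is an assumption; Supervisor does the same thing but also restarts crashed workers):

```shell
# Illustrative: three workers consuming due tasks concurrently.
bin/console scheduler:consume --wait &
bin/console scheduler:consume --wait &
bin/console scheduler:consume --wait &
wait   # keep the shell attached to the background workers
```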

Guikingone avatar Apr 16 '21 06:04 Guikingone

I'm thinking about multiple solutions which I would be happy to discuss. Can I contact you somehow? I think the discussions would exceed the scope of this issue.

checkphi avatar Apr 23 '21 09:04 checkphi

Hi @checkphi 👋🏻

Consider using discussions: https://github.com/Guikingone/SchedulerBundle/discussions

This way, we can track the discussion and open related issues if required 🙂

Guikingone avatar Apr 23 '21 11:04 Guikingone

https://github.com/Guikingone/SchedulerBundle/discussions/65

checkphi avatar Apr 23 '21 15:04 checkphi

Hi @Guikingone, I'm using Doctrine for the transport and store, and I would like to consume multiple tasks by calling multiple scheduler:consume processes at the same time, but the first process locks all the tasks, so the next scheduler:consume process can't consume anything. How can I configure the scheduler to lock only the consumed task?

grimgit avatar Mar 21 '22 16:03 grimgit

Hi @grimgit 👋🏻

Actually, the call to scheduler:consume locks all the tasks that can be locked: https://github.com/Guikingone/SchedulerBundle/blob/main/src/Worker/AbstractWorker.php#L200.

How I can configure the scheduler to lock only the consumed task?

What's the idea behind this approach? If I'm right, locking only the consumed tasks isn't useful, as they're "consumed" and not retrieved until the next minute 🤔

Guikingone avatar Mar 21 '22 16:03 Guikingone

Suppose we have a very long task, for example a 1-hour task: all the other tasks are locked and will be delayed by 1 hour, even if I'm running multiple consumers. Instead, if the consumer were able to lock only the running task, other consumers could lock and consume the other available tasks.

grimgit avatar Mar 22 '22 08:03 grimgit

Suppose we have a very long task, for example a 1-hour task: all the other tasks are locked and will be delayed by 1 hour, even if I'm running multiple consumers.

Actually, tasks are locked "per process", depending on the store that you're using.

Here's an explanation of what happens internally:

  • Process A calls scheduler:consume; it locks all the tasks that can be locked at that moment.
  • Process B calls scheduler:consume; if tasks are available (or "due", for the correct wording), it consumes them and stops (you can wait for tasks using the --wait option).

If process B detects that tasks are locked, it cannot unlock them and waits until they're released by A. So, by definition, we can say that B consumes tasks one by one and that A locks them one by one; I agree that we lock tasks one by one, but all at the same time 🙁

Instead, if the consumer were able to lock only the running task, other consumers could lock and consume the other available tasks.

I think I get the point that you're mentioning (stop me if I'm wrong):

  • Process A calls scheduler:consume; it locks the tasks one by one, if it can acquire them.
  • Process B calls scheduler:consume; if tasks are available (or "due", for the correct wording), it consumes them one by one by locking them, then stops (you can wait for tasks using the --wait option).

Am I right?

If I am, then yes, it could be improved; feel free to submit a PR, and we can discuss the implementation / improvements 🙂

Guikingone avatar Mar 22 '22 08:03 Guikingone

That's correct: process A should lock only the first available task before starting to consume it, so that process B can acquire the lock on another task. Suppose we have two tasks: Task 1, a time-consuming task that can run for an hour, and Task 2, a high-frequency task such as a mail queue processor that should run every minute. With the current lock design, Task 2 can't be consumed by another process while Task 1 is running, because the task is locked by the first process (the one consuming Task 1), so all emails will be delayed.

So the consumer process should:

  • While there is an available task
    • Acquire the task lock
    • Consume the task
    • Release the task lock
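A sketch of that loop using the symfony/lock component (the `$dueTasks` list and the `getName()` / `run()` hooks are illustrative assumptions, not SchedulerBundle's actual worker API):

```php
<?php
// Sketch: per-task locking so concurrent workers can split the work.
use Symfony\Component\Lock\LockFactory;
use Symfony\Component\Lock\Store\FlockStore;

$factory = new LockFactory(new FlockStore());

foreach ($dueTasks as $task) {               // $dueTasks: assumed list of due tasks
    $lock = $factory->createLock('task-'.$task->getName());
    if (!$lock->acquire()) {                 // non-blocking: skip tasks another worker holds
        continue;
    }
    try {
        $task->run();                        // assumed execution hook
    } finally {
        $lock->release();                    // free the task for the next worker
    }
}
```

With this shape, a worker stuck on a long task only holds that one lock, so a second worker skips it and picks up the remaining due tasks instead of blocking.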

grimgit avatar Mar 22 '22 09:03 grimgit