delayed_job icon indicating copy to clipboard operation
delayed_job copied to clipboard

Allow identifier with multiple workers

Open toncid opened this issue 1 year ago • 7 comments

Hello,

Using identifiers with multiple workers has been prevented for 14 years at least. Maybe it is time to reconsider it.

The PR #1190 loosens up that restriction and allows multiple identified job workers to be started with the following process name format:

delayed_job(.<identifier>)(.<worker_index>)

Both fields are optional, so the default process name remains unchanged (delayed_job).

toncid avatar Jun 05 '23 11:06 toncid

Why are you trying to do this?

albus522 avatar Jun 05 '23 13:06 albus522

Why are you trying to do this?

In order to better utilize a worker node, we would like to spawn as many workers as possible (until a CPU or memory limit is hit).

Yes, I know that we can call the delayed_job -i <identifierX> start command N times, but this is cleaner.

toncid avatar Jun 05 '23 15:06 toncid

Why are you specifying an identifier?

albus522 avatar Jun 05 '23 15:06 albus522

Why are you specifying an identifier?

Because we have multiple nodes running delayed_job workers, so we need job worker identifiers to avoid nodes stepping on each other's toes.

More context for using identifiers can be found e.g. in #866.

toncid avatar Jun 05 '23 15:06 toncid

You are referencing a very old issue that had more than an identifier at play. Delayed Job automatically handles multiple nodes just fine except in rare non-standard scenarios. Are you not able to give each node a unique hostname?

albus522 avatar Jun 05 '23 15:06 albus522

Thanks, that's a good question. I have checked in the DB and I see that jobs are locked as expected (with the identifier and node's hostname + PID):

locked_by: "delayed_job.1685897776 host:ip-172-28-22-203.eu-west-1.compute.internal pid:2618",

Ought to be unique enough.

We have implemented identifiers back in January 2018. One-off jobs were fired multiple times back then, so adding identifiers helped. I do not have much information on why the locking mechanism did not work well back then.

toncid avatar Jun 05 '23 15:06 toncid

Setting my particular use case aside, what about the PR? Looks good?

I don't see much harm with enabling it, especially since a Bash workaround is supported anyway:

delayed_job -i 123456.0 start
delayed_job -i 123456.1 start
delayed_job -i 123456.2 start
delayed_job -i 123456.3 start

toncid avatar Jun 07 '23 06:06 toncid