
Question: How can one self-re-enqueue a job when using _job_id?

Open joshwilson-dbx opened this issue 1 year ago • 4 comments

I have a use case where, once a job completes, I would like it to continuously re-schedule itself for another run at some point in the future. I would also like to ensure that only one instance is queued or running at any given time, so I'm using the _job_id parameter when enqueuing. I cannot use the cron functionality, as the delay time is somewhat dynamic and not easily translated to cron.

Options that I've explored so far:

  1. Simply call redis.enqueue_job(..., _job_id=ctx['job_id']) from within the job itself
    • This doesn't work, since the current job is still active and blocks the new enqueue
  2. Raise a Retry exception from within the job after the work has completed
    • This seems like an abuse of this functionality
    • Will likely run into trouble with the _expires and max_tries settings
  3. Set keep_result=0 and enqueue a second job (different name) with a small delay that in turn re-enqueues the original job again
    • Works but is cumbersome and may introduce a race condition
    • Needs a second job function just to enqueue the primary job again
  4. Set keep_result=0 and re-enqueue in the after_job_end hook, by which point the job and result keys are gone so the re-enqueue can succeed (a rough sketch of this option follows below)
    • Will probably need a dedicated queue and workers for these jobs
    • Risks the programmer error of enqueuing in the wrong queue

Is there a better way to do this?
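
For concreteness, here is roughly what option 4 looks like for me. This is only a sketch: it assumes the after_job_end hook receives the same ctx keys as the job (ctx['job_id'] and ctx['redis']), and poll_upstream / compute_next_delay are placeholder names for my actual work and scheduling logic.

```python
# Sketch of option 4 (assumptions noted above; names are placeholders).
from datetime import timedelta

from arq.connections import RedisSettings


def compute_next_delay() -> timedelta:
    # Hypothetical stand-in for the dynamic scheduling logic.
    return timedelta(seconds=30)


async def poll_upstream(ctx):
    ...  # the actual work


async def after_job_end(ctx):
    # With keep_result=0 the result key should already be gone by the time
    # this hook runs, so re-using the same _job_id is no longer blocked.
    await ctx['redis'].enqueue_job(
        'poll_upstream',
        _job_id=ctx['job_id'],
        _defer_by=compute_next_delay(),
    )


class WorkerSettings:
    functions = [poll_upstream]
    after_job_end = after_job_end
    keep_result = 0
    redis_settings = RedisSettings()
```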

joshwilson-dbx · Oct 12 '23 23:10

Hi @joshwilson-dbx, I think we have somewhat similar use cases (one particular job that can be triggered multiple times, only one should exist at any point in time, and the last one is the only one I am interested in). I thought I'd reference my issue here https://github.com/samuelcolvin/arq/issues/394 to vote up this scenario, and also in case it helps to read about the problems I encountered with aborting/deleting.

gerazenobi · Oct 19 '23 13:10

Thanks for taking a look @gerazenobi. I agree our issues both arise from not being able to manage redundant job behavior well with the current API. I don't know what the solution would be yet, but perhaps we could add additional parameters to enqueue_job() to specify how to manage these collisions.

It doesn't seem like this kind of configuration would belong on the worker side of things. Maybe it belongs on the Job class?

I also wonder if there's a need to formalize the concept of a Queue in the code. Right now it looks like it's just a convention, established by passing a queue name around between the various job and worker functions.
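
Purely hypothetically (none of this exists in arq today), I'm imagining something along these lines:

```python
# Hypothetical only: _on_conflict is a made-up parameter, sketched to show the
# kind of collision policy being discussed, not an existing arq feature.
job = await redis.enqueue_job(
    'poll_upstream',
    _job_id='poll-upstream-singleton',
    _on_conflict='replace',  # made-up values: 'skip' (current behaviour) or 'replace'
)
```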

joshwilson-dbx · Oct 19 '23 15:10

@joshwilson-dbx In my mind, ideally and more or less following the current API, enqueue_job would accept not only _job_id but also something like override=true or similar, allowing any previous job to be ignored. The tricky part would be if the older job has already started (I am not familiar with how aborting would behave, or whether it is even possible).

In our use case (the linked ticket), we work around it by tracking ourselves (in redis as well) which job is the last one we are actually interested in: subsequent jobs of the same type keep overwriting this value, and whenever we need the result of the job we get the arq job id from this saved value.
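
A rough sketch of that workaround (the key name latest_poll_job_id and the function names are made up; the point is just that the key always holds the id of the newest job):

```python
from arq.jobs import Job

LATEST_JOB_KEY = 'latest_poll_job_id'  # hypothetical key name


async def enqueue_latest(redis, *args):
    # Enqueue a new job and record its id as "the one we care about";
    # any older id stored under this key is simply overwritten.
    job = await redis.enqueue_job('poll_upstream', *args)
    if job is not None:
        await redis.set(LATEST_JOB_KEY, job.job_id)
    return job


async def latest_result(redis):
    # Look up the last recorded arq job id and fetch its result.
    job_id = await redis.get(LATEST_JOB_KEY)
    if job_id is None:
        return None
    return await Job(job_id.decode(), redis).result()
```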

gerazenobi · Oct 20 '23 15:10

I describe an (experimental?) approach to re-enqueueing using JobStatus but without _job_id here: https://github.com/samuelcolvin/arq/issues/457, although I'm not sure whether it would actually help in your case too.
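
Very roughly, the idea is to gate the enqueue on the status of the previous job. This is only a generic sketch of that kind of check (assuming you have the previous job id to hand), not necessarily exactly what the linked issue does:

```python
from arq.jobs import Job, JobStatus


async def enqueue_if_not_running(redis, previous_job_id):
    # Only enqueue a fresh job if the previous one is finished or unknown.
    status = await Job(previous_job_id, redis).status()
    if status in (JobStatus.complete, JobStatus.not_found):
        return await redis.enqueue_job('poll_upstream')
    return None
```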

davidhuser · May 18 '24 15:05