procrastinate: Retry stalled "retry stalled jobs" jobs :)

We have a retry_stalled_jobs job, pretty much per the docs:

@app.periodic(cron="* * * * *", queueing_lock="uptool.tasks.retry_stalled_jobs", lock="uptool.tasks.retry_stalled_jobs")
@app.task()
async def retry_stalled_jobs(timestamp: int):
    stalled_jobs = await app.job_manager.get_stalled_jobs()
    for job in stalled_jobs:
        await app.job_manager.retry_job(job)

But of course that job can get stalled too:

And now everything is stalled.

What is the recommended workaround?

Why does the user even need to handle these scenarios manually? Shouldn't Procrastinate take care of its internal state (including detecting stalled jobs) automatically?

Jun 02 '25 21:06 jakajancar

Hey @jakajancar,

You indeed spotted an oversight in the documentation.

What I can suggest is to:

- remove the lock for this retry stalled jobs task
- filter out this task from being retried in the list of returned stalled jobs
- mark any stalled job that corresponds to this task as failed
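
A minimal sketch of those three steps, in case it helps. It assumes the task is registered under the name uptool.tasks.retry_stalled_jobs (inferred from the lock name in the original snippet) and that the job manager exposes finish_job(job, status, delete_job); please check both against the Procrastinate version you have installed. The helper name handle_stalled_jobs is made up for illustration.

```python
# Assumed task name, inferred from the lock name used in the original snippet.
RETRY_TASK_NAME = "uptool.tasks.retry_stalled_jobs"


async def handle_stalled_jobs(job_manager, failed_status, own_task_name=RETRY_TASK_NAME):
    """Retry stalled jobs, except stalled runs of the retry task itself.

    job_manager   -- app.job_manager (anything with the same async methods)
    failed_status -- the status value meaning "failed"; assumed to be
                     procrastinate.jobs.Status.FAILED in real use
    """
    stalled_jobs = await job_manager.get_stalled_jobs()
    for job in stalled_jobs:
        if job.task_name == own_task_name:
            # A stalled run of the retry task: mark it failed instead of retrying.
            # Assumption: finish_job accepts (job, status, delete_job).
            await job_manager.finish_job(job, status=failed_status, delete_job=False)
        else:
            await job_manager.retry_job(job)


# Possible wiring (note: queueing_lock only, no lock):
#
# @app.periodic(cron="* * * * *", queueing_lock="uptool.tasks.retry_stalled_jobs")
# @app.task()
# async def retry_stalled_jobs(timestamp: int):
#     await handle_stalled_jobs(app.job_manager, procrastinate.jobs.Status.FAILED)
```

Keeping the loop in a plain helper means it can be exercised with a fake job manager, without a database.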

We need to amend the documentation.

If that doesn't work, please report back.

There is certainly value in integrating this into the library.

That might require some more thinking about what the default behaviour should be.

On the other hand, making the consumer of this library responsible for retrying stalled jobs yields the most flexibility.

Jun 03 '25 09:06 onlyann

Thanks @onlyann. I resolved the issue, so please just consider this a feature request: Procrastinate should handle its own "garbage collection".

Jun 03 '25 19:06 jakajancar

The docs only use queueing_lock, whereas I had both queueing_lock and lock. If I hadn't had the latter, I think the problem wouldn't have occurred, so closing.

Jul 25 '25 22:07 jakajancar