Make registry cleaning lock expiry configurable

Open eswolinsky3241 opened this issue 1 year ago • 2 comments

We run our workers on Kubernetes using a cluster autoscaler. This results in a decent number of abandoned jobs, and we use a custom exception handler to requeue jobs that get abandoned. Given the frequency of jobs being abandoned, we've set the maintenance_interval of our workers to one minute.

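import os
from sys import argv

from rq import Worker

# `client` (Redis connection) and `abandoned_job_handler` are defined elsewhere.
w = Worker(
    argv[1:],
    name=os.environ["HOSTNAME"],
    connection=client,
    exception_handlers=[abandoned_job_handler],
    maintenance_interval=60,  # seconds
)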

However, on inspection, this interval appears to be separate from how often the clean_registries method actually runs. That method runs only if a clean_registries lock is acquired, and that lock expires only after 15 minutes, so our abandoned jobs are not being requeued as frequently as we'd like.

https://github.com/rq/rq/blob/a8209391fffe3bad195264e6ed22724e36ae1fb5/rq/queue.py#L251
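
To illustrate the effect described above, here is a minimal in-memory sketch of a lock with a fixed expiry gating a maintenance task. It is not rq's code: `FakeLockStore`, `simulate`, the key name, and the 900-second TTL are all hypothetical stand-ins based on the "about 15 minutes" behaviour described in this issue.

```python
# Simplified in-memory model of a "set if absent, with expiry" lock.
# NOT rq's implementation; the names and the 900 s TTL are assumptions
# taken from the behaviour described in this issue.

LOCK_TTL = 15 * 60          # seconds the cleaning lock lives
MAINTENANCE_INTERVAL = 60   # seconds between worker maintenance runs


class FakeLockStore:
    """Keeps one expiry timestamp per lock key."""

    def __init__(self):
        self._expires_at = {}

    def acquire(self, key, now, ttl):
        # Succeeds only if the key is absent or its TTL has elapsed.
        if self._expires_at.get(key, 0) > now:
            return False
        self._expires_at[key] = now + ttl
        return True


def simulate(total_seconds):
    """Return the simulated times at which registries were cleaned."""
    store = FakeLockStore()
    cleaned_at = []
    for now in range(0, total_seconds, MAINTENANCE_INTERVAL):
        # Maintenance is attempted every minute, but cleaning needs the lock.
        if store.acquire("clean_registries:default", now, LOCK_TTL):
            cleaned_at.append(now)
    return cleaned_at


print(simulate(60 * 60))  # one simulated hour -> [0, 900, 1800, 2700]
```

In this model, 60 maintenance attempts in an hour produce only four cleanings, because the lock outlives the maintenance interval.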

Is it possible to make this a constructor argument on the Worker so that queues are cleaned more frequently?

eswolinsky3241, Jan 10 '24 16:01

Yeah, so apparently there's a bug where the cleaning lock is not deleted after maintenance is done. There's a PR addressing this issue, but it's not finished yet.

If you want to submit a quick PR to delete the lock once the maintenance tasks are done, that would be great.
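
A rough sketch of the "delete the lock once maintenance is done" idea, using an in-memory stand-in for the lock store. This is only an illustration under assumed names (`FakeLockStore`, `run_maintenance`, `LOCK_KEY`, a 900-second TTL); it is not the change in any rq PR.

```python
# In-memory illustration of releasing a lock after the guarded work finishes.
# All names here are hypothetical; this is not rq's implementation.

LOCK_KEY = "clean_registries:default"
LOCK_TTL = 15 * 60  # assumed expiry, kept as a safety net


class FakeLockStore:
    """Keeps one expiry timestamp per lock key."""

    def __init__(self):
        self._expires_at = {}

    def acquire(self, key, now, ttl):
        # Succeeds only if the key is absent or its TTL has elapsed.
        if self._expires_at.get(key, 0) > now:
            return False
        self._expires_at[key] = now + ttl
        return True

    def release(self, key):
        # Deleting the key lets the next maintenance run acquire it at once.
        self._expires_at.pop(key, None)


def run_maintenance(store, now, clean):
    """Run `clean` under the lock; return True if cleaning was attempted."""
    if not store.acquire(LOCK_KEY, now, LOCK_TTL):
        return False
    try:
        clean()
    finally:
        # Release even if cleaning raised, so the lock is not left behind.
        store.release(LOCK_KEY)
    return True


store = FakeLockStore()
runs = []
for now in range(0, 60 * 60, 60):  # one attempt per simulated minute
    run_maintenance(store, now, lambda: runs.append(now))
print(len(runs))  # 60
```

With the release in place, every one-minute maintenance run in this model gets to clean. The TTL still matters if a worker dies while holding the lock, since nothing would release it otherwise.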

selwin, Jan 13 '24 09:01

@selwin PR submitted: https://github.com/rq/rq/pull/2024

eswolinsky3241, Jan 18 '24 02:01