pg-boss

Extend the expiry of jobs while working on them

Open boredland opened this issue 3 months ago • 4 comments

Hi there! Thank you so much for this fantastic project!

I have a scenario where it is nearly impossible to know beforehand how long a certain job will take, and the range is wide (1 second vs. 3 hours, for example). While I could probably use the maximum as my expireInSeconds, I am wondering whether I could extend the expiry while working on the job.

From working with nsq, for example, I know an API, msg.touch(), that extends the expiry by the configured expiration whenever it is called. This helped me a lot in such cases.
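
To make the semantics I mean concrete, here is a minimal in-memory sketch of touch-style expiry. The Lease class and its injectable clock are hypothetical names for illustration only; this is neither nsq's nor pg-boss's API.

```typescript
// Hypothetical in-memory model of touch-style expiry; not nsq's or pg-boss's API.
class Lease {
  private readonly timeoutMs: number;
  private readonly now: () => number;
  private deadline: number;

  // `now` is injectable so the behaviour can be checked without real timers.
  constructor(timeoutMs: number, now: () => number = Date.now) {
    this.timeoutMs = timeoutMs;
    this.now = now;
    this.deadline = now() + timeoutMs;
  }

  // Reset the deadline to "now + configured timeout".
  // Touching an already expired lease is rejected rather than revived.
  touch(): void {
    if (this.isExpired()) {
      throw new Error("lease already expired");
    }
    this.deadline = this.now() + this.timeoutMs;
  }

  isExpired(): boolean {
    return this.now() >= this.deadline;
  }
}
```

A job that takes 1 second never needs to call touch(), while a 3-hour job would call it periodically at an interval shorter than the timeout, so the timeout itself can stay small.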

boredland • Sep 14 '25 08:09

The bias here is that long-running jobs are problematic, since they sometimes correlate to long-running functions. Why are they problematic? I can't claim to be the authority on why most people feel this way, but in my experience it has always come down to the difficulty of keeping infrastructure running the entire time (when it isn't otherwise required) and the difficulty of resuming a task that fails midway for reasons unrelated to the task itself (network failures, etc.).

For this reason, you may want to switch to more of an async job pattern for the work, such as a "task-started" job followed by deferred "task-monitor" jobs that check the task's state and determine when it has finished.
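
A rough sketch of that pattern, assuming the real work runs somewhere that can be polled. The Queue and TaskStore interfaces, the delaySeconds parameter, and the 30-second interval are all hypothetical stand-ins, not pg-boss's API; map them to whatever send and deferral options your queue provides.

```typescript
// All names here are hypothetical; this is an in-memory sketch, not pg-boss's API.
type TaskState = "running" | "done" | "failed";

interface Queue {
  // Enqueue a job, optionally deferred by delaySeconds.
  send(name: string, data: { taskId: string }, delaySeconds?: number): Promise<void>;
}

interface TaskStore {
  start(taskId: string): Promise<void>;      // kick off the real work elsewhere
  state(taskId: string): Promise<TaskState>; // look up its current state
}

const MONITOR_DELAY_SECONDS = 30; // arbitrary value for the sketch

// "task-started": begin the work, then schedule the first deferred check.
async function handleTaskStarted(queue: Queue, tasks: TaskStore, taskId: string): Promise<void> {
  await tasks.start(taskId);
  await queue.send("task-monitor", { taskId }, MONITOR_DELAY_SECONDS);
}

// "task-monitor": a short job that checks the state and re-schedules itself
// only while the task is still running.
async function handleTaskMonitor(queue: Queue, tasks: TaskStore, taskId: string): Promise<TaskState> {
  const state = await tasks.state(taskId);
  if (state === "running") {
    await queue.send("task-monitor", { taskId }, MONITOR_DELAY_SECONDS);
  }
  return state;
}
```

Each queue job stays short no matter how long the underlying task runs, so a small expiry works for both the 1-second and the 3-hour case.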

timgit • Sep 23 '25 21:09

I wouldn't say that long-running is bad. It's more that I don't know the duration beforehand and would like some way for the job itself to signal that it is still actively working on something. From the outside, that is much harder to determine.

boredland • Sep 24 '25 23:09

I'm kind of working on a similar issue. Does pg-boss have any kind of heartbeat mechanism for jobs, so that if a worker locks up and fails to check in, the job can be marked as failed sooner than expireInSeconds?

bcomnes • Oct 03 '25 05:10

There is a wip event you can listen to that reports all the workers and their metadata. Each worker's polling loop is mostly wrapped in try/catch, and an error event is emitted from the catch block when something goes wrong.
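
As a rough sketch of using that metadata to spot a worker that looks stuck: the WorkerInfo field names below are an assumption about the wip payload and should be checked against the pg-boss docs, and findStuckWorkers is a hypothetical helper, not part of the library.

```typescript
// Field names are an assumption about the wip payload; verify against the pg-boss docs.
interface WorkerInfo {
  id: string;
  name: string;
  lastJobStartedOn: number | null; // epoch milliseconds
  lastJobEndedOn: number | null;   // epoch milliseconds
  lastError: unknown;
}

// Return workers whose most recent job started more than maxJobMs ago
// and has not ended since.
function findStuckWorkers(workers: WorkerInfo[], nowMs: number, maxJobMs: number): WorkerInfo[] {
  return workers.filter((w) => {
    if (w.lastJobStartedOn === null) return false; // never ran a job
    const stillRunning = w.lastJobEndedOn === null || w.lastJobEndedOn < w.lastJobStartedOn;
    return stillRunning && nowMs - w.lastJobStartedOn > maxJobMs;
  });
}

// Possible wiring (untested):
//   boss.on("wip", (workers) => console.warn(findStuckWorkers(workers, Date.now(), 60_000)));
```

I'm not sure how often wip fires while a worker is blocked, so it may be necessary to keep the last payload and re-check it on a timer rather than only inside the event handler.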

timgit • Oct 03 '25 15:10