bullmq
bullmq copied to clipboard
Add job.runAt property to expose timestamp that job is supposed to run at
Fixes: #1728
- feat(job): add runAt field to job key in redis
- feat(job): add runAt property to Job
- test(job): extend job tests to check runAt correctness
Thank you for the PR.
I think that it would be better to retrieve runtAt lazily, as it is difficult to keep it in sync at all times. Since runAt can be calculated from the delayed set, we can just return it when it is actually needed, for example when calling getJobs("delayed"), maybe Job could have as well an async method getRunAt() that can be called when needed.
@manast While that works for jobs that are in the delayed state, it doesn't work for calculating something like a "job wait time" metric, i.e. how long after after a job's runAt that it actually starts running, because there's no timestamp for when a job was enqueued into waiting, and once the job is removed from the delayed zset, there's no score to look at either.
I don't think storing the job's runAt on the job key and keeping it in sync is any worse than the way delay is stored on the job key today, and I think the way this PR does it, it's a more useful property and stays in sync better than job.delay does.
@hyperair could we delete the "delay" field on the job then as it does not seem to be useful? Regarding runAt, I think it could be misleading as it should not represent "when" the job will be run, it represents how much the job will wait at least before it is run, or the timestamp from which it could run, but for different reasons it could run after that, so "processedOn" and "runAt" would not be the same most of the time.
@hyperair then the other question is, should "runAt" be cleared after the job has been moved from the delayed set to the wait list?
so in a sense "runAt" would be the same as the score in the delay set, just a cached value that makes it more convenient when getting the jobs.
@manast While that works for jobs that are in the delayed state, it doesn't work for calculating something like a "job wait time" metric, i.e. how long after after a job's runAt that it actually starts running, because there's no timestamp for when a job was enqueued into
waiting, and once the job is removed from thedelayedzset, there's no score to look at either.
Ok, this seems to be quite a niche use-case, since usually you are mostly interested in the total time it took from when the job was added to a queue up to when it was completed 🤔
@manast
could we delete the "delay" field on the job then as it does not seem to be useful?
Yep, it'll be an API break, but if you're okay with it I'll drop the delay field.
Regarding runAt, I think it could be misleading as it should not represent "when" the job will be run, it represents how much the job will wait at least before it is run, or the timestamp from which it could run, but for different reasons it could run after that, so "processedOn" and "runAt" would not be the same most of the time.
Hmm, that's true. Will one of the following work, or do you have suggestions for a better name?
delayedUntil-- might not be good either since it implies that the job is delayed, which may not be the case for a job that was added without delaynotBeforescheduledFor
then the other question is, should "runAt" be cleared after the job has been moved from the delayed set to the wait list? so in a sense "runAt" would be the same as the score in the delay set, just a cached value that makes it more convenient when getting the jobs.
Nope, the idea is for runAt (or alternative name) to be the timestamp from which a job should be available for execution
- For
delayed -> waiting -> active, it'll be the same as the delayedTimestamp encoded in the score of the job - For
job.changeDelayorjob.moveToDelayed, same as above - For
waiting -> active, i.e. jobs scheduled without delay, it should be the same asjob.timestamp - For
(completed|failed) -> waiting -> active, it should be the timestamp at whichjob.retry()was called
Ok, this seems to be quite a niche use-case, since usually you are mostly interested in the total time it took from when the job was added to a queue up to when it was completed :thinking:
In my case I'm generating two timing metrics for this:
job_wait_time-- the time a job spentwaitingin queue, excluding the time beforerunAt(useful for troubleshooting queue backlog, e.g. insufficient workers to process the jobs, or jobs not getting promoted in a timely fashion..)job_execution_time-- the time a job spent actually executing the job (useful for troubleshooting other kinds of trouble, e.g. slow API calls, other kinds of bugs)
A single combined metric summing the above two durations up makes it more difficult to differentiate between these two kinds of problems, but even that's not really possible to generate with bullmq master right now.
could we delete the "delay" field on the job then as it does not seem to be useful?
Yep, it'll be an API break, but if you're okay with it I'll drop the delay field.
As it turns out, there's one purpose for delay that runAt doesn't cover -- updateParentDepsIfNeeded uses it to schedule the parent into delayed state instead of waiting
This would be a very useful addition for me as I cannot accurately tell when a job will after calling job.changeDelay
Would also love this feature since I use changeDelay all the time. Using bull board and its quite bad to not know when a job will be executed.
@lindesvard I will try to improve it somehow.