Why are we cleaning the cron entries on shutdown?
We have a few different types of periodic jobs running (monthly, weekly, daily).
But every time our services (pods) restart, the progress for these jobs is lost and they have to start from the beginning.
I was researching this and came across the commit below, and I wanted to know how relevant it is and why it was designed this way. Maybe resuming was difficult to handle, so it was left out until it could be fixed.
https://github.com/hibiken/asynq/commit/6529a1e0b1289d01f13229f450a5a0904e162a2c
In general, how can we resolve losing the progress of these entries?
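For context, here is roughly how we register these entries (a simplified sketch; the task type names and Redis address are placeholders, not our actual values):

```go
package main

import (
	"log"

	"github.com/hibiken/asynq"
)

func main() {
	scheduler := asynq.NewScheduler(
		asynq.RedisClientOpt{Addr: "localhost:6379"}, // placeholder address
		nil,
	)

	// Interval-based entries, roughly daily / weekly / monthly.
	if _, err := scheduler.Register("@every 24h", asynq.NewTask("report:daily", nil)); err != nil {
		log.Fatal(err)
	}
	if _, err := scheduler.Register("@every 168h", asynq.NewTask("report:weekly", nil)); err != nil {
		log.Fatal(err)
	}
	if _, err := scheduler.Register("@every 720h", asynq.NewTask("report:monthly", nil)); err != nil {
		log.Fatal(err)
	}

	if err := scheduler.Run(); err != nil {
		log.Fatal(err)
	}
}
```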
clearHistory ultimately clears the enqueue event history of all scheduler entries; I don't think this should be an issue, as it just prevents the db from growing unnecessarily. The heartbeater does clear the scheduler entries during shutdown. I am assuming the task entries are somehow utilized within your code?
Could you describe how you expected the scheduler to work? Are these long running jobs?
it just prevents the db from growing unnecessarily
I understand that, but maybe we should add some kind of condition so that queued and unfinished jobs are not removed, because otherwise we lose the progress?
I am assuming the task entries are somehow utilized within your code?
Also, what do you mean by this?
Here is how we are leveraging asynq:
Let's say that for the daily periodic job, a task needs to be handled 55 minutes from now. The pod restarts, the scheduler starts counting the 24 hours again from zero, and the task is no longer triggered after those 55 minutes.
The alternative is to use fixed times, like every day at 6:00 PM, but that just does not suit the business needs.
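One workaround we are considering (just a sketch, not something asynq does out of the box): persist the next run time in our own storage and enqueue a one-off task with ProcessAt on startup, so the countdown survives restarts. loadNextRun and saveNextRun below are hypothetical helpers backed by our own database:

```go
package jobs

import (
	"time"

	"github.com/hibiken/asynq"
)

// loadNextRun / saveNextRun are hypothetical helpers backed by our own
// database; they are not part of asynq.
func loadNextRun(taskType string) (time.Time, error) { return time.Time{}, nil }
func saveNextRun(taskType string, t time.Time) error { return nil }

// scheduleDaily enqueues a one-off task at the persisted next-run time instead
// of registering an "@every 24h" entry, so a pod restart does not reset the
// countdown.
func scheduleDaily(client *asynq.Client) error {
	next, err := loadNextRun("report:daily")
	if err != nil || !next.After(time.Now()) {
		next = time.Now().Add(24 * time.Hour)
		if err := saveNextRun("report:daily", next); err != nil {
			return err
		}
	}
	_, err = client.Enqueue(asynq.NewTask("report:daily", nil), asynq.ProcessAt(next))
	return err
}
```

The task handler would then call saveNextRun again when it finishes, so the next occurrence is scheduled from there. It works, but it duplicates what the scheduler is supposed to do for us.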
A task needs to be handled 55 minutes from now. The pod restarts, the scheduler starts counting the 24 hours again from zero, and the task is no longer triggered after those 55 minutes.
Thanks for clarifying. I'll investigate and try and reproduce this issue.
Did you find a way to prevent deletion? @Kenan7 @kamikazechaser
Unfortunately not. @Haji-sudo
Same issue here; we have exactly the same use case with monthly / weekly jobs. We don't mind losing the progress of ongoing jobs, as we keep track of progress at the job level. But we would expect the job to run again on startup, since it has not finished successfully.
This is especially visible with long-duration jobs scheduled with @every week or @every month. We have jobs running for hours or even days. Let's say that after one day the server restarts; we would then have to wait another week or month before the job runs again.
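For now, the best we can see is a startup check along these lines (again only a sketch; lastRunCompleted is a hypothetical helper backed by our own job-level progress tracking, not part of asynq):

```go
package jobs

import "github.com/hibiken/asynq"

// lastRunCompleted is a hypothetical helper backed by our own job-level
// progress tracking; it is not part of asynq.
func lastRunCompleted(taskType string) (bool, error) { return false, nil }

// reEnqueueUnfinished runs once on startup: if the previous scheduled run did
// not finish, enqueue the task again immediately instead of waiting for the
// next "@every week" / "@every month" tick.
func reEnqueueUnfinished(client *asynq.Client, taskType string) error {
	done, err := lastRunCompleted(taskType)
	if err != nil {
		return err
	}
	if !done {
		_, err = client.Enqueue(asynq.NewTask(taskType, nil))
	}
	return err
}
```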