procrastinate
Use queuing lock as debouncer
I have a CRUD app with very frequent updates to a resource (within seconds). And I want to send an email 15 minutes after the last edit.
Am I correct in thinking I need to manually update the database and "push it back"? Or is there a built-in way to do this?
Thanks for an amazing library (and typed!).
Originally posted by @silviogutierrez in #1050
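Absent a built-in option, one common workaround is to defer a task on every edit and have the task no-op unless the resource has been quiet for the whole window. A minimal in-memory sketch of that guard logic (all names are hypothetical; in a real app the timestamps would live in the database and the tasks would be deferred via the task queue):

```python
import time

DEBOUNCE = 15 * 60  # seconds of quiet required before sending the email

last_edit_at: dict[int, float] = {}  # resource id -> timestamp of last edit


def record_edit(resource_id: int, now: float) -> None:
    """Called on every edit; a task would also be deferred for now + DEBOUNCE."""
    last_edit_at[resource_id] = now


def should_send(resource_id: int, now: float) -> bool:
    """Task-body guard: only the task deferred by the *last* edit passes."""
    last = last_edit_at.get(resource_id)
    return last is not None and now - last >= DEBOUNCE


# Three rapid edits, each conceptually deferring a task 15 minutes out:
record_edit(1, now=0.0)
record_edit(1, now=5.0)
record_edit(1, now=9.0)

# The tasks from the first two edits fire too early relative to the last edit:
print(should_send(1, now=0.0 + DEBOUNCE))  # False
print(should_send(1, now=5.0 + DEBOUNCE))  # False
# Only the task from the final edit sees 15 quiet minutes:
print(should_send(1, now=9.0 + DEBOUNCE))  # True
```

The cost of this pattern is that every edit enqueues a task, most of which do nothing; the discussion below is about avoiding that by replacing the pending job instead.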
I'm not saying a decision has been made to go this way, but I'd like a ticket to be open so we can think about the idea of adding an option at defer time that says which job we want to keep in case of conflict.
I wonder if it would make sense to use ON CONFLICT DO UPDATE.
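For illustration, an upsert keyed on the queueing lock would let a later defer overwrite the pending job's payload and schedule in a single statement. A toy sketch using SQLite (the table and column names are made up for the example, not procrastinate's actual schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE jobs (
        queueing_lock TEXT PRIMARY KEY,  -- at most one pending job per lock
        args TEXT,
        scheduled_at INTEGER
    )"""
)


def defer(lock: str, args: str, scheduled_at: int) -> None:
    # ON CONFLICT DO UPDATE: a later defer replaces the pending job in place
    conn.execute(
        """INSERT INTO jobs (queueing_lock, args, scheduled_at)
           VALUES (?, ?, ?)
           ON CONFLICT (queueing_lock)
           DO UPDATE SET args = excluded.args,
                         scheduled_at = excluded.scheduled_at""",
        (lock, args, scheduled_at),
    )


defer("send-email:42", '{"edit": 1}', 100)
defer("send-email:42", '{"edit": 2}', 160)  # pushes the pending job back

row = conn.execute("SELECT args, scheduled_at FROM jobs").fetchone()
print(row)  # ('{"edit": 2}', 160)
```

The same `ON CONFLICT ... DO UPDATE` syntax exists in PostgreSQL, which procrastinate runs on; the open questions in the thread below are about what "conflict" should mean once jobs can be in a running state.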
I've looked around other background task libraries and this is surprisingly absent. I would have thought it's a more common requirement to only do "latest of".
It's got edge cases that need to be worked out I think. Ex: What would you expect to happen when this runs but a matching job is already in a running state, but was started longer ago than the debounce window?
Good point, though I think just keep the simplest behavior at that point: it's already running, just queue a new one.
So does the new job get the minimum debounce duration, since the running job started longer ago than the window, or the maximum, since it's still running? And if the task runtime exceeds the debounce window, is the queue expected to delay the next job relative to the start time or to the eventual task end time?
Needs a state diagram to work out all the cases
My take would be:
- queueing locks only apply to `todo` jobs; once they start running, it's out of scope
- when a job is deferred and shares a queuing lock with an existing job, instead of refusing the defer, we delete the older job, whatever its attributes were (even if it had a different task or different params)
- a job in the `todo` state might have a `scheduled_at` param, or just be waiting for a slot on a worker; it's treated the same either way. By deleting the old job and deferring the new one, which might come with its own `scheduled_at`, we delay the job execution until nothing has been deferred for some time.
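The delete-the-older-job rule above can be modeled in a few lines. This is a sketch of the proposed semantics only, not procrastinate code, and all names are made up:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    task: str
    queueing_lock: Optional[str]
    scheduled_at: Optional[int] = None
    status: str = "todo"


queue: list[Job] = []


def defer(job: Job) -> None:
    """Proposed rule: on a queueing-lock conflict with a `todo` job,
    delete the older job instead of refusing the defer. Running jobs
    are out of scope and are left alone."""
    if job.queueing_lock is not None:
        queue[:] = [
            j for j in queue
            if not (j.status == "todo" and j.queueing_lock == job.queueing_lock)
        ]
    queue.append(job)


defer(Job("email", "lock-a", scheduled_at=100))
running = Job("email", "lock-a", status="doing")
queue.append(running)  # simulate a job already picked up by a worker
defer(Job("email", "lock-a", scheduled_at=200))  # replaces the todo job only

print([(j.status, j.scheduled_at) for j in queue])
# [('doing', None), ('todo', 200)]
```

Note how the running job is untouched while the superseded `todo` job is dropped, matching the "it's already running, just queue a new one" behavior suggested above.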
Of course, this could all be handled by user code with a periodic task: when there's a change we want to record, write a row in a table with the date. Every minute, look at all rows in that table, grouped by the corresponding item, where the latest row is > 15 min old; craft and send the email, then delete the corresponding rows (as atomically as possible, ensuring that the rows that get deleted are the rows that were read, not ones that might have been created in the meantime).
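That periodic-task approach could be sketched like this, with SQLite standing in for the real database (the table name, the threshold, and the function names are my own for the example):

```python
import sqlite3

QUIET = 15 * 60  # only notify after 15 minutes without changes

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE changes (item_id INTEGER, changed_at INTEGER)")


def record_change(item_id: int, at: int) -> None:
    conn.execute("INSERT INTO changes VALUES (?, ?)", (item_id, at))


def periodic_task(now: int) -> list[int]:
    """Runs every minute: for items whose latest change is old enough,
    send the batch email, then delete exactly the rows that were read."""
    ready = conn.execute(
        """SELECT item_id, MAX(changed_at) FROM changes
           GROUP BY item_id HAVING ? - MAX(changed_at) >= ?""",
        (now, QUIET),
    ).fetchall()
    notified = []
    for item_id, latest in ready:
        # send_email(item_id, ...) would go here, built from the read rows
        # delete only rows up to the timestamp we read, not newer ones:
        conn.execute(
            "DELETE FROM changes WHERE item_id = ? AND changed_at <= ?",
            (item_id, latest),
        )
        notified.append(item_id)
    return notified


record_change(1, at=0)
record_change(1, at=30)
record_change(2, at=850)

print(periodic_task(now=930))   # item 2 changed too recently -> [1]
print(periodic_task(now=1800))  # now item 2 has been quiet -> [2]
```

Deleting only rows `<= latest` is what makes the cleanup safe against changes recorded between the read and the delete, which is the atomicity concern raised above.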
That is how I would approach it. It is simple and even allows sending batch notifications.
Yes, I think the whole point of debouncing was to allow batches. [EDIT]: ah, I guess you mean "by using this system, you know by design what needs to be included in the batch, whereas with the simple debouncing system, you would only know that something changed, but you wouldn't know what has changed unless you store it elsewhere". Yes, that's true too.
Another nice property of this system is that if there's a bug where emails aren't sent (and as long as rows aren't being deleted, which holds if the read and delete are atomic, as we said), then once you fix it, it will self-repair. Whereas with debounced tasks, you'd need to somehow manually retry the tasks.