Use queuing lock as debouncer

Open ewjoachim opened this issue 5 months ago • 7 comments

I have a CRUD app with very frequent updates to a resource (within seconds). And I want to send an email 15 minutes after the last edit.

Am I correct in thinking I need to manually update the database and "push it back"? Or is there a built-in way to do this?

Thanks for an amazing library (and typed!).

Originally posted by @silviogutierrez in #1050

I'm not saying a decision has been made to go this way, but I'd like a ticket to be open so we can think about adding an option at defer time that says which job we want to keep in case of conflict.

I wonder if it would make sense to use ON CONFLICT DO UPDATE.
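To make the ON CONFLICT DO UPDATE idea concrete, here is a minimal sketch using SQLite from the Python standard library rather than PostgreSQL. The `jobs` table, its columns, the partial unique index, and the `defer_latest` helper are all assumptions made up for illustration; this is not procrastinate's actual schema or API.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical jobs table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE jobs (
            id INTEGER PRIMARY KEY,
            task TEXT NOT NULL,
            queueing_lock TEXT,
            status TEXT NOT NULL DEFAULT 'todo',
            scheduled_at TEXT
        );
        -- At most one 'todo' job per queueing lock (assumed constraint).
        CREATE UNIQUE INDEX jobs_queueing_lock_todo
            ON jobs (queueing_lock) WHERE status = 'todo';
        """
    )
    return conn


def defer_latest(conn, task, queueing_lock, scheduled_at):
    """Insert a job; on a queueing-lock conflict, overwrite the waiting job."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            """
            INSERT INTO jobs (task, queueing_lock, scheduled_at)
            VALUES (?, ?, ?)
            ON CONFLICT (queueing_lock) WHERE status = 'todo'
            DO UPDATE SET task = excluded.task,
                          scheduled_at = excluded.scheduled_at
            """,
            (task, queueing_lock, scheduled_at),
        )
```

Because the unique index only covers rows in the `todo` state, a job that has already started no longer conflicts, so a later defer simply inserts a new row.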

ewjoachim avatar Jun 13 '25 11:06 ewjoachim

I've looked around other background task libraries and this is surprisingly absent. I would have thought it's a more common requirement to only do "latest of".

silviogutierrez avatar Jun 13 '25 18:06 silviogutierrez

> I've looked around other background task libraries and this is surprisingly absent. I would have thought it's a more common requirement to only do "latest of".

It's got edge cases that need to be worked out I think. Ex: What would you expect to happen when this runs but a matching job is already in a running state, but was started longer ago than the debounce window?

TkTech avatar Jun 13 '25 18:06 TkTech

Good point, though I think we should just keep the simplest behavior at that point: it's already running, so just queue a new one.

silviogutierrez avatar Jun 13 '25 18:06 silviogutierrez

So does the new job get the minimum debounce duration, since the running one started longer ago than the window, or the maximum, since it's still running? If the task runtime exceeds the debounce window, is the queue expected to delay the next one relative to the start time or to the eventual task end time?

Needs a state diagram to work out all the cases

TkTech avatar Jun 13 '25 20:06 TkTech

My take would be:

  • queueing locks only apply to todo jobs. Once a job starts running, it's out of scope
  • when a job is deferred and shares a queueing lock with an existing todo job, instead of refusing the defer, we delete the older job, whatever its attributes (even if it had a different task or different params)
  • the job in the todo state might have a scheduled_at param, or just be waiting for a slot on a worker; it's treated the same. But by deleting the old job and deferring the new one, which might come with its own scheduled_at, we keep delaying the job execution until nothing has been deferred for some time
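The three rules above can be sketched as a small in-memory model. `ToyQueue`, `Job`, `defer` and `fetch` are hypothetical names chosen for this illustration only; they are not procrastinate's API, and the sketch ignores concurrency entirely.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from itertools import count
from typing import Dict, Optional


@dataclass
class Job:
    id: int
    task: str
    queueing_lock: Optional[str]
    scheduled_at: Optional[datetime]
    status: str = "todo"


class ToyQueue:
    """Toy queue where a new defer replaces the waiting job with the same lock."""

    def __init__(self) -> None:
        self.jobs: Dict[int, Job] = {}
        self._ids = count(1)

    def defer(self, task, queueing_lock=None, scheduled_at=None) -> Job:
        if queueing_lock is not None:
            for job_id, job in list(self.jobs.items()):
                # Only 'todo' jobs are in scope; running jobs are left alone.
                if job.status == "todo" and job.queueing_lock == queueing_lock:
                    del self.jobs[job_id]
        job = Job(next(self._ids), task, queueing_lock, scheduled_at)
        self.jobs[job.id] = job
        return job

    def fetch(self, now: datetime) -> Optional[Job]:
        """Start the first 'todo' job whose scheduled_at has passed, if any."""
        for job in self.jobs.values():
            if job.status == "todo" and (
                job.scheduled_at is None or job.scheduled_at <= now
            ):
                job.status = "doing"
                return job
        return None
```

With a 15-minute `scheduled_at` on every defer, each new edit pushes the execution back, which is the debounce behavior asked for in the original question:

```python
queue = ToyQueue()
t0 = datetime(2025, 6, 13, 12, 0)
queue.defer("send_email", "resource-1", t0 + timedelta(minutes=15))
# A second edit one minute later replaces the first job.
queue.defer("send_email", "resource-1", t0 + timedelta(minutes=16))
```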

Of course, this could all be handled by user code with a periodic task: when there's a change we want to record, write a row in a table with the date. Every minute, look at all rows in that table, grouped by corresponding item, where the latest row is > 15 min ago, craft and send the email and delete the corresponding rows (as atomically as can be, ensure that the rows that get deleted are the rows that were read, not the ones that might have been created in the meantime)
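A minimal sketch of that user-code approach, again using SQLite from the standard library. The `changes` table, `record_change`, `flush_due` and the `send` callback are hypothetical names for illustration; in a real setup `flush_due` would be called from a periodic task every minute, and timestamps here are plain Unix seconds.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE changes ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " item_id INTEGER NOT NULL,"
        " changed_at REAL NOT NULL)"
    )
    return conn


def record_change(conn, item_id, now):
    """Called on every edit: just write a row with the date."""
    with conn:
        conn.execute(
            "INSERT INTO changes (item_id, changed_at) VALUES (?, ?)",
            (item_id, now),
        )


def flush_due(conn, now, send, quiet_seconds=15 * 60):
    """Notify for every item whose latest change is at least quiet_seconds old."""
    cutoff = now - quiet_seconds
    flushed = []
    with conn:  # the deletes are committed together, or rolled back on error
        due = conn.execute(
            "SELECT item_id, MAX(id), COUNT(*) FROM changes"
            " GROUP BY item_id HAVING MAX(changed_at) <= ?",
            (cutoff,),
        ).fetchall()
        for item_id, max_id, n_changes in due:
            send(item_id, n_changes)
            # Delete only the rows that were read (id <= max_id), not rows
            # that may have been inserted in the meantime.
            conn.execute(
                "DELETE FROM changes WHERE item_id = ? AND id <= ?",
                (item_id, max_id),
            )
            flushed.append(item_id)
    return flushed
```

If `send` raises, the `with conn:` block rolls back the pending deletes, so the rows stay in the table and are picked up again on a later run.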

ewjoachim avatar Jun 13 '25 23:06 ewjoachim

> Of course, this could all be handled by user code with a periodic task: when there's a change we want to record, write a row in a table with the date. Every minute, look at all rows in that table, grouped by corresponding item, where the latest row is > 15 min ago, craft and send the email and delete the corresponding rows (as atomically as can be, ensure that the rows that get deleted are the rows that were read, not the ones that might have been created in the meantime)

That is how I would approach it. It is simple and even allows sending batch notifications.

onlyann avatar Jun 14 '25 22:06 onlyann

> It is simple and even allows to send batch notifications.

Yes, I think the whole point of debouncing was to allow batches. [EDIT]: ah, I guess you mean "by using this system, you know by design what needs to be included in the batch, whereas with the simple debouncing system, you would only know that something changed, but you wouldn't know what has changed unless you store it elsewhere". Yes, that's true too.

Another nice property of this system is that if there's a bug where emails aren't sent (and as long as rows aren't deleted without the email going out, hence the atomicity mentioned above), then once you fix it, it will self-repair. Whereas with debounced tasks, you'll need to somehow manually retry the tasks.

ewjoachim avatar Jun 15 '25 20:06 ewjoachim