Automatically remove failed/completed jobs after a period of time

Open Tin-Nguyen opened this issue 7 years ago • 13 comments

Hi all,

Does Bull have an option to remove failed/completed jobs from the queue after a period of time (1 day/week/month)? I think it's necessary because it would reduce the number of items in Redis storage.

Thanks in advance. Tin

Tin-Nguyen avatar Feb 05 '18 14:02 Tin-Nguyen

For what it's worth, we just run queue.clean(3600 * 1000, "completed") once a minute, which means we only ever have an hour's worth of completed tasks at any point in time.
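
A rough sketch of that workaround, assuming a Bull queue whose clean(graceMs, status) returns a promise. The helper names cleanOnce and startPeriodicClean and the default values are made up for illustration, not part of Bull:

```javascript
const ONE_HOUR_MS = 3600 * 1000;

// Ask the queue to remove jobs in each given status that finished more than
// graceMs ago. `queue` is assumed to expose Bull's clean(graceMs, status).
function cleanOnce(queue, graceMs, statuses) {
  return Promise.all(statuses.map((status) => queue.clean(graceMs, status)));
}

// Hypothetical helper: run cleanOnce on a fixed interval.
// Returns a function that stops the timer.
function startPeriodicClean(queue, options = {}) {
  const {
    graceMs = ONE_HOUR_MS,      // keep roughly one hour of finished jobs
    everyMs = 60 * 1000,        // sweep once a minute
    statuses = ["completed"],
  } = options;
  const timer = setInterval(() => {
    cleanOnce(queue, graceMs, statuses).catch((err) => {
      console.error("queue clean failed:", err);
    });
  }, everyMs);
  return () => clearInterval(timer);
}
```

Calling the returned function stops the sweep, for example during shutdown.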

mscharley avatar Feb 20 '18 10:02 mscharley

@mscharley thanks. But I'm wondering why there isn't a function that lets us set up a permanent process on the queue to remove jobs after a period of time. Could we consider adding this as an improvement?

Tin-Nguyen avatar Feb 20 '18 15:02 Tin-Nguyen

Sure, and I 👍 'd this issue too. Just letting people know of workarounds in the meantime.

mscharley avatar Feb 20 '18 20:02 mscharley

We store our job details elsewhere so this is just duplicated data being stored. This is a much wanted feature from me as well :)

aleccool213 avatar Dec 11 '18 16:12 aleccool213

@mscharley I wanted to clear jobs from failed state and I tried queue.clean(3600 * 1000, "failed") and it didn't work. What's the right way?

sriharshakappala avatar Apr 23 '19 10:04 sriharshakappala

Without this feature the amount of data in Redis grows without bound, unless one of the workarounds is implemented, of course.

It would be really great if it was possible to retain statistics (number of jobs processed), without having to keep the actual job details around.

axos88 avatar Apr 05 '20 13:04 axos88

There are removeOnComplete and removeOnFail options for each job that allow removing the job once it has completed or failed. These should do what's needed.

As for retaining statistics for jobs without keeping the jobs around, I think that's relatively easy to do in userland by tracking the 'completed' and 'failed' events on the queue. Just keep incrementing some value in Redis.
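
A minimal sketch of that idea, assuming the queue exposes on(event, handler) as Bull queues do and the Redis client exposes incr(key) as ioredis does. The helper name trackJobCounts and the key names are invented for illustration:

```javascript
// Sketch only: count finished jobs in Redis so the totals survive job removal.
// `queue` is assumed to be a Bull queue, `redis` a client with incr(key).
function trackJobCounts(queue, redis, prefix = "stats:jobs") {
  const bump = (name) => {
    // incr may return a promise; swallow errors so a Redis hiccup
    // cannot crash the worker from inside an event handler.
    Promise.resolve(redis.incr(`${prefix}:${name}`)).catch((err) => {
      console.error("stat update failed:", err);
    });
  };
  queue.on("completed", () => bump("completed"));
  queue.on("failed", () => bump("failed"));
}
```

Bull's reference describes these events as local by default, so this would only count jobs handled by the process that registers the listeners; it also documents global: prefixed variants for listening across processes.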

addisonElliott avatar Apr 05 '20 22:04 addisonElliott

The removeOnComplete and removeOnFail options would remove the information immediately. I'm looking for a solution that would keep it around for a set amount of time, so that jobs can be reviewed for, say, 7 days and then scrapped so they don't clog up the Redis storage.

The problem with implementing the statistics in userland is that there are already statistics provided by Bull, and that info would become incorrect when (old) jobs are deleted.

axos88 avatar Apr 15 '20 13:04 axos88

@axos88 it does not remove it immediately, please check the documentation: https://github.com/OptimalBits/bull/blob/develop/REFERENCE.md#queueadd

manast avatar Apr 15 '20 17:04 manast

@manast you can keep around "X" number of items, but you cannot set a time limit for them. If I want to keep them around for 1 week and use the workaround of setting removeOnComplete to, say, 5, and there is only one job per week of that type, I will end up keeping them around for 5 weeks instead. If there is one job per hour, I will end up keeping them around for only 5 hours. This workaround works if you know in advance how often the job will run and can make an accurate guess of how many occurrences to keep, as with a scheduled cron job, but it does not work for something triggered by your users' behaviour, such as sending a message or placing an order.

But you're right in that what I am gunning for here is actually to extend that syntax to allow boolean | number | Date.
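
To make the arithmetic behind the count-based workaround concrete: a finished job is evicted once N newer jobs have finished, so its lifetime depends on how often jobs arrive. A small sketch (approxRetentionMs is a made-up helper, not a Bull function):

```javascript
const MS_PER_HOUR = 3600 * 1000;
const MS_PER_DAY = 24 * MS_PER_HOUR;

// With "keep the last N jobs", a job survives roughly
// N * (time between jobs) before it is pushed out.
function approxRetentionMs(keepCount, msBetweenJobs) {
  return keepCount * msBetweenJobs;
}

// One job per week, keep 5: about 35 days (5 weeks).
console.log(approxRetentionMs(5, 7 * MS_PER_DAY) / MS_PER_DAY); // 35
// One job per hour, keep 5: about 5 hours.
console.log(approxRetentionMs(5, MS_PER_HOUR) / MS_PER_HOUR); // 5
```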

axos88 avatar Apr 16 '20 09:04 axos88

Yes, however Date is also a double-edged sword: if you retain based on date you may end up filling all of Redis's memory.

manast avatar Apr 16 '20 09:04 manast

@manast, well yes, of course, but that's a decision that needs to be weighed by the user/programmer, and the appropriate action needs to be taken. In my case I know that I can hold a week's worth of data in memory, but would run out if left unchecked for weeks or months.

And of course the two can be combined: hold a week's worth of data but at most 100 jobs, or the other way around, keep up to 100 jobs in Redis but delete them after a week even if there are fewer than 100, to clean up stale data.

axos88 avatar Apr 16 '20 12:04 axos88

For reference, this seems to be implemented, if I understand this issue correctly: https://github.com/OptimalBits/bull/blob/develop/REFERENCE.md#keepjobs-options, https://github.com/OptimalBits/bull/blob/4b8a386acedcec6da137f9bb810170c3a8eb11d2/lib/commands/moveToFinished-9.lua#L119-L132

{
    removeOnComplete: { age: 86400 * 5 }, // remove completed jobs after 5 days
}
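
Going by the keepJobs reference linked above, age (in seconds) can be combined with count, which would cover the "a week, but at most 100 jobs" case raised earlier in the thread. A minimal sketch under that assumption; retentionOptions is a hypothetical helper, and the exact eviction behaviour should be checked against the reference:

```javascript
const SECONDS_PER_DAY = 86400;

// Hypothetical helper: build job options that keep finished jobs for at most
// `days` days and at most `maxCount` jobs. Assumes the { age, count } shape
// from the keepJobs section of Bull's reference (age is in seconds).
function retentionOptions(days, maxCount) {
  const keep = { age: days * SECONDS_PER_DAY, count: maxCount };
  return {
    removeOnComplete: { ...keep },
    removeOnFail: { ...keep },
  };
}

// Usage (assumes an existing Bull queue named `queue`):
// queue.add({ to: "user@example.com" }, retentionOptions(7, 100));
```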

wodCZ avatar Jun 29 '23 14:06 wodCZ