Hangfire Clean up

Open bladefist opened this issue 4 years ago • 14 comments

Hello,

Our Hangfire SQL database has grown to 30GB. We cannot reset the DB because it holds a lot of scheduled jobs that we want to keep. Has anyone come up with scripts to purge old completed/failed jobs?

thanks.

bladefist avatar Jul 04 '20 15:07 bladefist

Try reclaiming white space / re-index table in online mode (or offline mode scheduled for faster performance).

houseofcat avatar Jul 20 '20 17:07 houseofcat

@houseofcat That's kicking the can down the road. In reality we have millions of finished jobs that we don't need to carry forward for the rest of eternity.

bladefist avatar Jul 20 '20 17:07 bladefist

The great purge!

houseofcat avatar Jul 21 '20 01:07 houseofcat

All background jobs in the Succeeded and Deleted states expire automatically: regular background jobs expire after 24 hours and batched jobs after 7 days, and both settings are configurable. A counter on the Dashboard UI, such as Succeeded: 24,049,482, is just a counter and does not mean that all of those succeeded jobs are still in storage. Hangfire was built to avoid such storage leaks.

Such bloat can be caused by heavily fragmented indexes, so you'll need to run index reorganization/rebuild with scripts like this. Another problem can be long-running jobs that prevent the transaction log from being truncated, so you will need to use sliding invisibility timeout fetching in the following way:

    GlobalConfiguration.Configuration
        .UseSqlServerStorage("connection_string", new SqlServerStorageOptions
        {
            // Use sliding invisibility timeout fetching for queued jobs
            SlidingInvisibilityTimeout = TimeSpan.FromMinutes(5)
        });

In this case a job will not be re-queued immediately after an unexpected process shutdown (such as a process kill via Task Manager or stopping a debug session in VS), but long-running jobs will be processed more robustly.

odinserj avatar Jul 21 '20 08:07 odinserj

regular background jobs expire after 24 hours

Can this 24 hours be modified through configuration?

bxjg1987 avatar Jul 23 '21 06:07 bxjg1987

@bxjg1987 JobStorage has a JobExpirationTimeout property that is used as the default when marking jobs for expiration. You can also override that default value in an IApplyStateFilter on a per-job basis.
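
A minimal sketch of the per-job override, assuming a custom filter attribute. The attribute name `JobExpirationTimeoutAttribute` and the 6-hour value are hypothetical and chosen only for illustration; `JobFilterAttribute`, `IApplyStateFilter` and `ApplyStateContext.JobExpirationTimeout` are the Hangfire types this relies on.

```csharp
using System;
using Hangfire.Common;
using Hangfire.States;
using Hangfire.Storage;

// Hypothetical attribute name; not part of Hangfire itself.
public sealed class JobExpirationTimeoutAttribute : JobFilterAttribute, IApplyStateFilter
{
    private readonly TimeSpan _timeout;

    public JobExpirationTimeoutAttribute(int hours)
    {
        _timeout = TimeSpan.FromHours(hours);
    }

    // Runs when a new state is applied to the job; replaces the
    // storage-wide default expiration with the per-job value.
    public void OnStateApplied(ApplyStateContext context, IWriteOnlyTransaction transaction)
    {
        context.JobExpirationTimeout = _timeout;
    }

    public void OnStateUnapplied(ApplyStateContext context, IWriteOnlyTransaction transaction)
    {
        // Nothing to undo in this sketch.
    }
}
```

Under these assumptions, decorating a job method with `[JobExpirationTimeout(6)]` should apply the shorter timeout to that job only, while `GlobalJobFilters.Filters.Add(new JobExpirationTimeoutAttribute(6))` would apply it to every job.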

pieceofsummer avatar Jul 26 '21 12:07 pieceofsummer

Is there any way to also clean up the counter table? It contains thousands of records, which exceeds the row limit on Heroku.

I only have this config, which successfully removes data from the other tables but doesn't work on the counter table.

    services.AddHangfire(config =>
    {
        config.UsePostgreSqlStorage(connectionString, new PostgreSqlStorageOptions
        {
            JobExpirationCheckInterval = TimeSpan.FromMinutes(15),
        }).WithJobExpirationTimeout(TimeSpan.FromHours(1));
    });

tenhobi avatar Oct 20 '21 12:10 tenhobi

We have been running Hangfire for many years. The dashboard page no longer opens and returns a 504 status code. I checked the number of rows in the Hangfire tables:

Counter - 234 million
Job - 63 million
Jobparameter - 126 million
State - 189 million

Hash - 45
Jobqueue - 0 
List - 0
Lock - 1
Schema - 1
Server - 3
Set - 5

I see the tables contain succeeded job data. Is it safe to delete it from the database directly? I can see successful jobs which are many months old. I guess they should have been deleted, but they are still there.

adnan-kamili avatar Mar 04 '22 12:03 adnan-kamili

Please show me your Hangfire-related configuration code and tell me what version you are using, so I can understand why jobs aren't deleted. Also, I see that the counter aggregator component isn't working, so we need to configure logging to understand what's going on: https://docs.hangfire.io/en/latest/configuration/configuring-logging.html

odinserj avatar Mar 04 '22 12:03 odinserj

Here is the hangfire version (using ASP.NET 5):

    <PackageReference Include="Hangfire.AspNetCore" Version="1.7.27" />
    <PackageReference Include="Hangfire.PostgreSql.ahydrax" Version="1.7.4" />

    services.AddHangfire(config => config.UsePostgreSqlStorage($"{Configuration.Get<AppOptions>().Database.Url};Search Path=hangfire"));
    var hangfireOptions = new BackgroundJobServerOptions();
    app.UseHangfireServer(hangfireOptions);

Mostly the jobs are enqueued like this:

    BackgroundJob.Enqueue<WebhookService>(j => j.InvokeWebhookJob(eventType, data));

The above function looks like this:

    [AutomaticRetry(Attempts = 0, OnAttemptsExceeded = AttemptsExceededAction.Delete)]
    public async Task InvokeWebhookJob(string eventType, string data)
    {
        ...
    }

adnan-kamili avatar Mar 04 '22 13:03 adnan-kamili

    select count(*) from hangfire.counter where expireat is not null and expireat < now()

returns

53930

adnan-kamili avatar Mar 04 '22 13:03 adnan-kamili

Thank you for the information. Please note that this is a storage-related issue. Since you are using Hangfire.PostgreSql.ahydrax (and not Hangfire.SqlServer), please raise the issue at their repository – https://github.com/ahydrax/Hangfire.PostgreSql.

odinserj avatar Mar 04 '22 13:03 odinserj

I checked the Aurora Postgres logs, and I do see these queries regularly:

Screenshot 2022-03-04 at 7 01 00 PM

adnan-kamili avatar Mar 04 '22 13:03 adnan-kamili

ok thanks

adnan-kamili avatar Mar 04 '22 13:03 adnan-kamili

regular background jobs expire after 24 hours

Can this 24 hours be modified through configuration?

@odinserj With v1.8, how do we change this from 24 hours to 6 hours?

johnwc avatar Aug 11 '23 06:08 johnwc