Hangfire.Storage.SQLite

Unable to update heartbeat - still happening in .NET 6.0

Open fnajera-rac-de opened this issue 11 months ago • 5 comments

My ASP.NET Core 6 app shows this error very often:

Unable to update heartbeat on the resource 'HangFire:xxx'. The resource is not locked or is locked by another owner.

I believe this has to do with #68, and may be gone if that error is fixed.

But regardless of #68, if SQLiteDistributedLock cannot update the heartbeat and fails with that message, what's the point of continuing to retry? I think the timer should be stopped in that case, or at a minimum the error log should be muted so that it doesn't show up indefinitely in the logs.
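To illustrate the behavior I'm suggesting, here is a minimal sketch of a heartbeat timer that logs once and then stops itself when the update fails. This is not the actual SQLiteDistributedLock code: `SelfStoppingHeartbeat`, `tryUpdateHeartbeat`, and `log` are hypothetical names, and the sketch assumes the update can report "lock no longer owned" by returning `false`.

```csharp
using System;
using System.Threading;

// Hypothetical illustration only; not the library's implementation.
public sealed class SelfStoppingHeartbeat : IDisposable
{
    private readonly Func<bool> _tryUpdateHeartbeat; // assumed to return false when the lock is no longer owned
    private readonly Action<string> _log;
    private readonly Timer _timer;
    private int _stopped; // 0 = running, 1 = stopped

    public SelfStoppingHeartbeat(Func<bool> tryUpdateHeartbeat, TimeSpan interval, Action<string> log)
    {
        _tryUpdateHeartbeat = tryUpdateHeartbeat;
        _log = log;
        // Create the timer disabled, then start it, so a tick never sees a null _timer.
        _timer = new Timer(_ => OnTick(), null, Timeout.Infinite, Timeout.Infinite);
        _timer.Change(interval, interval);
    }

    public bool IsStopped => Volatile.Read(ref _stopped) == 1;

    private void OnTick()
    {
        if (IsStopped) return;
        if (_tryUpdateHeartbeat()) return;

        // First failure: log a single message and disable further ticks.
        if (Interlocked.Exchange(ref _stopped, 1) == 0)
        {
            _log("Unable to update heartbeat; stopping the heartbeat timer.");
            try
            {
                _timer.Change(Timeout.Infinite, Timeout.Infinite);
            }
            catch (ObjectDisposedException)
            {
                // The timer was disposed concurrently; nothing left to stop.
            }
        }
    }

    public void Dispose()
    {
        Interlocked.Exchange(ref _stopped, 1);
        _timer.Dispose();
    }
}
```

With something like this, a lock that has lost ownership would produce one log entry instead of one per timer interval. Whether stopping is the right reaction for every failure (for example a transient "database is locked") is a separate question.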

fnajera-rac-de • Mar 03 '24 17:03

In my case, it happens when the app is published to a Linux distribution, but not on Windows.

TXRock • Mar 13 '24 08:03

From what I have read in the past, multi-threaded and multi-process access to the database works quite differently in LiteDB than in SQLite. Maybe that is the reason LiteDB might not have the same issues as SQLite with the distributed lock.

TXRock • Mar 13 '24 08:03

@TXRock are you using AcquireDistributedLock in async methods?

fnajera-rac-de • Mar 13 '24 08:03

> @TXRock are you using AcquireDistributedLock in async methods?

I am not using AcquireDistributedLock, but indeed my jobs are executing async methods.

But I'm not sure how this is related to the heartbeat check.

The log first shows:

(Hangfire.Storage.SQLite.SQLiteDistributedLock) Unable to update heartbeat on the resource 'HangFire:job:xxx:state-lock'. SQLite.SQLiteException: database is locked

and later on:

(Hangfire.Storage.SQLite.SQLiteDistributedLock) Unable to update heartbeat on the resource 'HangFire:job:xxx:state-lock'. The resource is not locked or is locked by another owner.

After that, it could not recover.

TXRock • Mar 13 '24 08:03

See #68 for an internal usage of ThreadLocal which seems incompatible with async.
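As a rough illustration of why ThreadLocal and async don't mix: a `ThreadLocal<T>` value belongs to one thread, while an `AsyncLocal<T>` value travels with the execution context. After an `await`, the continuation may resume on a different thread, so anything stored in a ThreadLocal can appear to be missing. The sketch below simulates "resuming on another thread" with a dedicated `Thread` to keep it deterministic. `LockOwnerDemo` and the "owner-1" value are made up for the example and are not taken from the library.

```csharp
#nullable enable
using System;
using System.Threading;

// Hypothetical demo; names are not from Hangfire.Storage.SQLite.
public static class LockOwnerDemo
{
    private static readonly ThreadLocal<string?> ThreadOwner = new ThreadLocal<string?>();
    private static readonly AsyncLocal<string?> AsyncOwner = new AsyncLocal<string?>();

    public static (string? ThreadLocalValue, string? AsyncLocalValue) ReadFromAnotherThread()
    {
        // Record the "lock owner" on the current thread in both containers.
        ThreadOwner.Value = "owner-1";
        AsyncOwner.Value = "owner-1";

        string? seenThreadLocal = null;
        string? seenAsyncLocal = null;

        // A different thread stands in for a continuation that resumed elsewhere.
        var other = new Thread(() =>
        {
            seenThreadLocal = ThreadOwner.Value; // per-thread storage: nothing was set on this thread
            seenAsyncLocal = AsyncOwner.Value;   // flows with the captured ExecutionContext
        });
        other.Start();
        other.Join();

        return (seenThreadLocal, seenAsyncLocal);
    }
}
```

Here the second thread sees `null` for the ThreadLocal value and "owner-1" for the AsyncLocal value. If lock ownership is tracked per thread, code running after an `await` could look like "another owner" in the same way.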

The "database is locked" error is probably a transaction failing and not being retried (I haven't investigated that one).
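If that guess is right, a retry around the failing operation might help. The following is only a generic sketch of retrying with a growing delay, not something taken from the library: `TransientRetry`, its parameters, and the idea of classifying the failure through an `isTransient` callback are all assumptions for illustration.

```csharp
using System;
using System.Threading;

// Hypothetical helper; not part of Hangfire.Storage.SQLite.
public static class TransientRetry
{
    // Runs the action and retries while isTransient(exception) is true,
    // up to maxAttempts attempts in total, doubling the delay each time.
    public static T Execute<T>(Func<T> action, Func<Exception, bool> isTransient,
                               int maxAttempts, TimeSpan initialDelay)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        var delay = initialDelay;
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return action();
            }
            catch (Exception ex) when (attempt < maxAttempts && isTransient(ex))
            {
                // Transient failure with attempts remaining: wait, then try again.
                Thread.Sleep(delay);
                delay += delay;
            }
            // Non-transient failures, or the last attempt, propagate to the caller.
        }
    }
}
```

A caller would decide what counts as transient, for example by checking whether the exception message contains "database is locked". Matching on the message text is an assumption made here for simplicity.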

But if you look at the code that produces the message "The resource is not locked or is locked by another owner", I think you'll find the situation described in the other ticket. I assume SQLiteDistributedLock is also used internally by the library, even if you have no explicit usages of it.

I'll see if I can find some time to add a unit test for this problem (at least for the scenario I found).

fnajera-rac-de • Mar 13 '24 09:03