Hangfire.PostgreSql
Could not place a lock on the resource: Lock timeout
It seems that you already fixed this issue in version 1.4.0, but we're still getting a lot of errors like this every day. We're not sure what is causing it; could it be connected with the Npgsql package, or maybe something else?
Hangfire.PostgreSql.PostgreSqlDistributedLockException: Could not place a lock on the resource 'HangFire:recurring-jobs:lock': Lock timeout.
at Hangfire.PostgreSql.PostgreSqlDistributedLock.PostgreSqlDistributedLock_Init_Transaction(String resource, TimeSpan timeout, IDbConnection connection, PostgreSqlStorageOptions options)
at Hangfire.PostgreSql.PostgreSqlDistributedLock..ctor(String resource, TimeSpan timeout, IDbConnection connection, PostgreSqlStorageOptions options)
at Hangfire.PostgreSql.PostgreSqlConnection.AcquireDistributedLock(String resource, TimeSpan timeout)
at Hangfire.Server.RecurringJobScheduler.UseConnectionDistributedLock(JobStorage storage, Func`2 action)
at Hangfire.Server.RecurringJobScheduler.EnqueueNextRecurringJobs(BackgroundProcessContext context)
at Hangfire.Server.RecurringJobScheduler.Execute(BackgroundProcessContext context)
at Hangfire.Server.BackgroundProcessDispatcherBuilder.ExecuteProcess(Guid executionId, Object state)
at Hangfire.Processing.BackgroundExecution.Run(Action`2 callback, Object state)
We have the latest version of the Hangfire.PostgreSql package (1.6.0) and Npgsql.EntityFrameworkCore.PostgreSQL version 2.1.2.
Is there any known compatibility problem between those two?
If you need any more specific information I'll try to add it.
Hello,
To avoid pushing the same recurring job more than once, a lock is created with a timeout. When another lock with the same resource name already exists, the timeout expires and you receive this error. Please check the source code for more information: RecurringJobScheduler.cs
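Conceptually, the acquisition works roughly like this (a sketch only, assuming the default "hangfire" schema; exact table and column names vary between versions of this adapter):

```sql
-- Sketch: the scheduler repeatedly tries to insert a row for the resource.
-- If a row for that resource already exists, it retries until the configured
-- timeout expires and then throws PostgreSqlDistributedLockException.
INSERT INTO hangfire.lock (resource, acquired)
SELECT 'HangFire:recurring-jobs:lock', now() AT TIME ZONE 'utc'
WHERE NOT EXISTS (
    SELECT 1 FROM hangfire.lock WHERE resource = 'HangFire:recurring-jobs:lock'
);
```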
I think it is because PostgreSqlDistributedLockException should derive from DistributedLockTimeoutException so that lock timeout exceptions are handled.
@urossavelj Locks usually remain in the database when the application is not gracefully shut down. If you are debugging locally, make sure to close the application from the CMD window if it is a console app, or from the browser if your server is a web application.
So this is expected behaviour and nothing to worry about?
We're also having issue #111; will that be fixed soon?
Encountering the same issue, please help
Emptying the lock table in the Hangfire database kinda "fixes" the issue.
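For reference, "emptying the lock table" amounts to something like the following (assuming the default "hangfire" schema; run it only while no Hangfire server is running, otherwise you may delete a lock that is legitimately held):

```sql
-- Inspect the current locks first, then clear them all.
SELECT * FROM hangfire.lock;
DELETE FROM hangfire.lock;
```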
The problem is that 'HangFire:recurring-jobs:lock' blocks the addition of jobs to the queue. It's not about any particular job; it's about RecurringJobScheduler itself. Here is a LINQPad example: after a couple of minutes I have this lock and jobs are not queued anymore until the default lock timeout of 10 minutes has passed. hangfire test.zip
It looks like, in my case, the lock is not released because of a timeout, and I do not know what to do about it:
8 02:05:10.5794 | Hangfire.Server.RecurringJobScheduler | 35 | TRACE | Recurring job 'CreateSites' is being updated. RecurringJob: (Queue:default;Cron:* * * * *;TimeZoneId:UTC;Job:{"Type":"***","Method":"CreateSites","ParameterTypes":"[]","Arguments":"[]"};CreatedAt:2019-09-13T18:50:38.6914661Z;V:2;Running:yes;LastJobId:833806;LastExecution:2020-03-27T23:04:09.8281272Z;NextExecution:2020-03-27T23:05:00.0000000Z), Changes: (LastExecution:2020-03-27T23:05:10.4896455Z;NextExecution:2020-03-27T23:06:00.0000000Z;LastJobId:833810), NextExecution: (27.03.2020 23:06:00)
2020-03-28 02:05:10.6947 | Hangfire.Server.RecurringJobScheduler | 35 | WARN | Recurring job 'CreateSites' can not be scheduled due to an exception.
System.Transactions.TransactionAbortedException: The transaction has aborted. ---> System.TimeoutException: Transaction Timeout
--- End of inner exception stack trace ---
at System.Transactions.TransactionStateAborted.BeginCommit(InternalTransaction tx, Boolean asyncCommit, AsyncCallback asyncCallback, Object asyncState)
at System.Transactions.CommittableTransaction.Commit()
at System.Transactions.TransactionScope.InternalDispose()
at System.Transactions.TransactionScope.Dispose()
at Hangfire.PostgreSql.PostgreSqlWriteOnlyTransaction.Commit()
at Hangfire.Server.RecurringJobScheduler.EnqueueBackgroundJob(BackgroundProcessContext context, IStorageConnection connection, String recurringJobId, DateTime now)
at Hangfire.Server.RecurringJobScheduler.TryEnqueueBackgroundJob(BackgroundProcessContext context, IStorageConnection connection, String recurringJobId, DateTime now)
@davidroth Can you tell me why TransactionScope usage was hardcoded into PostgreSqlWriteOnlyTransaction, even though the Entity Framework docs (as I understand them) advise using native transactions? https://docs.microsoft.com/en-us/ef/ef6/saving/transactions says: "It will throw exceptions if you issue any DDL and have not enabled distributed transactions through the MSDTC Service. We recommend using the approach outlined in the previous sections instead where possible." And is it OK that the default timeout used there, TransactionSynchronisationTimeout, is so small, 500ms? Is there no way now to avoid this when I don't want to use it? The main reason is that I want to avoid MSDTC usage. Maybe it's better, by default, when EnableTransactionScopeEnlistment is false, to use the old (1.6.3) native Npgsql transaction logic?
@xklonx If EnableTransactionScopeEnlistment is set to false (which is the default for backwards compat reasons), the transaction scope is created using TransactionScopeOption.RequiresNew.
This means that a new transaction is used and therefore there is no enlisting in an existing ambient transaction (if any).
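A conceptual sketch of what that means (this is not the actual library code; the `options` object and the chosen isolation level are assumptions for illustration):

```csharp
using System;
using System.Transactions;

// Sketch: how the scope option described above could be chosen.
// With EnableTransactionScopeEnlistment = false, RequiresNew means the
// write-only transaction never joins an ambient (possibly distributed) one.
TransactionScopeOption scopeOption = options.EnableTransactionScopeEnlistment
    ? TransactionScopeOption.Required      // join the ambient transaction if present
    : TransactionScopeOption.RequiresNew;  // always start a fresh, independent one

using (var scope = new TransactionScope(scopeOption, new TransactionOptions
{
    IsolationLevel = IsolationLevel.ReadCommitted,
    Timeout = options.TransactionSynchronisationTimeout,
}))
{
    // ... execute the queued storage commands on the connection ...
    scope.Complete();
}
```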
https://docs.microsoft.com/en-us/ef/ef6/saving/transactions It will throw exceptions if you issue any DDL and have not enabled distributed transactions through the MSDTC Service.
I don't know what the implementation of PostgreSqlWriteOnlyTransaction has to do with EF6.
And is it OK that the default timeout used there, TransactionSynchronisationTimeout, is so small, 500ms? Is there no way now to avoid this when I don't want to use it? The main reason is that I want to avoid MSDTC usage.
The timeout is configurable via TransactionSynchronisationTimeout. The timeout already existed before the TransactionScope was added. See https://github.com/frankhommers/Hangfire.PostgreSql/blob/master/src/Hangfire.PostgreSql/PostgreSqlStorageOptions.cs#L38
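For anyone hitting the timeout, the option can be raised when configuring the storage; a sketch (the exact overloads differ between versions, and the one-minute value is just the figure discussed later in this thread):

```csharp
// Sketch: raising the transaction timeout via PostgreSqlStorageOptions.
GlobalConfiguration.Configuration.UsePostgreSqlStorage(
    connectionString,
    new PostgreSqlStorageOptions
    {
        TransactionSynchronisationTimeout = TimeSpan.FromMinutes(1),
    });
```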
I did not come up with the 500ms.
The main reason is that I want to avoid MSDTC usage.
Avoiding MSDTC is great. Yet I don't know what your issue is with the current implementation and how it affects your app. Please show me how the current implementation of PostgreSqlWriteOnlyTransaction triggers MSDTC escalation in your application.
Maybe it's better, by default, when EnableTransactionScopeEnlistment is false, to use the old (1.6.3) native Npgsql transaction logic?
IMO no, because it would add additional code complexity without any benefit.
This means that a new System.Transactions transaction (not a Postgres transaction) is used even if I don't need it, and it uses the new timeout of 500ms, which gives me timeout errors (https://github.com/frankhommers/Hangfire.PostgreSql/issues/119#issuecomment-605337122) when RecurringJobScheduler tries to update a job. Because of that, the HangFire:recurring-jobs:lock lock remains and doesn't give RecurringJobScheduler a chance to schedule new recurring jobs (example: https://github.com/frankhommers/Hangfire.PostgreSql/issues/119#issuecomment-605337122). Before that change, TransactionSynchronisationTimeout was used only in another class:
if (executionTimer.Elapsed > _options.TransactionSynchronisationTimeout)
    throw new TimeoutException("SetRangeInHash experienced timeout while trying to execute transaction");
which was executed only after the first try, no matter how long that took.
Now the new System.Transactions transaction always exists after using (var transaction = new TransactionScope(scopeOption, transactionOptions)); you can check it in LINQPad:
// requires: using System.Transactions;
var transactionOptions = new TransactionOptions()
{
    IsolationLevel = System.Transactions.IsolationLevel.RepeatableRead,
    Timeout = TimeSpan.FromMilliseconds(500)
};
using (var transaction = new TransactionScope(TransactionScopeOption.RequiresNew, transactionOptions))
{
    Transaction.Current.Dump(); // LINQPad's Dump() shows the ambient transaction
}
The connection is always enlisted by the call _connection.EnlistTransaction(Transaction.Current). The code of EnlistTransaction (from Npgsql):
public override void EnlistTransaction(Transaction transaction)
{
if (EnlistedTransaction != null)
{
if (EnlistedTransaction.Equals(transaction))
return;
try
{
if (EnlistedTransaction.TransactionInformation.Status == System.Transactions.TransactionStatus.Active)
throw new InvalidOperationException($"Already enlisted to transaction (localid={EnlistedTransaction.TransactionInformation.LocalIdentifier})");
}
catch (ObjectDisposedException)
{
// The MSDTC 2nd phase is asynchronous, so we may end up checking the TransactionInformation on
// a disposed transaction. To be extra safe we catch that, and understand that the transaction
// has ended - no problem for reenlisting.
}
}
var connector = CheckReadyAndGetConnector();
EnlistedTransaction = transaction;
if (transaction == null)
return;
// Until #1378 is implemented, we have no recovery, and so no need to enlist as a durable resource manager
// (or as promotable single phase).
// Note that even when #1378 is implemented in some way, we should check for mono and go volatile in any case -
// distributed transactions aren't supported.
transaction.EnlistVolatile(new VolatileResourceManager(this, transaction), EnlistmentOptions.None);
Log.Debug($"Enlisted volatile resource manager (localid={transaction.TransactionInformation.LocalIdentifier})", connector.Id);
}
PostgreSqlWriteOnlyTransaction has nothing to do with EF, but by default it should be based on Npgsql transactions, like it was before; now it is not. You can check https://stackoverflow.com/questions/1690892/transactionscope-automatically-escalating-to-msdtc-on-some-machines for examples of when TransactionScope escalates to DTC. One simple trigger is just moving your database to another server. I don't understand why all the people who don't need TransactionScope must deal with this. It would be a real benefit for users not to have to deal with those timeout exceptions and DTC escalations. @frankhommers, can you tell us what you think about that?
@xklonx
This means that a new System.Transactions transaction (not a Postgres transaction) is used even if I don't need it, and it uses the new timeout of 500ms, which gives me timeout errors (#119 (comment)) when RecurringJobScheduler tries to update a job. Because of that, the HangFire:recurring-jobs:lock lock remains and doesn't give RecurringJobScheduler a chance to schedule new recurring jobs (example: #119 (comment)). Before that change, TransactionSynchronisationTimeout was used only in another class.
Well then we have a timeout issue. This has nothing to do with MSDTC. We could increase the timeout here so that it matches the previous timeout (which was the default NpgsqlConnection timeout).
PostgreSqlWriteOnlyTransaction has nothing to do with EF, but by default it should be based on Npgsql transactions, like it was before; now it is not.
TransactionScope is only a facade. During enlistment an NpgsqlTransaction is used underneath.
You can check https://stackoverflow.com/questions/1690892/transactionscope-automatically-escalating-to-msdtc-on-some-machines for examples of when TransactionScope escalates to DTC.
I still don't understand why you are talking about MSDTC. The post behind the link is all about SqlConnection/SQL Server/Windows. You can hit MSDTC escalations when using SQL Server on Windows and spanning a transaction scope over multiple active connections. Since this issue is about the Postgres adapter, MSDTC is not a thing here. Sure, distributed transaction escalation exists in this world too, but here it is called "prepared transactions" (https://www.postgresql.org/docs/8.4/sql-prepare-transaction.html).
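For illustration, PostgreSQL's two-phase commit looks like this (a sketch with a hypothetical accounts table; note that prepared transactions are disabled by default because max_prepared_transactions is 0):

```sql
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
-- Instead of COMMIT, hand the transaction to an external coordinator:
PREPARE TRANSACTION 'xfer-1';
-- Later, the coordinator resolves it with one of:
COMMIT PREPARED 'xfer-1';
-- or: ROLLBACK PREPARED 'xfer-1';
```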
You still have not demonstrated that a prepared transaction escalation occurs with the current code.
It would be a real benefit for users not to have to deal with those timeout exceptions and DTC escalations.
If the timeout is a problem we could increase or remove it. But this has nothing to do with MSDTC.
Now I see that VolatileResourceManager uses _localTx = connection.BeginTransaction(ConvertIsolationLevel(_transaction.IsolationLevel)), and Hangfire.SqlServer uses TransactionScope too. As Hangfire does all this work in separate threads, it would be hard to intervene and escalate to a distributed transaction. But I still don't understand the decision to use TransactionScope.
The timeout should definitely be increased to the one used in Hangfire.SqlServer: TransactionTimeout = TimeSpan.FromMinutes(1);
Yep, the timeout can be increased. That should help with your issue.
MR created: https://github.com/frankhommers/Hangfire.PostgreSql/pull/147
I've merged the PR. It's included in version 1.6.4.2. Does it help? Can we close the issue?
For me, yes. It would be interesting to know whether it helped anyone else.
Please note that this issue was created long before TransactionScope support was added.
So Hangfire.PostgreSql.PostgreSqlDistributedLockException isn't something that was introduced with TransactionScope support. It was just more likely to hit the exception with the new code because of the 500ms timeout (until MR #147).
I am using 1.6.4.2 and I face the same issue. Is there any way to remove the lock? It is preventing me from queueing new jobs.
You can remove it from the lock table. Check your logs for the error that caused the lock not to be released, because it will surely come back.
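Concretely, that would be something like the following (assuming the default "hangfire" schema; do it while the Hangfire servers are stopped):

```sql
-- Remove only the stuck lock row rather than wiping the whole table.
DELETE FROM hangfire.lock
WHERE resource = 'HangFire:recurring-jobs:lock';
```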
Hello! How do I remove all locks on startup? The server can't start while a lock exists. Hangfire.PostgreSql.PostgreSqlDistributedLockException: 'Could not place a lock on the resource 'HangFire:lock:recurring-job:Notifications': Lock timeout.'
It's above your question:
Hello! How do I remove all locks on startup? The server can't start while a lock exists. Hangfire.PostgreSql.PostgreSqlDistributedLockException: 'Could not place a lock on the resource 'HangFire:lock:recurring-job:Notifications': Lock timeout.'
This:
You can remove it from the lock table. Check your logs for the error that caused the lock not to be released, because it will surely come back.
It seems fixed now.
We have the latest versions of Hangfire (1.7.19) and Hangfire.PostgreSql (1.8.1) and still see the same issues. Does anybody have an idea how we can fix this? It's odd; it just suddenly started for us on several independent systems.
We have the latest versions of Hangfire (1.7.19) and Hangfire.PostgreSql (1.8.1) and still see the same issues. Does anybody have an idea how we can fix this? It's odd; it just suddenly started for us on several independent systems.
Same, not fixed on our side.
Downgrade Hangfire to version 1.7.12 and Hangfire.PostgreSql to version 1.7.1. It works with .NET 5!
I am running Hangfire.PostgreSql v1.8.2 and Hangfire.Core v1.7.22, and this is an issue when using TransactionScope. @frankhommers, would you mind re-opening this issue? This is a pretty big problem if we cannot enqueue work into Hangfire and update application state within a transaction.
@frankhommers I'm seeing this too, we should reopen this issue.
Same issue here.
Same issue
I get this on the latest release:
Hangfire.PostgreSql.PostgreSqlDistributedLock - HangFire:recurring-jobs:lock: Failed to lock with transaction 40001: could not serialize access due to concurrent update
Related to the recent changes / releases? @vytautask
Related to the recent changes / releases? @vytautask
I do not think so. At least not my changes :)
Let's continue the discussion in #191, as I have high hopes this could be solved via the already proposed mechanism (only the PR is missing 🙄).