nhibernate-core icon indicating copy to clipboard operation
nhibernate-core copied to clipboard

InvalidOperationException on SequencedHashMap.OrderedEnumerator.MoveNext

Open DeveTho opened this issue 2 years ago • 11 comments

Every now and then, we see the following exception in our logs:

System.InvalidOperationException: Enumerator was modified
   at NHibernate.Util.SequencedHashMap.OrderedEnumerator.MoveNext()
   at NHibernate.Engine.StatefulPersistenceContext.AfterTransactionCompletion()
   at NHibernate.Impl.SessionImpl.AfterTransactionCompletion(Boolean success, ITransaction tx)
   at NHibernate.Transaction.AdoNetWithSystemTransactionFactory.SystemTransactionContext.CompleteTransaction(Boolean isCommitted)

To be honest, we have no idea where it exactly comes from or when it occurs. It's quite rare and we haven't been able to reproduce it.

However, it seems to be very similar to #1216. Is it possible this issue wasn't completely fixed?

DeveTho avatar Jul 10 '23 15:07 DeveTho

However, it seems to be very similar to https://github.com/nhibernate/nhibernate-core/issues/1216

Full stacktrace can help to answer it. Can it be a transaction timeout error in your case too?

bahusoid avatar Jul 10 '23 19:07 bahusoid

Full stacktrace can help to answer it.

That actually is the full stacktrace.

Can it be a transaction timeout error in your case too?

Possibly. Every call with the session is done inside of a TransactionScope. The transaction will often be managed by MSDTC, as we also use it to send messages via MSMQ.

However, we haven't seen any errors mentioning a transaction timeout, just this error.

DeveTho avatar Jul 10 '23 20:07 DeveTho

We just hit this or something similar on nhibernate 5.2.4. It was related to a transaction timeout.

Exception: System.InvalidOperationException
 
Message: Enumerator was modified
 
StackTrace:
   at NHibernate.Util.SequencedHashMap.OrderedEnumerator.MoveNext()
   at NHibernate.Engine.StatefulPersistenceContext.AfterTransactionCompletion()
   at NHibernate.Impl.SessionImpl.AfterTransactionCompletion(Boolean success, ITransaction tx)
   at NHibernate.Transaction.AdoNetWithSystemTransactionFactory.SystemTransactionContext.CompleteTransaction(Boolean isCommitted)
   at NHibernate.Transaction.AdoNetWithSystemTransactionFactory.SystemTransactionContext.ProcessSecondPhase(Enlistment enlistment, Nullable`1 success)
   at System.Transactions.VolatileEnlistmentAborting.EnterState(InternalEnlistment enlistment)
   at System.Transactions.TransactionStateAborted.EnterState(InternalTransaction tx)
   at System.Transactions.EnlistableStates.Timeout(InternalTransaction tx)
   at System.Transactions.Bucket.TimeoutTransactions()
   at System.Transactions.BucketSet.TimeoutTransactions()
   at System.Transactions.TransactionTable.ThreadTimer(Object state)
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.TimerQueueTimer.CallCallback()
   at System.Threading.TimerQueueTimer.Fire()
   at System.Threading.TimerQueue.FireNextTimers()

kaksmet avatar Dec 14 '23 08:12 kaksmet

I have managed to create a small program that reliably triggers this bug for me when run against MS SQL Server 2022. https://github.com/kaksmet/NHibBugRepro It uses .NET 8 and NHibernate 5.4.7.

kaksmet avatar Dec 21 '23 14:12 kaksmet

So, by the look of your program, the exception occurs on some thread it has not explicitly spawned? Since the ones it spawns have a "catch all exceptions and ignore them".

This seems tied also to the use of TransactionScope. In the case of your program, it does not seem to use distributed scopes. Multi-threading troubles usually occurs with distributed scope. So, even if not distributed, a transaction scope may still trigger some processing on a new thread, at least in the case of a timeout?

This interface is more wicked than I thought then. Anyway, the thread locking mechanism added in #627 seems not enough to protect against this concurrency case. But maybe that is because the session is not properly disposed of by the code, as it is the case in the example program.

https://github.com/kaksmet/NHibBugRepro/blob/24e13a160a310dec52c6fc4a8c0b275de8230ce0/Program.cs#L42-L46

using(var txscope = new TransactionScope(TransactionScopeOption.Required, txOptions)) {
    var session = sessionFactory.OpenSession();
    var data = session.CreateCriteria<Entity>().List();
    txscope.Complete();
}

Does the trouble stay the same if this is changed as following?

using(var txscope = new TransactionScope(TransactionScopeOption.Required, txOptions))
using (var session = sessionFactory.OpenSession()) {
    var data = session.CreateCriteria<Entity>().List();
    txscope.Complete();
}

fredericDelaporte avatar Jan 21 '24 18:01 fredericDelaporte

We're also getting this error intermittently after upgrading from 4.1.2.4000 to 5.4.7, .Net Framework 4.8 (also occurs on the recently released 5.5.0). There's one particular performance-intensive process where it always happens after some random amount of time. Version 4.1.2.4000 was working fine. We're using NHibernate with MSMQ and distributed transactions.

It does not appear to be timeout-related at all. The thread that this exception occurs on is a new thread from the thread pool that has not been used by the application. We dispose the transaction scope, session and connection each time.

NHibernate.Transaction.ITransactionFactory - Failure at transaction completion

FaultExceptionRenderer
System.InvalidOperationException: Enumerator was modified
   at NHibernate.Util.SequencedHashMap.OrderedEnumerator.MoveNext()
   at NHibernate.Util.IdentityMap.get_EntryList()
   at NHibernate.Engine.StatefulPersistenceContext.Clear()
   at NHibernate.Impl.SessionImpl.AfterTransactionCompletion(Boolean success, ITransaction tx)
   at NHibernate.Transaction.AdoNetWithSystemTransactionFactory.SystemTransactionContext.CompleteTransaction(Boolean isCommitted)
   at NHibernate.Log4NetLogger.Error(Object message, Exception exception)
   at NHibernate.NHibernateLoggerThunk.Log(NHibernateLogLevel logLevel, NHibernateLogValues state, Exception exception)
   at NHibernate.Transaction.AdoNetWithSystemTransactionFactory.SystemTransactionContext.CompleteTransaction(Boolean isCommitted)
   at NHibernate.Transaction.AdoNetWithSystemTransactionFactory.SystemTransactionContext.ProcessSecondPhase(Enlistment enlistment, Nullable`1 success)
   at NHibernate.Transaction.AdoNetWithSystemTransactionFactory.SystemTransactionContext.System.Transactions.IEnlistmentNotification.Commit(Enlistment enlistment)
   at System.Transactions.VolatileEnlistmentCommitting.EnterState(InternalEnlistment enlistment)
   at System.Transactions.VolatileEnlistmentPrepared.InternalCommitted(InternalEnlistment enlistment)
   at System.Transactions.VolatileDemultiplexer.BroadcastCommitted(VolatileEnlistmentSet& volatiles)
   at System.Transactions.TransactionStatePromotedCommitted.EnterState(InternalTransaction tx)
   at System.Transactions.TransactionStatePromotedBase.ChangeStatePromotedCommitted(InternalTransaction tx)
   at System.Transactions.InternalTransaction.DistributedTransactionOutcome(InternalTransaction tx, TransactionStatus status)
   at System.Transactions.Oletx.RealOletxTransaction.FireOutcome(TransactionStatus statusArg)
   at System.Transactions.Oletx.OutcomeEnlistment.InvokeOutcomeFunction(TransactionStatus status)
   at System.Transactions.Oletx.OletxTransactionManager.ShimNotificationCallback(Object state, Boolean timeout)
   at System.Threading._ThreadPoolWaitOrTimerCallback.PerformWaitOrTimerCallback(Object state, Boolean timedOut)

DavidFidge avatar Jan 22 '24 04:01 DavidFidge

Does the trouble stay the same if this is changed as following?

Yes, the same System.InvalidOperationException: Enumerator was modified still happens if the NHibernate session is properly disposed of.

This seems tied also to the use of TransactionScope. In the case of your program, it does not seem to use distributed scopes.

Yes, it's tied to the use of TransactionScope and no distributed transactions are involved in my test program.

So, by the look of your program, the exception occurs on some thread it has not explicitly spawned?

That is correct. It seems like TransactionScope internally uses a System.Threading.Timer for its timeout, and the System.Threading.Timer docs says this

Provides a mechanism for executing a method on a thread pool thread at specified intervals.

It then appears to call System.Transactions.IEnlistmentNotification.Rollback on the same (thread pool) thread and that's the first NHibernate function in the stack trace.

kaksmet avatar Jan 22 '24 07:01 kaksmet

That is quite a challenge to work with an API (TransactionScope in this case) that causes non thread safe object instances to be potentially used by multiple threads, without providing any synchronization point to allow ensuring thread safety...

So, as I said, the TransactionScope API is actually wicked. That means it is very badly designed, and we have to use crutches to try compensate for its design.

https://github.com/dotnet/runtime/blob/435e9aa552a3f4fa7348b2a18e12fd5615791302/src/libraries/System.Transactions.Local/src/System/Transactions/TransactionScope.cs#L136-L141

                // BUGBUG: Scopes should not use individual timers
                _scopeTimer = new Timer(
                    TimerCallback,
                    this,
                    scopeTimeout,
                    TimeSpan.Zero);

I am not sure I will have enough time soon to dive into this case and try find another crutch to handle it. If anyone is willing to give it a try, have a look to what is currently done in the AdoNetWithSystemTransactionFactory, especially the Wait method and its usages.

fredericDelaporte avatar Jan 22 '24 21:01 fredericDelaporte

I've done some more testing with the demo project provided by @kaksmet . The exception doesn't seem to happen with 4.1.2.4000. But then you upgrade to NHibernate 5.0.0 and the exception occurs.

https://github.com/DavidFidge/NHibernateBugTesting1 (version 4.1.2.4000) https://github.com/DavidFidge/NHibernateBugTesting1/tree/NHibernate5 (version 5.0.0) (if running these, ensure you completely clean the solution when switching)

Issue #1216 is a duplicate. That was reported from version 2 judging by its age. Not sure if it was ever "accidentally" fixed in version 3 or 4 or if it has always been a rare problem. But in version 5 it is much easier to reproduce. (edit - just saw that 1216 was already mentioned by original poster).

DavidFidge avatar Jan 24 '24 22:01 DavidFidge

Your case seems to be another case than kaksmet's one. Your case seems to fail on a transaction commit. So, isn't your case actually distributed? Have you checked your application is not using one of the troublesome patterns explained here?

fredericDelaporte avatar Jan 31 '24 16:01 fredericDelaporte

We are also seeing this error intermittently and use the TransactionScope extensively. We just upgraded from 5.1.7 to 5.5.0 and had not observed this issue in 5.1.7. We're using it with Oracle.ManagedDataAccess 21.13 and the 12c dialect on .NET Framework 4.8.

System.InvalidOperationException: Enumerator was modified
   at NHibernate.Util.SequencedHashMap.OrderedEnumerator.MoveNext()
   at NHibernate.Util.IdentityMap.get_EntryList()
   at NHibernate.Engine.StatefulPersistenceContext.Clear()
   at NHibernate.Impl.SessionImpl.AfterTransactionCompletion(Boolean success, ITransaction tx)
   at NHibernate.Transaction.AdoNetWithSystemTransactionFactory.SystemTransactionContext.CompleteTransaction(Boolean isCommitted)
   at NHibernate.Transaction.AdoNetWithSystemTransactionFactory.SystemTransactionContext.ProcessSecondPhase(Enlistment enlistment, Nullable`1 success)
   at System.Transactions.VolatileEnlistmentCommitting.EnterState(InternalEnlistment enlistment)
   at System.Transactions.VolatileDemultiplexer.BroadcastCommitted(VolatileEnlistmentSet& volatiles)
   at System.Transactions.TransactionStatePromotedCommitted.EnterState(InternalTransaction tx)
   at System.Transactions.InternalTransaction.DistributedTransactionOutcome(InternalTransaction tx, TransactionStatus status)
   at System.Transactions.Oletx.RealOletxTransaction.FireOutcome(TransactionStatus statusArg)
   at System.Transactions.Oletx.OutcomeEnlistment.InvokeOutcomeFunction(TransactionStatus status)
   at System.Transactions.Oletx.OletxTransactionManager.ShimNotificationCallback(Object state, Boolean timeout)
   at System.Threading._ThreadPoolWaitOrTimerCallback.PerformWaitOrTimerCallback(Object state, Boolean timedOut)

gmf002 avatar Feb 06 '24 20:02 gmf002

Thanks for the fix - we took one of the recent nightly builds to test this fix and we're no longer seeing the exception happening. Looking forward to the 5.6 release!

DavidFidge avatar May 14 '24 07:05 DavidFidge