MySqlConnector icon indicating copy to clipboard operation
MySqlConnector copied to clipboard

The given key was not present in the dictionary

Open carlos-ferreira opened this issue 6 years ago • 25 comments
trafficstars

We're having an issue on a Windows Service on a transaction with a loop inside.

On this loop we're:

  • Making an insert;
  • Sending an Email using AWS SES;
  • Updating the inserted record;

After the 18th or 19th insert (not consistent), we get the following exception:

System.Collections.Generic.KeyNotFoundException: The given key was not present in the dictionary.
   at System.Collections.Generic.Dictionary`2.get_Item(TKey key)
   at MySql.Data.MySqlClient.MySqlConnection.<DoCloseAsync>d__122.MoveNext() in C:\projects\mysqlconnector\src\MySqlConnector\MySql.Data.MySqlClient\MySqlConnection.cs:line 732
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at MySql.Data.MySqlClient.MySqlConnection.Close() in C:\projects\mysqlconnector\src\MySqlConnector\MySql.Data.MySqlClient\MySqlConnection.cs:line 227
   at Dapper.SqlMapper.ExecuteCommand(IDbConnection cnn, CommandDefinition& command, Action`2 paramReader) in C:\projects\dapper\Dapper\SqlMapper.cs:line 2803
   at Dapper.SqlMapper.ExecuteImpl(IDbConnection cnn, CommandDefinition& command) in C:\projects\dapper\Dapper\SqlMapper.cs:line 568
   at Dapper.SqlMapper.Execute(IDbConnection cnn, String sql, Object param, IDbTransaction transaction, Nullable`1 commandTimeout, Nullable`1 commandType) in C:\projects\dapper\Dapper\SqlMapper.cs:line 441

I ran a copy of the database on a local mysql instance on my machine. There are zero issues on MySQL logs, and on the MySQLConnector logs as well.

Currently we mitigated the issue by moving the transaction inside the loop and it's working fine now. Still, i though it would be a good idea to report the issue.

carlos-ferreira avatar Oct 21 '19 16:10 carlos-ferreira

It looks like this might be related to distributed transactions; are you using TransactionScope in your code?

bgrainger avatar Oct 21 '19 17:10 bgrainger

Yes, you are correct. But on our case we only have one server. We're using a transaction scope so we don't have to use the database specific transaction objects (ex: MySqlTransaction)

carlos-ferreira avatar Oct 21 '19 17:10 carlos-ferreira

It seems like it might be some kind of race condition, but I'm not yet sure how it happens. Is your code structured like this?

using (new TransactionScope())
{
	for (...)
	{
		using (var connection = new MySqlConnection())
		{
			connection.Execute("insert");
			// send email
			connection.Execute("update");
		}
	}
}

Do you have multiple threads involved, or use of async methods?

bgrainger avatar Oct 21 '19 17:10 bgrainger

It's something like this :

using (var transaction=new TransactionScope()) { for (...) {

connection.Execute("insert"); // AmazonSESClient.sendRawEmail connection.Execute("update);

}

transaction.Complete(); }

Zero async or manual threads.

carlos-ferreira avatar Oct 21 '19 19:10 carlos-ferreira

I ran a copy of the database on a local mysql instance on my machine.

Are you able to reproduce this problem locally? If so, does it depend on sending an email, or is there a self-contained repro that just uses MySQL Server?

bgrainger avatar Oct 23 '19 18:10 bgrainger

The transaction is added to s_transactionConnections in EnlistTransaction, which is called from Open. (I assume that Dapper is opening the connection because it appears to be closing it in the call stack you gave.)

The transaction is assumed to be present in s_transactionConnections in DoCloseAsync (which is called from Close). This is where the exception is occurring, because it's not present.

The transaction is only removed from s_transactionConnections in UnenlistTransaction, which is called when the transaction is committed or rolled back, yet this isn't supposed to be happening until after the end of the loop.

Can you think of anything that could be modifying the state of the transaction while the loop is running? Is it being shared with another thread or request?

bgrainger avatar Oct 23 '19 19:10 bgrainger

Something is rolling back the transaction

image

But i don't know what it is. I believe with a high level of confidence that is the email sending that is doing this, unfortunately i can't tell why as i don't have exceptions on any kind of erros when this happens.

carlos-ferreira avatar Oct 24 '19 13:10 carlos-ferreira

Full call stack

image

carlos-ferreira avatar Oct 24 '19 13:10 carlos-ferreira

From the callstack, I assume the Transaction has some kind of timeout associated with it, and automatically rolls back after that timeout. That would explain why it happens after 18 or 19 INSERTs, assuming email sending involves a fairly consistent delay (and why using shorter transactions works around the problem).

This also gives me a pointer for trying to reproduce the problem.

bgrainger avatar Oct 24 '19 14:10 bgrainger

Created a unit test to reproduce the issue, something like this:

using (var transaction = new TransactionScope())
{
	for (int i = 0; i < 40; i++)
	{
		var databaseRecord = new DatabaseRecord()
		{
			UUID = Guid.NewGuid()
		};

		System.Threading.Thread.Sleep(5000);

		repository.InsertDatabaseRecord(databaseRecord);
	}

	transaction.Complete();
}

When i reach the timeout i get:

image

This is really weird.

carlos-ferreira avatar Oct 24 '19 14:10 carlos-ferreira

I wonder if this could be related to race conditions noted in the link at https://github.com/mysql-net/MySqlConnector/issues/748#issuecomment-583758500.

A more robust suite of unit tests may be required.

bgrainger avatar Feb 08 '20 22:02 bgrainger

A CI build just failed with the same error:

[xUnit.net 00:01:19.51]     SideBySide.TransactionScopeTests.CommandBehaviorCloseConnection(connectionString: "UseXaTransactions=True") [FAIL]
  X SideBySide.TransactionScopeTests.CommandBehaviorCloseConnection(connectionString: "UseXaTransactions=True") [1m 11s]
  Error Message:
   System.Collections.Generic.KeyNotFoundException : The given key 'System.Transactions.Transaction' was not present in the dictionary.
  Stack Trace:
     at System.Collections.Generic.Dictionary`2.get_Item(TKey key)
   at MySqlConnector.MySqlConnection.DoCloseAsync(Boolean changeState, IOBehavior ioBehavior) in /_/src/MySqlConnector/MySqlConnection.cs:line 905
   at MySqlConnector.MySqlConnection.Dispose(Boolean disposing) in /_/src/MySqlConnector/MySqlConnection.cs:line 562
   at SideBySide.TransactionScopeTests.CommandBehaviorCloseConnection(String connectionString) in d:\a\1\s\tests\SideBySide\TransactionScopeTests.cs:line 656
--- End of stack trace from previous location where exception was thrown ---

bgrainger avatar Jun 17 '20 16:06 bgrainger

FYI, I can confirm that mine is unrelated to timeouts: it throws less than 40 ms after the preceding log entry (in the same serial flow).

Timovzl avatar Jun 01 '21 14:06 Timovzl

I have created some log output to help with the search.

I have replaced the nuget package reference by a direct project reference, and logged wherever either:

  • m_enlistedTransaction was being set, or
  • s_transactionConnections was being mutated.

Timestamps are from bottom to top. Note how the first run of my Quartz job has no problems, but the second run (after a few minutes of inactivity) seems to think that it can reuse some existing connection.

Unfortunately, I've logged the hash code of the enlisted transaction rather than the underlying System.Transactions.Transaction.

My comments should help provide some insight into what's going on. Note that the first run performs a couple of (unrelated) transactions back-to-back, successfully, but the second run throws on its very first attempt.

@bgrainger Any idea why FindExistingEnlistedSession() in OpenAsync() finds an existing connection on my subsequent job run, for a brand new transaction?

I could only find the .NET Framework source, but it looks like the hash code and even equality of Transaction is based entirely on its InternalTransaction.TransactionHash. Perhaps such hash codes end up getting reused, and the new transaction is seen as identical to an old one? Then again, the old one is being removed from the dictionary, so it still doesn't explain.

SUBSEQUENT JOB RUN:

	// Step 3: Throw when attempting to find the related Transaction in s_transactionConnections
	Jun 1, 2021 @ 17:10:22.550	System.Collections.Generic.KeyNotFoundException: The given key 'System.Transactions.Transaction' was not present in the dictionary.
	   at System.Collections.Generic.Dictionary`2.get_Item(TKey key)
	   at MySqlConnector.MySqlConnection.DoCloseAsync(Boolean changeState, IOBehavior ioBehavior) in /app/MySqlConnector/MySqlConnection.cs:line 1004
	   at MySqlConnector.MySqlConnection.CloseAsync(Boolean changeState, IOBehavior ioBehavior) in /app/MySqlConnector/MySqlConnection.cs:line 972
	   at MySqlConnector.MySqlConnection.Dispose(Boolean disposing) in /app/MySqlConnector/MySqlConnection.cs:line 653
	   at System.ComponentModel.Component.Dispose()
	// Step 2: Close after the first of our two queries within our transaction, passing the enlistment on to a new dummy 
	Jun 1, 2021 @ 17:10:22.545	message:DoCloseAsync: Setting (stealing) m_enlistedTransaction: transaction 41727345 for connection 17882254. 
	Jun 1, 2021 @ 17:10:22.543	message:DoCloseAsync: Unsetting (donating) m_enlistedTransaction: transaction 41727345 for connection 20809629. 
	// Step 1: Enlist, which does not add to s_transactionConnections, presumably because the "TryGetValue" condition concludes the dictionary already has that transaction
	Jun 1, 2021 @ 17:10:22.524	message:EnlistTransaction: Setting m_enlistedTransaction: transaction 1 for connection 20809629.

FIRST JOB RUN:

	// Second iteration of open transaction scope, perform query, commit:

	// Step 5: Unenlist at the end of the transaction, which also removes from s_transactionConnections
	Jun 1, 2021 @ 16:41:39.451	message:UnenlistTransaction: Removing from s_transactionConnections: transaction 11 for connection 33771145. 
	Jun 1, 2021 @ 16:41:39.449	message:UnenlistTransaction: Unsetting m_enlistedTransaction: transaction 11 for connection 33771145. 
	// Step 4: Close after the second of our two queries within our transaction, passing the enlistment on to a new dummy transaction (used to hold the session)
	Jun 1, 2021 @ 16:41:39.410	message:DoCloseAsync: Setting (stealing) m_enlistedTransaction: transaction 35287174 for connection 33771145. 
	Jun 1, 2021 @ 16:41:39.408	message:DoCloseAsync: Unsetting (donating) m_enlistedTransaction: transaction 35287174 for connection 52697953. 
	// Step 3: Open the new connection, but actually reuse the previous one within the transaction
	Jun 1, 2021 @ 16:41:39.258	message:OpenAsync: Setting (stealing) m_enlistedTransaction: transaction 35287174 for connection 52697953. 
	Jun 1, 2021 @ 16:41:39.257	message:OpenAsync: Unsetting (donating) m_enlistedTransaction: transaction 35287174 for connection 44419000. 
	// Step 2: Close after the first of our two queries within our transaction, passing the enlistment on to a new dummy transaction (used to hold the session)
	Jun 1, 2021 @ 16:41:39.254	message:DoCloseAsync: Setting (stealing) m_enlistedTransaction: transaction 35287174 for connection 44419000. 
	Jun 1, 2021 @ 16:41:39.253	message:DoCloseAsync: Unsetting (donating) m_enlistedTransaction: transaction 35287174 for connection 41300193. 
	// Step 1: Enlist (caused by OpenAsync), which also adds to s_transactionConnections
	Jun 1, 2021 @ 16:41:39.190	message:EnlistTransaction: Adding to s_transactionConnections: transaction 11 for connection 41300193. 
	Jun 1, 2021 @ 16:41:39.133	message:EnlistTransaction: Setting m_enlistedTransaction: transaction 11 for connection 41300193. 

	// First iteration of open transaction scope, perform query, commit:
	
	Jun 1, 2021 @ 16:41:39.108	message:UnenlistTransaction: Removing from s_transactionConnections: transaction 10 for connection 9779853. 
	Jun 1, 2021 @ 16:41:39.106	message:UnenlistTransaction: Unsetting m_enlistedTransaction: transaction 10 for connection 9779853. 
	Jun 1, 2021 @ 16:41:39.018	message:DoCloseAsync: Setting (stealing) m_enlistedTransaction: transaction 17561922 for connection 9779853. 
	Jun 1, 2021 @ 16:41:39.017	message:DoCloseAsync: Unsetting (donating) m_enlistedTransaction: transaction 17561922 for connection 60800190. 
	Jun 1, 2021 @ 16:41:37.623	message:OpenAsync: Setting (stealing) m_enlistedTransaction: transaction 17561922 for connection 60800190. 
	Jun 1, 2021 @ 16:41:37.621	message:OpenAsync: Unsetting (donating) m_enlistedTransaction: transaction 17561922 for connection 48950176. 
	Jun 1, 2021 @ 16:41:37.593	message:DoCloseAsync: Setting (stealing) m_enlistedTransaction: transaction 17561922 for connection 48950176. 
	Jun 1, 2021 @ 16:41:37.592	message:DoCloseAsync: Unsetting (donating) m_enlistedTransaction: transaction 17561922 for connection 57171047. 
	Jun 1, 2021 @ 16:41:37.305	message:EnlistTransaction: Adding to s_transactionConnections: transaction 10 for connection 57171047. 
	Jun 1, 2021 @ 16:41:37.262	message:EnlistTransaction: Setting m_enlistedTransaction: transaction 10 for connection 57171047. 

Timovzl avatar Jun 01 '21 15:06 Timovzl

Meanwhile, I have performed the logging once more, but this time with the actual Transaction.GetHashCode(). I can confirm that DoCloseAsync() looks up the expected Transaction. I'm still unclear as to why it's not in the dictionary. I would guess that EnlistTransaction() has to be the cause of that somehow.

Timovzl avatar Jun 02 '21 13:06 Timovzl

@bgrainger I seem to have figured out where the problem originates:

In MySqlConnection.EnlistTransaction(), m_enlistedTransaction.Start() is executed. However, on the problematic environment, the next statement, lock (s_lock), is never reached.

I have determined this by logging before each of those statements

I'm using a StandardEnlistedTransaction (UseXaTransactions=False).

Edit: Please disregard what follows here. See my next comment.

The most likely perpetrators are:

  • Transaction!.EnlistVolatile(this, EnlistmentOptions.None)
  • cmd.ExecuteNonQuery() // set session transaction isolation level {isolationLevel};
  • cmd.ExecuteNonQuery() // start transaction{consistentSnapshotText};

Is it acceptable that a MySqlCommand is reused in StandardEnlistedTransaction.OnStart()? I seem to recall having issues with command reuse in the past. (Note that this executes two non-async queries. It might be a nice improvement to make this a single roundtrip, especially since it is synchronous.)

Timovzl avatar Jun 02 '21 13:06 Timovzl

I think the search should continue in the direction of XaEnlistedTransaction.Start().

As it turns out, a devops member had copy/pasted the connection string options for the alpha environment from who-knows-where. Among other issues, UseXaTransactions=False was missing, causing XA transactions to be unintentionally used.

With UseXaTransactions=False, the issue disappears for me. That explains the impossibility of reproducing it locally.

(I was used to getting an exception if this property was missing, because Percona XtraDb does not support XA transactions. Maybe support for it was added in version 8 and that's why I never noticed.)

Timovzl avatar Jun 02 '21 14:06 Timovzl

FWIW, the TransactionScope code is more complex, due to the support for distributed transactions. If you're disabling that anyway, and just using regular transactions, it's probably a lot simpler and more reliable to just use the ADO.NET DbTransaction API (i.e., using var transaction = connection.BeginTransaction();).

even equality of Transaction is based entirely on its InternalTransaction.TransactionHash

That does seem less robust than might be expected. Probably also worth confirming that it's immutable.

bgrainger avatar Jun 03 '21 13:06 bgrainger

It doesn't look like there's a strong uniqueness guarantee? https://referencesource.microsoft.com/#system.transactions/System/Transactions/TransactionTable.cs,237 ~~That could be a problem?~~ In practice, the hash code of each transaction appears to be set to a monotonically-increasing integer.

bgrainger avatar Jun 03 '21 13:06 bgrainger

it's probably a lot simpler and more reliable to just use the ADO.NET DbTransaction API

You're right, of course. In my case, the connections (and commands and readers) are operated behind-the-scene. The repository itself doesn't concern itself with such low-level types, and thus has no access to the connection object used to perform the query. (At best, it could get a fresh one.)

In practice, the hash code of each transaction appears to be set to a monotonically-increasing integer.

Hm, if it doesn't reset, then it seems there's no issue there.

Timovzl avatar Jun 14 '21 13:06 Timovzl

@bgrainger, @carlos-ferreira Has either of you made any further discoveries?

The issue seemed to have disappeared when we added the missing UseXaTransactions=False, but it has actually occurred again now, even with that option.

Timovzl avatar Nov 19 '21 12:11 Timovzl

No, I haven't made any progress on investigating this.

bgrainger avatar Nov 19 '21 14:11 bgrainger

@bgrainger I have some good news. My colleagues have figured out what caused the KeyNotFoundException!

In this case, we were using a TransactionScope for the sole purpose of lowering the isolation level to READ COMMITTED, for a read query over a potentially large data set. (Side question: Is it safe to set the session isolation level using a custom query? Will that not affect reuse when the connection is returned to the pool? I was unsure.)

The issue occurred when the transaction took more than 10 minutes, which happens to be the limit we have configured on the transaction scope. (I believe the default is 1 minute and the maximum default you are allowed to set is 10 minutes.)

Hopefully this helps in finding the cause. The good news is that the exception might merely be obscuring another (more accurate) exception.

(This current occurrence of the issue is about to be resolved for us, since we are adjusting the default isolation level for our databases to READ COMMITTED and can thus remove the transaction scope.)

Timovzl avatar Nov 22 '21 17:11 Timovzl

Side question: Is it safe to set the session isolation level using a custom query? Will that not affect reuse when the connection is returned to the pool?

By default, Connection Reset = True, so all connections will be reset when they're retrieved from the pool, which will reset session state to the server default. (However, if you have explicitly set Connection Reset = False; you will need to adjust your application code to account for the fact that session state may "leak" between separate MySqlConnection objects.)

bgrainger avatar Nov 27 '21 16:11 bgrainger

Hi, I got the same error regardless of the release or debug stage of the Asp .net core 5.0 server backend. after a ton of debugging and testing all the solutions, I figured out this problem is somehow related to MySQL DBMS version because in my local machine I got MySQL 5.7.38 and in my Microsoft windows server 2019 I had MySQL v8.x and the problem keep happening on my server. so I changed the MySQL image (docker image) version to 5.7.38 and without any configuration on CharSet worked perfectly fine for me. test it out by changing the MySQL DBMS version, then give feedback because this is an awesome library and if it gets fixed we can work with the newest MySQL version.

The connection string:

"server=<DBMS-IP>; port=<DBMS-PORT>; database=<DB-NAME>; user=<DB-USER>; password=<DB-PASSWORD>; Persist Security Info=False; Allow User Variables=True; Connect Timeout=300"

Hangefire configurations:

....
new MySqlStorage(
                _MysqlConnectionString, //mentioned before
                new MySqlStorageOptions
                {
                    QueuePollInterval = TimeSpan.FromSeconds(15),
                    JobExpirationCheckInterval = TimeSpan.FromHours(1),
                    CountersAggregateInterval = TimeSpan.FromMinutes(5),
                    PrepareSchemaIfNecessary = true,
                    DashboardJobListLimit = 50000,
                    TransactionTimeout = TimeSpan.FromMinutes(1),
                }
                )
            )
....

thanks for your consideration.

mmRoshani avatar Jun 27 '22 11:06 mmRoshani