FlexLabs.Upsert
MSSQL, UPDLOCK hint
Hi,
Could you please add an UPDLOCK hint to the generated SQL MERGE query?
Currently an upsert operation is translated into the following SQL for MSSQL Server:
MERGE INTO [StateChangeSnapshots] WITH (HOLDLOCK) AS [T] USING...
IMHO the preferred query would be this:
MERGE INTO [StateChangeSnapshots] WITH (UPDLOCK, HOLDLOCK) AS [T] USING...
HOLDLOCK by itself is not good enough to avoid deadlocks; it only implies that the serializable isolation level is used for that table, and the serializable isolation level on its own does not prevent deadlocks.
As you can see here, I have plenty of such errors in my scenario.
You can simulate the error yourself in SSMS:
--first tab (connection)
SET TRANSACTION ISOLATION LEVEL READ COMMITTED; --SQL Server default
DROP TABLE IF EXISTS ##myTable;
CREATE TABLE ##myTable(Id INT)
INSERT INTO ##myTable(Id) VALUES (1);
BEGIN TRAN t1
SELECT * FROM ##myTable WITH (HOLDLOCK) WHERE Id = 1; --same hint the generated MERGE currently uses
--1) run to up here only, then switch to second connection (second tab in SSMS)
UPDATE MT SET Id = 2 FROM ##myTable MT WHERE Id = 1
--3) run UPDATE statement only (transaction for it should still be running)
--AS A RESULT one CONNECTION finishes and the second one ends up with a deadlock
COMMIT TRAN t1
--second tab (connection)
SET TRANSACTION ISOLATION LEVEL READ COMMITTED; --SQL Server default
BEGIN TRAN t1
SELECT * FROM ##myTable WITH (HOLDLOCK) WHERE Id = 1;
UPDATE MT SET Id = 2 FROM ##myTable MT WHERE Id = 1
--2) run to here only
COMMIT TRAN t1
UPDLOCK does the trick. If you add it to both SELECTs above (WITH (UPDLOCK, HOLDLOCK)), the query reserves those rows for its later update.
See
In that case the second query stops (in the "select" phase) and waits for the first query to finish its update and commit (or roll back) its transaction; only then does processing of the second query continue.
I found that if I have multiple upserts in flight at the same time I get this:
Microsoft.Data.SqlClient.SqlException (0x80131904): Transaction (Process ID 114) was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
   at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteNonQueryAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteNonQueryAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteNonQueryAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.RelationalDatabaseFacadeExtensions.ExecuteSqlRawAsync(DatabaseFacade databaseFacade, String sql, IEnumerable`1 parameters, CancellationToken cancellationToken)
   at FlexLabs.EntityFrameworkCore.Upsert.Runners.RelationalUpsertCommandRunner.RunAsync[TEntity](DbContext dbContext, IEntityType entityType, ICollection`1 entities, Expression`1 matchExpression, Expression`1 updateExpression, Expression`1 updateCondition, RunnerQueryOptions queryOptions, CancellationToken cancellationToken)
I assume the above change will fix it? I'm having to find an alternative solution though.
@IanNorris yes, I believe so.
Maybe I should have filed this as a bug rather than a feature request.
In my case this behavior is also a showstopper; I can't use the library in my scenario because multiple merge requests are invoked at the same time.
So I've dug into this, and I found that to eliminate the deadlock I also had to add OPTION (MAXDOP 1) to the end of the query. I tested this by executing the query directly - I wasn't able to repro the deadlock in my dev environment, so I had to push the query to the test environment.
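To illustrate where that hint goes (a rough sketch based on the generated query quoted earlier, with the body elided):
MERGE INTO [StateChangeSnapshots] WITH (HOLDLOCK) AS [T] USING ...
/* ...rest of the generated MERGE statement... */
OPTION (MAXDOP 1); --query hint appended at the very end of the statement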
Unfortunately, using MAXDOP (max degree of parallelism) increased the total runtime of the query to the point where it's causing timeouts in other systems (I estimate it makes the query 4x slower). As a result, my current solution is to go back to the naive approach of doing it manually and catching the index collision exception, because that case is a lot rarer, and the complex and slower MERGE INTO query was actually increasing the total collision rate.
Hi @IanNorris, I did my own investigation too. I got a recommendation for the same MAXDOP hack, but as you said, it is not a good solution.
IMHO the problem is that you need an index the operation can use to acquire its locks in a consistent order, so that the range protection required by the HOLDLOCK hint doesn't end up locking the pages between the rows.
Which index is appropriate depends on your scenario, so you need to check your execution plan.
The problem with the index is somewhat explained here: https://dba.stackexchange.com/a/23471
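As a rough sketch (using the hypothetical Table1/Column1/Column2 names from the example below), the kind of index I mean would look something like this; whether it should be unique, and which columns it covers, depends on your match columns and your execution plan:
CREATE UNIQUE NONCLUSTERED INDEX UX_Table1_Column1_Column2
ON dbo.Table1 (Column1, Column2);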
Eventually I realized that I just need to avoid duplicates, so the better option for me was the solution briefly described below.
In the database I created:
CREATE TYPE [dbo].[udtt_Table1] AS TABLE(
[Column1] [INT] NOT NULL,
[Column2] [NVARCHAR](50) NULL,
/*...*/
)
--you can create this by modifying the CREATE script of the T (target) table in SSMS
CREATE PROCEDURE [dbo].[sp_Table1_UPSERT]
@source AS dbo.udtt_Table1 READONLY
AS
BEGIN
INSERT INTO dbo.Table1
(
Column1,
Column2, --nullable
/*...*/
)
SELECT
S.Column1,
S.Column2, --nullable
/*...*/
FROM
@source AS S
LEFT JOIN dbo.Table1 AS T WITH(XLOCK)
ON
T.Column1 = S.Column1
AND ISNULL(T.Column2, -1) = ISNULL(S.Column2, -1)
/*...*/
WHERE
T.Id IS NULL
RETURN @@rowcount
END
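For completeness, the procedure can be smoke-tested directly in SSMS like this (a minimal sketch with made-up values, assuming only the two columns shown in the type above):
DECLARE @rows dbo.udtt_Table1;
INSERT INTO @rows (Column1, Column2) VALUES (1, N'A'), (2, NULL);
DECLARE @inserted INT;
EXEC @inserted = dbo.sp_Table1_UPSERT @source = @rows;
SELECT @inserted AS InsertedRows; --rows actually inserted (existing duplicates are skipped)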
From C# I call the SP like this:
public async Task<int> UpsertTable1Async(List<Table1> items)
{
var source = new DataTable();
//!!! The order of the columns has to match the order defined in the underlying udtt_Table1; column names are not used for matching in the case of a UDTT !!!
source.Columns.Add(new DataColumn(nameof(Table1.Column1), typeof(int)));
source.Columns.Add(new DataColumn(nameof(Table1.Column2), typeof(string)));
//...
foreach (var s in items)
{
var row = source.NewRow();
row[nameof(Table1.Column1)] = s.Column1;
row[nameof(Table1.Column2)] = s.Column2 == null ? (object)DBNull.Value : s.Column2;
//...
source.Rows.Add(row);
}
var sourceTableParam = new SqlParameter("@source", SqlDbType.Structured);
sourceTableParam.Value = source;
sourceTableParam.TypeName = "[dbo].[udtt_Table1]";
var returnValueParam = new SqlParameter("@inserted", SqlDbType.Int);
returnValueParam.Direction = ParameterDirection.Output;
var affectedRowsByStoredProcedureCall = await Database.ExecuteSqlRawAsync("EXEC @inserted = dbo.sp_Table1_UPSERT @source", sourceTableParam, returnValueParam);
var insertedCount = (int)returnValueParam.Value;
return insertedCount;
}
Wow, that's a complex solution. There are a few nice tricks in there that I wasn't aware of, so I'm going to dig into those.
Here's the query I ended up with:
BEGIN TRY
    INSERT INTO [Table] (...) VALUES (...);
END TRY
BEGIN CATCH
    UPDATE [Table] SET ... WHERE [Index1] = @index1 AND [Index2] = @index2;
END CATCH
It definitely resolved the perf issues I was seeing as well, and I haven't seen a hint of the index deadlock or index collision issues the upsert was meant to avoid. The upsert with MAXDOP was taking 13h of compute time; the new version takes 13s for the same data set. This may be a viable solution for integration into the library, or for those trying to solve this problem for themselves.
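One refinement worth considering for that pattern (a sketch with hypothetical names; [Value] stands in for the elided SET list): only treat duplicate-key errors as "row already exists" and re-throw anything else, so unrelated failures aren't silently turned into updates.
BEGIN TRY
    INSERT INTO [Table] ([Index1], [Index2], [Value]) VALUES (@index1, @index2, @value);
END TRY
BEGIN CATCH
    IF ERROR_NUMBER() IN (2601, 2627) --unique index / unique constraint violation
        UPDATE [Table] SET [Value] = @value WHERE [Index1] = @index1 AND [Index2] = @index2;
    ELSE
        THROW;
END CATCH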
Nice and thanks, I am glad I could provide some hints :)
MERGE is a very complex operation, as you can see if you check its execution plan.
You could probably improve your first (MERGE) solution significantly by adding an appropriate index, which could also avoid the deadlocks.
BR
Hey @martinrulf, @IanNorris
That's an interesting discussion you had there.. But I noticed that you mentioned that indexes could solve this too.
Did you not create a unique index on your selected match columns in SQL Server? I'd assume that would pretty much be the first thing you'd do there. After all, if you don't have a unique index, someone could insert multiple rows with the same key, at which point the next MERGE call trying to update those rows would blow up spectacularly (from memory.. either that, or it would update the wrong data).
So yeah.. shouldn't HOLDLOCK combined with a unique index be enough?
I don't know enough about the details of UPDLOCK vs HOLDLOCK to answer your question, unfortunately.
I felt this was a good time to give a further update though. Since the last update I've changed tactics again. I'm still using the try/catch upsert solution above, but I found that once the dataset got large enough (a few million records) the insert got increasingly expensive, and with Azure SQL serverless we were seeing huge spikes in duration. The upserts were made in batches from multiple incoming web connections, which eventually started timing out because of those spikes in duration.
To resolve this I ended up creating a queue of incoming data, which is then upserted into the actual table (the one with the unique index) from a single thread and connection per server instance. The inserts into the queue are basically free, and the upserts on the actual table are significantly cheaper due to the reduced concurrency. We also no longer have timeout concerns, since the upserts are handled by a worker thread. This did add some latency to the updates (the highest I've logged was 6 minutes), but that's acceptable in our use case, and the queue only gets briefly backlogged at busy times - it's empty almost 100% of the time.
The performance savings from this approach (specifically IOPS) were so large we were able to scale down the SQL core count by an order of magnitude with no impact on performance elsewhere.
Hi @artiomchi ,
It has been a while since I dug into this.
Anyway, if I remember the issue correctly, the only way to avoid a deadlock when more than one process may be calling the upsert operation on the same table at the same time (i.e. in parallel), and your setup requires the serializable isolation level, is a combination of UPDLOCK and HOLDLOCK (or perhaps XLOCK) together with a correctly chosen index, which prevents the two (or more) processes from acquiring locks on rows (or ranges of rows, pages, etc.) of the upserted table in different orders.
For more info, try googling for "sql merge deadlocking itself";
for example, this thread can give you some answers: https://dba.stackexchange.com/questions/23467/merge-statement-deadlocking-itself/23598
A good explanation of HOLDLOCK vs UPDLOCK is in this thread: https://stackoverflow.com/questions/7843733/confused-about-updlock-holdlock
"Updlock, Holdlock is not the same as holdlock. Updlock, holdlock locks the rows for update and serializes your transaction. Holdlock by itself just serializes your transaction. It does not lock the selected rows for further access."
In my investigation I called the same query with different hints specified, and then queried for the locks acquired by that query (that transaction); the result you can find below.
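For reference, the lock inspection can be done with a query along these lines (a sketch, not the exact query I used): run the hinted SELECT inside an open transaction in one session, then inspect that session's locks.
SELECT
    resource_type,
    resource_description,
    request_mode,    --e.g. S, U, X, RangeS-S, RangeS-U
    request_status
FROM sys.dm_tran_locks
WHERE request_session_id = @@SPID; --or the SPID of the session under test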