EFCore.BulkExtensions icon indicating copy to clipboard operation
EFCore.BulkExtensions copied to clipboard

BulkInsertIrUpdateAsync with SetOutputIdentity = true doesnt return Ids on saved entities

Open IliasP91 opened this issue 2 years ago • 15 comments

BulkInsertIrUpdateAsync with SetOutputIdentity = true doesnt return Ids on saved entities Tried on package versions 6.5.0-6.5.5

IliasP91 avatar Jul 29 '22 09:07 IliasP91

Can you write a test where the issue would be reproducible.

borisdj avatar Jul 29 '22 09:07 borisdj

also TrackingEntities on a BulkInsert doesn't seem to work for us, I'll try to create some tests during the weekend

IliasP91 avatar Jul 29 '22 10:07 IliasP91

It seems that I am not able to reproduce the issue where BulkInsertOrUpdateAsync was not populating the PK (Id) of the inserted entities. I am not constantly getting the following error when using some column names in the UpdateByProperties config. An item with the same key has already been added. Key: xxxx If I remove the UpdateByProperties entirely it adds the entities normally (going by the PK (Id) instead of a composite key. Any ideas why this is happening? I checked my list for duplicates on the composite key and they look fine.

However I was able to reproduce the issue with the TrackingEntities config in your tests.

Unit test showing the tracking entities issue: 2022-07-30_03-11-12

https://github.com/borisdj/EFCore.BulkExtensions/commit/da7c651a14854284519626e02f7524a8ee8a0e80?diff=split#diff-b5d7a6b82e61eec0de8f4c521d1b5213bd0daaaed3b3f5c35d29efd66fde82bb

IliasP91 avatar Jul 30 '22 02:07 IliasP91

For what its worth, I have been having similar issues for quite some time now. BulkInsertAsync no longer updates the ids even though my config sets SetOutputIdentity to true (var bulkConfig = new BulkConfig { SetOutputIdentity = true, BulkCopyTimeout = 0, UseTempDB = true }). It worked as expected when originally written in 2020 but broke along the way either due to an update in this library or .NET core (I noticed the problem for the first time in May of 2022 but its in very infrequently used code and may have actually been broken some months earlier) I believe that over time I have had many more issues when the destination db is an Azure db rather then a local one. Things (like this, and other errors I have been running into) will work locally but then fail when I switch destination to Azure

slubowsky avatar Aug 02 '22 15:08 slubowsky

any updates on this?

IliasP91 avatar Aug 14 '22 12:08 IliasP91

I believe I am having the same issue on BulkInsert where the ids are not set on the list of entities after the bulk insertion.

For example

context.BulkInsert(entityList, new BulkConfig { SetOutputIdentity = true });
context.BulkSaveChanges();

return Ok(entityList);

Where entityList is a populated List<Entity>. Inspecting the response from the API shows that none of the entities have ids set, as they are all null instead.

It was my assumption that the behavior should be to set those ids when the bulk insertion takes place but if I am mistaken please let me know what I can do to ensure the ids are set.

I am using:

EFCore.BulkExtensions (6.5.6)
Microsoft.EntityFrameworkCore.Tools (6.0.7)
Pomelo.EntityFrameworkCore.MySql (6.0.2)

stevewpatterson avatar Aug 15 '22 00:08 stevewpatterson

I figured out the issue but i'll leave my comment in case anybody else runs into this. I saw this line in the docs:

SetOutputIdentity have purpose only when PK has Identity (usually int type with AutoIncrement)

So I changed my Model to use public int Id instead of public long Id which I was using previously to map to my MySQL unsigned int type. Probably unnecessary anyways since I wont have any Ids greater than 2+ billion.

That fixed my issue, so it looks like in order for this to work your Model's PK type needs to be int.

stevewpatterson avatar Aug 15 '22 00:08 stevewpatterson

What Database are all using, for SqlServer seems to work fine.

borisdj avatar Aug 15 '22 08:08 borisdj

we have been using MSSQL

IliasP91 avatar Aug 15 '22 09:08 IliasP91

SQL server / Azure SQL server

slubowsky avatar Aug 15 '22 09:08 slubowsky

Here 2 issues are raised.

  1. SetOutputIdentity = true which I am not able o reproduce, and seems you either.

  2. TrackingEntities not saving afterwards changes from @IliasP91 example. This is somewhat specific case in your test. What happens is that TrackingEntities are only for Loading data from Db and they can be tracked but original ones are still not being. Only when list is reloaded they effectively become. There is note about this in the ReadMe:

SetOutputIdentity When used Id-s will be updated in entitiesList, and if PreserveInsertOrder is set to false then entitiesList will be cleared and reloaded.

For now you could try to set PreserveInsertOrder to false which will result in reloading the list that will then be tracked. I will see if this logic can be improved.

borisdj avatar Aug 15 '22 10:08 borisdj

Maybe in this case you should use BulkSaveChagnes instead of BulkInsert, since you want them tracked afterwards anyway.

borisdj avatar Aug 15 '22 10:08 borisdj

@borisdj thanks for taking a look at these as for the issues: 1 - was still happening on our solution, ill give it another go to try and reproduce it. 2 - so from what I understand PreserveInsertOrder is what also triggers it to track them after an insert right? I'll do some testing in my branch to see how it works exactly, at least from the docs it wasn't very clear so definitely a bit more clarity there on this use case would help imo

IliasP91 avatar Aug 15 '22 10:08 IliasP91

It's was more of a side effect, but can be used for some cases. Loaded ones are tracked when TrackingEntities set to True. And if PreserveInsertOrder is False then list have to be reloaded since match for update Id from DB can not be done, as the order might not be the same. That results in loaded and tracked entities to be in the list.

borisdj avatar Aug 15 '22 11:08 borisdj

Dropping in to share my findings...

I have been having this same exact issue, where any new records would not get a proper entity Id set. Instead it was always negative values, and found they correlated to the entity list size.

For example, 30,000 existing entities + 1 new => newId is -30,001

Snippet of bulk work

var entities = models.Select(x => ....).ToList();

await db.BulkInsertOrUpdateAsync(entities, new BulkConfig { SetOutputIdentity = true });

In testing, I noticed it was not an issue for

  1. Updates only (no zero Id's)
  2. Small batches ( < 10)

Adding PreserveInsertOrder = false, definitely solves the issue for new and updated records. However there are two issues that make it less than ideal. This is the first call of a complex graph where the entities have child collections and other ancillary work made simpler by retaining the existing entity list.

A scenario would be, the entity might not need updating, aka no values changed, but a child table has new records and/or changes. I noticed that any of these records were not coming back without preserving order.

After further digging into the generated SQL commands, noticed this was due to the condition

WHEN MATCHED
    AND EXISTS
    (
        SELECT...
        EXCEPT SELECT ...
    )
THEN
UPDATE SET ...

This particular method/routine was originally working like others mentioned, and never noticed this before, but after reviewing the config I found that OmitClauseExistsExcept was my silver bullet! I really hope this helps someone, and hopes diagnose where the problem might lay. I can confirm the SQL output includes the proper identity values.

For the TLDR;

I found the optimizer skipping unchanged entities via the flag OmitClauseExistsExcept solved all my issues

var entities = models.Select(x => ....).ToList();

await db.BulkInsertOrUpdateAsync(entities, new BulkConfig { SetOutputIdentity = true, OmitClauseExistsExcept = true, });

vspectre avatar Sep 12 '22 18:09 vspectre

It seems that I am not able to reproduce the issue where BulkInsertOrUpdateAsync was not populating the PK (Id) of the inserted entities. I am not constantly getting the following error when using some column names in the UpdateByProperties config. An item with the same key has already been added. Key: xxxx If I remove the UpdateByProperties entirely it adds the entities normally (going by the PK (Id) instead of a composite key. Any ideas why this is happening? I checked my list for duplicates on the composite key and they look fine.

However I was able to reproduce the issue with the TrackingEntities config in your tests.

Unit test showing the tracking entities issue: 2022-07-30_03-11-12

da7c651?diff=split#diff-b5d7a6b82e61eec0de8f4c521d1b5213bd0daaaed3b3f5c35d29efd66fde82bb

the "same key" error could be linked to this issue : #882 The corresponding pull request : #1062

TofAtWork avatar Jan 10 '23 08:01 TofAtWork

@vspectre

Dropping in to share my findings...

I have been having this same exact issue, where any new records would not get a proper entity Id set. Instead it was always negative values, and found they correlated to the entity list size.

Thank you for this, I had a similar issue and was able to resolve / understand a bit better with your explanations.

PhideasAmbrosianus avatar Jan 20 '23 03:01 PhideasAmbrosianus