
Do not track after SaveChanges()

smitpatel opened this issue Jul 08 '17 • 30 comments

Based on https://stackoverflow.com/questions/44923147/how-to-avoid-inserted-entity-to-be-selected-after-insert-in-ef-core

Whenever we have generated columns (on insert or update), we do a select after the insert/update to retrieve the generated values and populate the entity in memory. In some cases, the user may want to just save the data and not care about updating the in-memory entities any further (because they are not used later).

At present there is no mechanism for this functionality.

smitpatel avatar Jul 08 '17 02:07 smitpatel

Try AsNoTracking:

https://msdn.microsoft.com/en-us/library/gg679352(v=vs.103).aspx

Giancarlos avatar Jul 17 '17 16:07 Giancarlos

@Giancarlos This issue is not related to AsNoTracking; that tells EF not to track entities returned from a query. This issue is about SaveChanges ignoring store-generated values for already tracked entities.

ajcvickers avatar Jul 17 '17 16:07 ajcvickers

To the implementer:

  • Disable fixup when detaching to avoid messing up the FKs and navigations
  • Once implemented this should be used for data migrations.
  • Consider allowing the discriminator to be changed in this way

AndriySvyryd avatar Dec 06 '17 01:12 AndriySvyryd

It might be nice to be able to pass an optional "get me the entity" flag on a SaveChanges call. That way you add it when you need it and omit it when you don't.

stap123 avatar Mar 15 '18 13:03 stap123

Are there plans for this to be implemented? I'm using Autofac + EF Core in a console application and this is really biting me.

leus avatar Jul 04 '19 19:07 leus

@leus this is in our backlog milestone, meaning that we don't have specific plans to implement it. Could you elaborate on how this is biting you?

divega avatar Jul 04 '19 19:07 divega

Sure - I have a .NET Core console application that uses Autofac to create a big list of Tasks that receive a DbContext to do stuff. Everything works beautifully except that I take a big hit on memory, most of it used by inserted entities. In the few places where we do queries we make sure to use AsNoTracking() (but we also set QueryTrackingBehavior.NoTracking when creating the context, so this is probably overkill).

[image: memory profiler snapshot]

Here the SqlChecklist.Processor.Writers.Sql.Model.TextData object is just an entity that contains a big string (ntext).

My contexts are created as follows:

var dbOptions = new DbContextOptionsBuilder<ProcessorContext>();
dbOptions.UseSqlServer(connectionString);
// Stops query results from being tracked; SaveChanges still tracks inserted entities.
dbOptions.UseQueryTrackingBehavior(QueryTrackingBehavior.NoTracking);

leus avatar Jul 05 '19 14:07 leus

Regarding this issue and also this one: https://github.com/aspnet/EntityFrameworkCore/issues/11852

As noted in previous comments and as I have already observed, using AsNoTracking queries or turning tracking off for the whole context only prevents entities being added to the change tracker as a result of query operations; SaveChanges still keeps inserted/updated entities in the change tracker.

@divega @ajcvickers Suggestion for a new configuration option:

I think that having a SaveChanges option (at context level and method level) which removes newly inserted entities from the change tracker after saving to the database would be useful. As with the original opening comment on this issue, when there are database-generated columns such as auto-incrementing primary keys, enabling such an option would remove the need for EF to perform a select to retrieve the newly generated key value, because the entity would effectively be discarded by EF after the save.

I use the description "option which removes newly inserted entities from the change tracker after saving to the database" rather than "option which prevents newly inserted entities from being added to the change tracker after saving to the database" because, as far as I know, the only way to insert an entity using SaveChanges is to first attach it to the change tracker using Context.Add(entity), so the entity will always be in an attached state before SaveChanges is called.

I think an option such as this would be less relevant when database-generated columns are not used: there is no select query performed behind the scenes by EF, so the option would provide no performance benefit. Instead the entity can just be detached from the change tracker using Entry(entity).State = EntityState.Detached; after calling SaveChanges, achieving the same result (as sketched below). Of course doing this does not prevent the behind-the-scenes select query from taking place right after SaveChanges to retrieve database-generated key values that are then never used.
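
For illustration, that manual detach looks like this (context and entity are placeholders):

context.Add(entity);
context.SaveChanges();                              // still selects generated values behind the scenes
context.Entry(entity).State = EntityState.Detached; // then discard the tracked entry by hand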

Maybe the option should instead be targeted at the SaveChanges behaviour when database-generated columns are used, i.e. don't go and get the updated values, just fire and forget, but this will cause issues with the change tracker unless the saved entity is detached. Having a side effect of detaching the entity only when database-generated columns are in use would be really bad. It sounds much better as an option to detach entities after saving, which has the performance-improving side effect of preventing an unnecessary select query when database-generated columns are used.

So rather than just detaching newly inserted entities, it would be better if the new option made SaveChanges detach any entity that it has just persisted to the database, both inserts and updates, similar to what @roji suggested here: https://github.com/aspnet/EntityFrameworkCore/issues/11852#issuecomment-385953097

So what I am suggesting is an option which changes the behaviour of save operations so that they become fire-and-forget database writes that then detach the affected entities from the change tracker.

Use Cases:

I think an option such as this would be useful in scenarios where it is not possible or desirable to use BulkCopy tools, for example when bulk importing data into a database where, depending on each record being imported, there sometimes need to be query-then-update or lookup-and-populate-related-entities operations rather than just plain inserts.

In the above scenario it can be desirable to use EF for importing data, especially if it is already being used in the project and we already have the model to work with, rather than bringing in an additional framework or tools. The issue with using EF for inserting/updating large numbers of records is that the change tracker causes serious performance problems due to the build-up of entities, unless the context is regularly disposed and re-created during the import cycle. This is rarely an issue for normal use cases such as API and MVC controllers, where request scopes and DI make sure the context is short-lived.

To get the most performance benefit, in most cases the context will need to be recreated after each record is imported. Although the overhead of doing this is tiny compared to the benefit of starting with a fresh context and an empty change tracker, it still seems wasteful. Another issue is that it does not always make sense, or look nice, to structure the import code so that the context is recreated after each record import.

To solve this kind of issue in the past I have used a mixture of context recreation and looping through the change tracker to detach all entities after each save. This is fine when using Guids as keys which are generated by EF, but if using database-generated keys there will still be wasteful select queries happening.

chrisckc avatar Oct 08 '19 10:10 chrisckc

Here is an extension method on the context that I have been using to clear out the change tracker:

public static void DetachAllEntities(this DbContext context) {
    // Materialize the list first: changing State mutates the tracker while iterating.
    var entries = context.ChangeTracker.Entries()
        .Where(e => e.State != EntityState.Detached)
        .ToList();

    foreach (var entry in entries) {
        if (entry.Entity != null) {
            entry.State = EntityState.Detached;
        }
    }
}

But recently I have tried using the suggestion here: https://github.com/aspnet/EntityFrameworkCore/issues/11852#issuecomment-386045894

public void ResetState() {
    ((IDbContextPoolable)this).ResetState();
    // https://stackoverflow.com/questions/46685823/how-to-use-the-executionstrategy-properly
    ((IDbContextPoolable)this).Resurrect(((IDbContextPoolable)this).SnapshotConfiguration());
}

There appears to be no measurable difference in performance; both seem to have the same effect.

As an example, clearing out the change tracker after each record import during an import of 220 records from a CSV file resulted in a 25x performance increase. Without clearing the change tracker after each record, the import ends with 220 entities in the change tracker and takes 25 times longer to complete; not a huge amount of entity build-up, but enough to cripple performance.

In that example, when importing multiple files, I am recreating the context after each file import, but just clearing the change tracker after each record import, roughly as sketched below.
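
As a sketch, that per-record pattern looks like this (MapToEntity is a placeholder for whatever mapping the importer does):

foreach (var record in records) {
    context.Add(MapToEntity(record)); // map the CSV record to an entity (placeholder)
    context.SaveChanges();            // one insert per record
    context.DetachAllEntities();      // the extension method above keeps the tracker empty
}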

I am sure this is one of the reasons why EF gets such a bad rep in terms of performance: you have to be very careful about how it is used, and if you are not fully aware of things like this it is easy to dismiss it as slow and inefficient.

chrisckc avatar Oct 08 '19 10:10 chrisckc

A good way of getting a fresh context is to take advantage of the ServiceScopeFactory:

foreach (FileInfo file in files) {
    string filePath = file.FullName;
    using (var scope = ServiceScopeFactory.CreateScope()) {
        var myCsvImporter = scope.ServiceProvider.GetRequiredService<MyCsvImporter>();
        var importResult = await myCsvImporter.ImportFromFileAsync(filePath);
        ........

    }
}

Of course MyCsvImporter could also use the same technique to create a fresh context for each record. I am not sure whether this would be faster than clearing out the change tracker, but in this case I suspect the difference would not be measurable because the database writes take up the vast majority of the time. The main reason not to create a fresh context for each record is code structure and readability, and the desire to make at least some use of the framework-provided DI convenience rather than always grabbing the required services from a ServiceScope.

I also agree with @roji here: https://github.com/aspnet/EntityFrameworkCore/issues/11852#issuecomment-386031370

chrisckc avatar Oct 08 '19 10:10 chrisckc

It looks like the performance benefit of this option will depend not just on whether the primary key is generated by EF or by the database, but also on the EF Core provider used.

For example, if using Npgsql and a Guid primary key with database-generated values via the Postgres uuid_generate_v4() function, it looks like there will still be no select query performed after the save, due to the Postgres feature described in this comment: https://github.com/npgsql/Npgsql.EntityFrameworkCore.PostgreSQL/issues/81#issuecomment-236642569 Npgsql uses the Postgres RETURNING clause to avoid a second query to fetch the newly generated Guid.

@roji I am not sure how this is handled when there are other, or multiple, database-generated values in a table?

chrisckc avatar Oct 08 '19 13:10 chrisckc

@chrisckc you're right that Npgsql has a somewhat lighter mechanism for retrieving database-generated values after insert. However, even for Npgsql, if we somehow know that we don't need to bring back database-generated values, we can omit the RETURNING clause and save some network bandwidth (for the values coming back), so this issue is still relevant.

Having multiple database-generated values doesn't change anything here - they should simply all appear in the RETURNING clause.
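
For illustration, with a table shaped like the one later in this thread (all names assumed), a single round trip can return every generated column at once:

INSERT INTO people (tenant_id)
VALUES (3)
RETURNING id, created_date; -- all generated values come back with the insert itself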

roji avatar Oct 09 '19 09:10 roji

My use case is only having INSERT permission, not SELECT (audit tables). In this case I just want to fire and forget.

timker avatar Oct 13 '20 05:10 timker

@timker why not use ExecuteSqlRaw then?

ErikEJ avatar Oct 13 '20 05:10 ErikEJ

That would work (or I could create a stored proc), and I'm happy to do either, but I would prefer to use EF types.

timker avatar Oct 13 '20 05:10 timker

This forces you to write SQL when writing backward-compatible code. For example, if I add a column and the host has been updated but the db hasn't, then if this select were optional the code below would work:

DbContext.Entry(entity).Property(x => x.Firstname).IsModified = false;
DbContext.SaveChanges();

IMO this select should be optional.

asoursos avatar Oct 14 '20 12:10 asoursos

If anyone has the same backward-compatibility issue, there is a better solution than raw SQL. If you have the entity model classes isolated, you could copy them into a new namespace (**.Backward) and have a factory that decides which DbContext to choose.

asoursos avatar Oct 15 '20 17:10 asoursos

Note: benchmarking in https://github.com/dotnet/efcore/issues/27372#issuecomment-1033601750 shows that fetching back generated values when inserting is mostly negligible perf-wise, at least on SQL Server. As such, the advantage of doing this would mainly be to unlock high-perf techniques which don't support fetching back values, e.g. SqlBulkCopy.

roji avatar Feb 20 '22 11:02 roji

Design discussion:

  • We'll introduce a flag on ChangeTracker, which stops us from fetching back generated values, and which causes the change tracker to automatically be cleared after SaveChanges completes (a rough approximation is sketched after this list). There's no need to add a DbContextOption to change the default, since the ChangeTracker flag can be set in the context constructor (like other flags we already have).
  • Note that this isn't only about clearing the change tracker; it's also about not propagating generated values to entity instances which are already in memory.
  • When the flag is enabled, calling SaveChanges(true) should probably throw.
  • We considered disposing the context instead of clearing its change tracker, or otherwise setting some state which causes any operation to fail on the context after SaveChanges. The advantage would be to force users to instantiate a new context after SaveChanges, with fail-fast throwing in case the user gets it wrong. However, this is incompatible with context pooling, and some users consider creation of a new context too verbose (that's why we introduced ChangeTracker.Clear).
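
While no such flag exists yet, the auto-clear half of the proposal can be approximated today by overriding SaveChanges. This is only a sketch, assuming EF Core 5.0+ (where ChangeTracker.Clear is available); it does not avoid fetching generated values:

using Microsoft.EntityFrameworkCore;

public class FireAndForgetContext : DbContext // hypothetical name, for illustration only
{
    public override int SaveChanges()
    {
        var result = base.SaveChanges();
        // Approximates only the auto-clear part of the proposed flag:
        // generated values are still fetched and propagated before this runs.
        ChangeTracker.Clear();
        return result;
    }
}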

roji avatar Feb 20 '22 11:02 roji

Notes from triage:

  • Perf benefit seems minimal based on recent testing; if you are seeing otherwise, please post details.
  • Clearing the change tracker after saving changes can be achieved by calling ChangeTracker.Clear after calling SaveChanges (see the sketch below).
  • There is overlap here with bulk insert (#27333), which would likely be a better solution in many cases since it would avoid all tracking.

For these reasons we are punting this for EF7, but still intend to implement it in the future.
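
A minimal example of that workaround, with context and entity names assumed from earlier in the thread:

using var context = new MyContext();
context.Add(new Person { TenantId = 3 });
context.SaveChanges();         // generated values are still fetched and propagated
context.ChangeTracker.Clear(); // then drop every tracked entity in one call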

ajcvickers avatar Apr 23 '22 09:04 ajcvickers

Re: perf benefit, it's hard to benchmark it when it hasn't been implemented, but I took the INSERT SQL generated in 6.0, then modified it to what SQL might look like if optimised not to return generated values, and benchmarked both. Not returning values was 2.5x faster.

Method            | Mean       | Error    | StdDev
BulkWithSelect    | 2,285.9 us | 47.60 us | 127.88 us
BulkWithoutSelect |   829.4 us | 27.91 us |  78.71 us

I don't think it's too surprising that it's faster to not return the generated values, and that is probably why you're still pursuing other methods for bulk insertion. Perhaps I got the wrong end of the stick re: the first point about the perf benefit being minimal?

Code

It uses a simple table with a composite key, an Id column that's DB-generated, and a DateTime that has a default. I inserted a bunch of data beforehand, as I imagine the amount of existing data affects how quickly it reads back generated values. It only benchmarks the SQL, not also reading the values back, setting them on entities etc., which would add more time.

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using Microsoft.EntityFrameworkCore;
using System;
using System.Diagnostics;
using System.Linq;

{
    var summary = BenchmarkRunner.Run<InsertSpeed>();
}

public class InsertSpeed
{
    private MyContext _context;

    [GlobalSetup]
    public void Setup()
    {
        var random = new Random();

        using (var context = new MyContext())
        {
            context.Database.EnsureDeleted();
            context.Database.EnsureCreated();

            var people = Enumerable.Range(0, 500000)
                .Select(x => new Person { TenantId = random.Next(1, 6) })
                .ToList();

            context.AddRange(people);
            context.SaveChanges();
        }

        _context = new MyContext();
    }

    [Benchmark]
    public void BulkWithSelect()
    {
        _context.Database.ExecuteSqlRaw(
@"SET NOCOUNT ON;
DECLARE @inserted0 TABLE ([Id] int, [TenantId] int, [_Position] [int]);
MERGE [People] USING (
VALUES (3, 0),
(3, 1),
(3, 2),
(3, 3),
(3, 4),
(3, 5),
(3, 6),
(3, 7),
(3, 8),
(3, 9),
(3, 10),
(3, 11),
(3, 12),
(3, 13),
(3, 14),
(3, 15),
(3, 16),
(3, 17),
(3, 18),
(3, 19),
(3, 20),
(3, 21),
(3, 22),
(3, 23),
(3, 24),
(3, 25),
(3, 26),
(3, 27),
(3, 28),
(3, 29),
(3, 30),
(3, 31),
(3, 32),
(3, 33),
(3, 34),
(3, 35),
(3, 36),
(3, 37),
(3, 38),
(3, 39),
(3, 40),
(3, 41)) AS i ([TenantId], _Position) ON 1=0
WHEN NOT MATCHED THEN
INSERT ([TenantId])
VALUES (i.[TenantId])
OUTPUT INSERTED.[Id], INSERTED.[TenantId], i._Position
INTO @inserted0;
      
SELECT [t].[Id], [t].[CreatedDate] FROM [People] t
INNER JOIN @inserted0 i ON ([t].[Id] = [i].[Id]) AND ([t].[TenantId] = [i].[TenantId])
ORDER BY [i].[_Position];");
    }

    [Benchmark]
    public void BulkWithoutSelect()
    {
        _context.Database.ExecuteSqlRaw(
@"SET NOCOUNT ON;
INSERT INTO [People] ([TenantId])
VALUES (3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3);");
    }

    [GlobalCleanup]
    public void Cleanup()
    {
        _context?.Dispose();
    }
}

public class MyContext : DbContext
{
    public DbSet<Person> People { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder.UseSqlServer("Server=.;Database=SaveChanges;Trusted_Connection=True;");

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Person>(e =>
        {
            e.HasKey(x => new { x.TenantId, x.Id });
            e.Property(x => x.Id).ValueGeneratedOnAdd();
            e.Property(x => x.CreatedDate).HasDefaultValueSql("SYSUTCDATETIME()");
        });
    }
}

public class Person
{ 
    public int TenantId { get; set; }
    public int Id { get; set; }
    public DateTime CreatedDate { get; set; }
}

stevendarby avatar Apr 23 '22 19:04 stevendarby

@stevendarby you probably want to take a look at #27372, which was implemented and can be tested in 7.0.0-preview.3. In a nutshell, EF Core 7.0 uses a slightly different technique for bulk inserting which still fetches generated values, but uses the OUTPUT clause without a temporary table variable (OUTPUT, not OUTPUT INTO). I've updated your benchmark code to add the new technique used by EF Core 7.0 (see below); here are the results:

Method                | Mean     | Error     | StdDev
Bulk_with_Output_Into | 6.126 ms | 0.1180 ms | 0.1405 ms
Bulk_with_Output      | 2.328 ms | 0.0460 ms | 0.0829 ms
Bulk_without_Output   | 2.276 ms | 0.0668 ms | 0.1905 ms
Bulk_with_insert      | 2.903 ms | 0.0656 ms | 0.1914 ms

In other words, the slowness in your benchmark comes not from fetching the actual values (sending those two ints plus the position is quite negligible in the grand scheme of things), but rather from the SQL Server temporary table variable being used. Now, I'm not saying further gains couldn't be reached by not fetching, but they currently seem quite minimal for most cases. Of course, if the value fetched is a huge string or binary data the situation would be quite different, but that seems like an edge-case scenario that we haven't seen much. Another reason to remove the fetching is that it would allow using techniques which don't allow fetching at all (in theory even SqlBulkCopy could be used, in certain cases), but that goes in a complicated direction and we don't have concrete plans to do it right now. If we do go in this direction, this issue would be part of that.
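
For reference, the kind of write path that cannot fetch values back looks like this minimal SqlBulkCopy sketch (assumes Microsoft.Data.SqlClient, the People table from the benchmark, and a connectionString variable defined elsewhere):

using System.Data;
using Microsoft.Data.SqlClient;

var table = new DataTable();
table.Columns.Add("TenantId", typeof(int));
for (var i = 0; i < 42; i++)
{
    table.Rows.Add(3); // same 42 rows of TenantId = 3 as the benchmarks above
}

using var bulk = new SqlBulkCopy(connectionString)
{
    DestinationTableName = "People"
};
bulk.ColumnMappings.Add("TenantId", "TenantId"); // Id and CreatedDate remain server-generated
bulk.WriteToServer(table); // streams the rows; nothing is read back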

It would be great if you could run the added benchmarks below and confirm that you see the same results. If you have any other ideas or input, that would of course be welcome!

I don't think it's too surprising that it's faster to not return the generated values [...]

I think this also shows how, in perf, nothing is straightforward and one should avoid assumptions... Contrary to expectations (including my own initial ones!), fetching the generated values isn't a significant factor here.

PS Any specific reason you're seeding the database with 500000 rows in this benchmark?

Updated benchmark code
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using Microsoft.EntityFrameworkCore;
using System;
using System.Linq;

BenchmarkRunner.Run<InsertSpeed>();

public class InsertSpeed
{
    private MyContext _context;

    [GlobalSetup]
    public void Setup()
    {
        var random = new Random();
        
        using (var context = new MyContext())
        {
            context.Database.EnsureDeleted();
            context.Database.EnsureCreated();
        
            // var people = Enumerable.Range(0, 500000)
            //     .Select(x => new Person { TenantId = random.Next(1, 6) })
            //     .ToList();
            //
            // context.AddRange(people);
            // context.SaveChanges();
        }

        _context = new MyContext();
    }

    [Benchmark]
    public void Bulk_with_Output_Into()
    {
        _context.Database.ExecuteSqlRaw(
@"SET NOCOUNT ON;
DECLARE @inserted0 TABLE ([Id] int, [TenantId] int, [_Position] [int]);
MERGE [People] USING (
VALUES (3, 0),
(3, 1),
(3, 2),
(3, 3),
(3, 4),
(3, 5),
(3, 6),
(3, 7),
(3, 8),
(3, 9),
(3, 10),
(3, 11),
(3, 12),
(3, 13),
(3, 14),
(3, 15),
(3, 16),
(3, 17),
(3, 18),
(3, 19),
(3, 20),
(3, 21),
(3, 22),
(3, 23),
(3, 24),
(3, 25),
(3, 26),
(3, 27),
(3, 28),
(3, 29),
(3, 30),
(3, 31),
(3, 32),
(3, 33),
(3, 34),
(3, 35),
(3, 36),
(3, 37),
(3, 38),
(3, 39),
(3, 40),
(3, 41)) AS i ([TenantId], _Position) ON 1=0
WHEN NOT MATCHED THEN
INSERT ([TenantId])
VALUES (i.[TenantId])
OUTPUT INSERTED.[Id], INSERTED.[TenantId], i._Position
INTO @inserted0;
      
SELECT [t].[Id], [t].[CreatedDate] FROM [People] t
INNER JOIN @inserted0 i ON ([t].[Id] = [i].[Id]) AND ([t].[TenantId] = [i].[TenantId])
ORDER BY [i].[_Position];");
    }

    [Benchmark]
    public void Bulk_with_Output()
    {
        _context.Database.ExecuteSqlRaw(
            @"SET NOCOUNT ON;
MERGE [People] USING (
VALUES (3, 0),
(3, 1),
(3, 2),
(3, 3),
(3, 4),
(3, 5),
(3, 6),
(3, 7),
(3, 8),
(3, 9),
(3, 10),
(3, 11),
(3, 12),
(3, 13),
(3, 14),
(3, 15),
(3, 16),
(3, 17),
(3, 18),
(3, 19),
(3, 20),
(3, 21),
(3, 22),
(3, 23),
(3, 24),
(3, 25),
(3, 26),
(3, 27),
(3, 28),
(3, 29),
(3, 30),
(3, 31),
(3, 32),
(3, 33),
(3, 34),
(3, 35),
(3, 36),
(3, 37),
(3, 38),
(3, 39),
(3, 40),
(3, 41)) AS i ([TenantId], _Position) ON 1=0
WHEN NOT MATCHED THEN
INSERT ([TenantId])
VALUES (i.[TenantId])
OUTPUT INSERTED.[Id], INSERTED.[TenantId], i._Position;");
    }

    [Benchmark]
    public void Bulk_without_Output()
    {
        _context.Database.ExecuteSqlRaw(
            @"SET NOCOUNT ON;
MERGE [People] USING (
VALUES (3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3)) AS i ([TenantId]) ON 1=0
WHEN NOT MATCHED THEN
INSERT ([TenantId])
VALUES (i.[TenantId]);");
    }

    [Benchmark]
    public void Bulk_with_insert()
    {
        _context.Database.ExecuteSqlRaw(
@"SET NOCOUNT ON;
INSERT INTO [People] ([TenantId])
VALUES (3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3),
(3);");
    }

    [GlobalCleanup]
    public void Cleanup()
    {
        _context?.Dispose();
    }
}

public class MyContext : DbContext
{
    public DbSet<Person> People { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder.UseSqlServer("Server=localhost;Database=test;User=SA;Password=Abcd5678;Connect Timeout=60;ConnectRetryCount=0;Trust Server Certificate=true");

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Person>(e =>
        {
            e.HasKey(x => new { x.TenantId, x.Id });
            e.Property(x => x.Id).ValueGeneratedOnAdd();
            e.Property(x => x.CreatedDate).HasDefaultValueSql("SYSUTCDATETIME()");
        });
    }
}

public class Person
{ 
    public int TenantId { get; set; }
    public int Id { get; set; }
    public DateTime CreatedDate { get; set; }
}

roji avatar Apr 24 '22 08:04 roji

@roji that looks really promising, and I’ll take a closer look soon for my own interest.

Re: the 500000 rows, the 6.0 SQL returns the generated Id and CreatedDate values by selecting from the People table, so I figured populating it with a healthy amount of data could help demo the impact of traversing the index and reading from the table etc. Hopefully reading from the outputted 'inserted' values in 7.0 means the amount of existing data has very little impact, which is great!

stevendarby avatar Apr 24 '22 12:04 stevendarby

Re: the 500000 rows, the 6.0 SQL returns the generated Id and CreatedDate values by selecting from the People table, so I figured populating it with a healthy amount of data could help demo the impact of traversing the index and reading from the table etc.

@stevendarby I see - though the lookup is by the Id (primary key), which uses an index and therefore shouldn't be affected too much by existing rows (though there indeed should be some impact).

Note that the new 7.0 SQL is the default, but does not work where there are triggers; EF 7.0 now allows you to declare that there's a trigger on a table, in which case we revert to the 6.0 (inefficient) SQL.

roji avatar Apr 24 '22 12:04 roji

@roji FWIW I had another look at this with your updated benchmarks, plus a few tweaks of my own to check a few scenarios, and I'm satisfied that the 7.0 SQL you highlighted already has the performance improvements I previously thought could only come from implementing the changes proposed in this issue. I do see a slight gain in not returning the values at all, but it's nothing like the improvement from using a direct OUTPUT instead of OUTPUT INTO. Thank you. Not using triggers in my DB!

stevendarby avatar Apr 25 '22 09:04 stevendarby

Thanks for confirming @stevendarby! Yeah, we do intend to implement this issue at some point, but priority-wise it seems much less important than it previously did.

roji avatar Apr 25 '22 09:04 roji

+1, this would be a good option to have.

In EF Core 6, I've just come across a situation where being able to opt out of returning a key would be beneficial.

EF Core was generating the SQL below to insert an object into a request-logs table. Under high load, the table was getting locked by the SELECT statement, meaning all requests were timing out and failing.

(@p0 varchar(255),@p1 bigint,@p2 varchar(255),@p3 varchar(2000),@p4 datetime,@p5 int,@p6 int)SET NOCOUNT ON;
INSERT INTO [request_logs] ([action], [application_row_id], [controller], [path], [request_timestamp], [response_time_ms], [status_code])
VALUES (@p0, @p1, @p2, @p3, @p4, @p5, @p6);
SELECT [row_id]
FROM [request_logs]
WHERE @@ROWCOUNT = 1 AND [row_id] = scope_identity();

In my specific case I don't care about EF Core checking the object or selecting the primary key after insert. I just need to dump the data into the table and move on.

I ended up having to write a custom method on my DbContext to insert the record in:

public async Task<int> AddRequestLog(RequestLog log)
{
    return await Database.ExecuteSqlInterpolatedAsync(@$"insert into request_logs (controller, action, path, status_code, response_time_ms, application_row_id, request_timestamp)
values({log.Controller}, {log.Action}, {log.Path}, {log.StatusCode}, {log.ResponseTimeMilliseconds}, {log.ConfigurationRowId}, {log.RequestTimestamp} )");
}

But I would prefer the option to do something like the below, which would prevent the round trip and generate the SQL accordingly.

DbContext.RequestLogs.Add(log, ignoreConcurrency = true);
DbContext.SaveChanges();

Generated SQL:
(@p0 varchar(255),@p1 bigint,@p2 varchar(255),@p3 varchar(2000),@p4 datetime,@p5 int,@p6 int)SET NOCOUNT ON;
INSERT INTO [request_logs] ([action], [application_row_id], [controller], [path], [request_timestamp], [response_time_ms], [status_code])
VALUES (@p0, @p1, @p2, @p3, @p4, @p5, @p6);

I understand the SELECT has been replaced with an OUTPUT clause in EF Core 7, but the option to opt out entirely would still be good to have.

DaleMckeown avatar Nov 16 '22 13:11 DaleMckeown

Try AsNoTracking:

https://msdn.microsoft.com/en-us/library/gg679352(v=vs.103).aspx

EDIT: This was probably not a solution after all. It was a problem on the migration side, or an invalid way of creating the db (which was on my side, not EF Core's). I moved to DbUp to apply migrations, and to creating migrations via the dotnet CLI plus a migration script, and that solved most of the problems.

It really helped me a lot, I don't know how, but it did; thank you @Giancarlos

Using EF Core v3.1.32

sunnamed434 avatar Apr 06 '23 12:04 sunnamed434

Real-world case where users only have INSERT permissions: How to prevent primary key reading back after DbContext.Add().

An OUTPUT clause requires SELECT permissions. So this is another case where not reading any values back after insert is desired.

GertArnold avatar Aug 07 '24 09:08 GertArnold

Real-world case where users only have INSERT permissions: How to prevent primary key reading back after DbContext.Add().

An OUTPUT clause requires SELECT permissions. So this is another case where not reading any values back after insert is desired.

Dealing with this at the moment. Is there a solution that does not involve running raw SQL?

idris-yakub avatar Oct 23 '24 10:10 idris-yakub