EntityFramework-Extensions icon indicating copy to clipboard operation
EntityFramework-Extensions copied to clipboard

Feat: Conditional insert

Open tsanton opened this issue 1 year ago • 23 comments

One of the, in my option, larger limitations with EF at the moment is the lack for conditional singleton inserts.

My current case is as follows: I'm allowing users to manipulate a history table, but with certain limitations. For instance I will allow them to create a new statuses, but that status can't be backdated with "valid_from" <= min(valid_from) where entity was created (status == 'created').

As of now I have to look up the entity (or run an .Any() with a predicate), and then insert if it passes the predicate, whereas I'd much rather just fire off IQueryable.Where(prediates).ConditionalInsertAsync(Entity) and return the count from the output to see if one went in or if 0 inserted (and then return conditional responses based on the feedback).

In terms of design (at least for Postgres) I'm thinking something along these lines:

CREATE TEMP TABLE proof_of_concept (
    id UUID PRIMARY KEY,
    name TEXT,
    age INTEGER,
    created TIMESTAMP WITHOUT TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

--Empty table
select * from proof_of_concept;

--Initial insert
INSERT INTO proof_of_concept
select gen_random_uuid() id, 'Al Coholic' name, 10 age, timezone('utc', now()) created;

--Al made his way into the model
select * from proof_of_concept;

--Do not insert because the new age (9) is lesser than his his current max age.
with data as(
    select 'Al Coholic' name, 9 age
)

INSERT INTO proof_of_concept
select gen_random_uuid() id, data.name, data.age, timezone('utc', now()) created from data
where data.age > (select max(poc.age) from proof_of_concept poc where name = data.name)
RETURNING *;

--As we can see, both from RETURNING and select: no insert
select * from proof_of_concept;

--Do insert because the new age (11) is greater than his his current max age (10)
with data as(
    select 'Al Coholic' name, 11 age
)

INSERT INTO proof_of_concept
select gen_random_uuid() id, data.name, data.age, timezone('utc', now()) created from data
where data.age > (select max(poc.age) from proof_of_concept poc where name = data.name)
RETURNING *;

--And here we have it: conditional insert
select * from proof_of_concept;

I'm posting the suggestion here firstly because I think a lot of the required pipework for this extension already exist within the existing code base. Further I think it's a killer extension that I'm somewhat perplex that I can't find an implementation for -> it surely would save a lot of time and boilerplate code.

I can also say that though it's on the EF core radar (here) I would not put money on it making the EF core 9 cut. Nor is it completely clear to me if the design supports the conditional bit.

Hoping to hear back from you and I'd be happy to help with other SQL-provider syntax research or whatever you feel you might need in order to get this into either extension or plus!

/T

tsanton avatar Apr 29 '24 14:04 tsanton

Hello @tsanton ,

Here are a few questions to better understand your requirement

  1. Are you using PostgreSQL?
  2. Are you looking to provide a list of entities to insert or looking to insert from existing entities (already in the database)? Such as what InsertFromQuery does?
  3. Is an option like InsertStagingTableFilterFormula that will allow you to write your own SQL (for the filtering part in the WHERE statement) and filter data before they get inserted enough for you?
    • NOTE: This option works for SQL Server but not yet for PostgreSQL

Let me know more about your scenario. Depending on your answer, we might already be very close to supporting it.

Best Regards,

Jon

JonathanMagnan avatar Apr 29 '24 14:04 JonathanMagnan

Hi @JonathanMagnan!

I'm on Postgres, yes. I would like to utilise this in place of EF.AddAsync (so not from query, but single parameterised insert from code). I am agnostic to how this is solved on the backend, with a CTE/temp-table with insert from query, or parameterized select :)

I'm already running this pattern for updates and deletes. Here is how I implement a conditional delete of "SomeEntity" based on the ValidFrom queryable predicate

var query = from ua in context.SomeRandomEntity
    let minValidFrom = (
        from x in context.SomeRandomEntity
        where x.TenantId == tenantId && x.SomeId == pred.SomeId
        select x.ValidFrom
    ).Min()
    where ua.TenantId == tenantId && ua.SomeId == pred.SomeId && ua.Id == pred.Id && ua.ValidFrom > minValidFrom
    select ua;

var strategy = context.Database.CreateExecutionStrategy();
return await strategy.ExecuteAsync(async () =>
{
    await using var transaction = await context.Database.BeginTransactionAsync(ct);
    var deleted = await query.ExecuteDeleteAsync(ct);
    if (deleted != 1)
    {
        await transaction.RollbackAsync(ct);
        return false;
    }
    await transaction.CommitAsync(ct);
    return true;
});

tsanton avatar Apr 29 '24 15:04 tsanton

Thank you for the additional information.

We will work on it.

Best Regards,

Jon

JonathanMagnan avatar Apr 29 '24 15:04 JonathanMagnan

Hi again @JonathanMagnan, hope your enjoyed your vacation!

Just wondering if there are any news on this subject and if/when one hopefully can expect to see it live? :)

/T

tsanton avatar May 27 '24 07:05 tsanton

Hello @tsanton ,

My vacation was great; it was the best one I've had so far!

My developer provided me with a fix; the code is currently under code review.

If the code is accepted, the fix will be deployed on June 11 or June 18.

Best Regards,

Jon

JonathanMagnan avatar May 28 '24 15:05 JonathanMagnan

@JonathanMagnan that is fantastic news (both that your vacation was a blast and that the feature is rolling through)!

I'll order the champaign and schedule in a tenative refactoring session on my side :)

Keep up the good work and godspeed on your upcoming June deployment; can't wait!

/T

tsanton avatar May 28 '24 15:05 tsanton

Hello @tsanton ,

I just want to confirm that the code has been merged.

The only problem with this option at this moment is we haven't succeeded in making it compatible with the option InsertIfNotExists (For PostgreSQL), but since you don't use it, that will not be a problem on your side.

We are still targeting June 18 for our next release 🍾 🥂

Best Regards,

Jon

JonathanMagnan avatar Jun 12 '24 16:06 JonathanMagnan

Hello @tsanton ,

The v8.103.0.0 has been finally released.

In this version, we added the support to InsertStagingTableFilterFormula.

Here is an example:

context.BulkInsert(list, option => { 
	option.InsertStagingTableFilterFormula = "\"ColumnInt\" > 10"; 
});

Let me know if that option works correctly for you or if you need help to implement it.

Best Regards,

Jon

JonathanMagnan avatar Jun 18 '24 23:06 JonathanMagnan

Hi @JonathanMagnan and sorry for the late reply -> have been busy elsewhere in my backlog so have not had the time to look into it before now.

So I do fear we might have talked past each other as my need is for single entity inserts. Here is the pattern that I'm trying to do away with:

/// <summary>
/// You can only add a new status if the new ValidFrom is greater than the latest ValidFrom
/// </summary>
public override async Task<int> AddAsync(Guid tenantId, LeaseStatusEntity entity, CancellationToken ct = new())
  {
      var context = await _factory.GetDbContext(tenantId);
      
      var maxDate = await context.Status.AsNoTracking()
          .Where(x => x.TenantId == tenantId && x.LeaseId == entity.LeaseId)
          .MaxAsync(x => (DateTime?)x.ValidFrom, ct);

      if (maxDate.HasValue && entity.ValidFrom < maxDate) return -1;
      
      await context.Status.AddAsync(entity, ct);
      return await context.SaveChangesAsync(ct);
  }

Which for me translates into one of the following SQL queries:

insert into XXXStatus (tenant_id, lease_id, status_id, status, valid_from.......)
select 
    '<UUID1>', '<UUID2>', '<UUID3>, 'ACTIVE', '2000-01-01', ......
where not exists (
    select 1 from XXXStatus where tenant_id = '<UUID1>' LeaseId = '<UUID2>' and valid_from > '2000-01-01'
);

or it can take the form of

insert into XXXStatus (tenant_id, lease_id, status_id, status, valid_from.......)
select 
    '<UUID1>', '<UUID2>', '<UUID3>, 'ACTIVE', '2000-01-01', ......
where valid_from > (select min(valid_from) from XXXStatus where tenant_id = '<UUID1>' LeaseId = '<UUID2>' and valid_from > '2000-01-01');

Not complete sure how I can achieve this with the InsertStagingTableFilterFormula 🤔

tsanton avatar Jun 23 '24 10:06 tsanton

Hello @tsanton ,

Thank you for providing the code.

Here is an example using a similar entity as you:

context.BulkInsert(list, option =>
{
	option.InsertStagingTableFilterFormula = $"\"{nameof(EntitySimple.ValidFrom)}\" > (SELECT MAX(\"{nameof(EntitySimple.ValidFrom)}\") FROM \"EntitySimples\" AS X WHERE X.\"{nameof(EntitySimple.TenantID)}\" = StagingTable.\"{nameof(EntitySimple.TenantID)}\" AND X.\"{nameof(EntitySimple.LeaseID)}\" = StagingTable.\"{nameof(EntitySimple.LeaseID)}\")";
});

In this example, I was able to create an SQL statement similar to the one you provided. I just changed the min for max as I believe this is the behavior you want.

Looking at your requirement, I will recommend you instead to use a combo of some other options:

context.BulkInsert(list, option => {
	option.InsertIfNotExists = true;
	option.ColumnPrimaryKeyExpression = x => new { x.TenantID, x.LeaseID };
	option.InsertPrimaryKeyAndFormula = $"DestinationTable.\"{nameof(EntitySimple.ValidFrom)}\" > StagingTable.\"{nameof(EntitySimple.ValidFrom)}\"";
});

This code will probably be easier to read and maintain. The entity will only be inserted if a row with a more recent date (for the same TenantID and LeaseID) doesn't already exist.

Let me know if that makes sense.

Best Regards,

Jon

JonathanMagnan avatar Jun 24 '24 14:06 JonathanMagnan

Hello @tsanton,

Since our last conversation, we haven't heard from you.

Let me know if you need further assistance.

Best regards,

Jon

JonathanMagnan avatar Jun 27 '24 12:06 JonathanMagnan

Hi @JonathanMagnan and sorry for the late reply: took me a few days to get around to one of these issues.

So I currently have this code working like a charm. I must say it's not the prettiest I've ever written, but this insert patterns sure has one major advantage: no foreign key constraint issues when trying to exist a non-existing entity due to my model setup and the valid_from > null predicate (which is very nice)

var context = await _factory.GetDbContext(tenantId);
        
var resultInfo = new Z.BulkOperations.ResultInfo();
        
await context.BulkInsertAsync([entity], options =>
{
    options.UseRowsAffected = true;
    options.ResultInfo = resultInfo;
    options.InsertStagingTableFilterFormula = $"valid_from > (SELECT MAX(valid_from) FROM {XXXConfig.SchemaName}.{YYYEntityConfiguration.TableName} AS X WHERE X.tenant_id = StagingTable.tenant_id AND X.ZZZ_id = StagingTable.ZZZ_id)";
}, ct);

return resultInfo.RowsAffectedInserted;

Though I am happy with the performance of it, as you see it's not refactoring proof. As you may gather EF is configured to use snake_case naming. That makes the NameOf(entity.TenantId) option not viable -> do you know of any good way to reference the snake_cased variable name from the EFcontext (or some other equivalent way of achieving this)?

In terms of functionality: superb. Ease of use: not so much.

May I ask if it's possible to request extension methods on an IQueryable such as IQueryable.InsertWhereExists(entity) and IQueryable.InsertWhereNotExists(entity) which creates an insert into and appends a where <not> exists( <select from IQueryable>). This would be a pure single instance insert which covers most (all?) of my transactional use cases and would enhance ease of use in comparison to the solution above.

Don't get me wrong: I'm a happy camper and keeping the code as is, but I would like my juniors to 1) not to make fun of me and 2) be able to repeat this pattern without intimate SQL understanding.

As always: many thanks and great delivery speed!

/T

tsanton avatar Jul 02 '24 20:07 tsanton

Hello @tsanton ,

I'm happy to hear that you succeeded in making it work.

Regarding your point about being "easy to use," for a basic scenario, we have our InsertIfNotExists option, as I have shown in my second example. However, your case is not considered basic because you must also look at the maximum date.

Thank you for your suggestion about the IQueryable.InsertWhereNotExists(entity). It makes indeed scenario that is more complex like your very easy to use, I now understand your point about this part. That could be actually something possible. We will look at it probably at the beginning of next month to see if that is something we could add in the future.

do you know of any good way to reference the snake_cased variable name from the EFcontext

We don't have any public method that currently allows this, but we will look at it as well as we store this information within our library.

It's possible to do it by getting the entity type from the model and then finding the right property, and finally call after the GetColumnName method. However, it looks a little bit complicated for what you need to do.

Best Regards,

Jon

JonathanMagnan avatar Jul 03 '24 07:07 JonathanMagnan

Hi @JonathanMagnan, did you guys ever look into IQueryable.InsertWhereNotExists(entity) & IQueryable.InsertWhereExists(entity)? Would love to have this with a T? or int return where T? is your entity (maybe enriched with server side generated ids if people do that) or a "inserted" int for "how many records inserted".

Looking forward to hearing back from you!

tsanton avatar Aug 25 '24 09:08 tsanton

Hello @tsanton ,

In the past two weeks, we made a lot of progress on this good idea you provided. It's now one of our few priorities. So far, we really like the current direction and the potential it will provide to our library if we succeed in supporting it correctly.

We already have a working version, but some cases are currently limited, so we are working on it.

I will be able to provide a better update at the end of September (or perhaps before). As said, it has now become one of our priorities, so a lot of time will be put into this feature to make it happen, but I cannot promise anything yet.

Best Regards,

Jon

JonathanMagnan avatar Aug 25 '24 17:08 JonathanMagnan

That's great news! It will add something that's very much missing from EF and that (to my knowledge) nobody else are providing so I think it's a smart choice! Looking forward to hearing more about this -> I'll drop by this issue come end of September to ping for updates if I don't hear anything before that :)

Good luck!

/T

tsanton avatar Aug 25 '24 18:08 tsanton

Howdy @JonathanMagnan! A bit early for end of September, but I though I'd drop by and hear how it's going with the feature? :) Any significant progress/blockers, and if not: maybe an ETA?

Speak soon!

/T

tsanton avatar Sep 24 '24 09:09 tsanton

Hello @tsanton ,

This is pretty much the same status as last time. I was traveling for the first three weeks of September, so it was too hard to focus on a more complex JIRA like this one.

I will soon start to focus on the code my developer made. Normally, this process goes really fast as I take only one priority at a time. We are currently completing one of our priorities this week, and after this, we will re-evaluate to choose if this request become our first priority.

Best Regards,

Jon

JonathanMagnan avatar Sep 24 '24 12:09 JonathanMagnan