efcore icon indicating copy to clipboard operation
efcore copied to clipboard

Set operations over non-entity projections with different facets

Open yepeekai opened this issue 4 years ago • 26 comments

When doing a Union for the same result class but coming from different entity sources or with default values, it fails because of: InvalidOperationException: Set operations over different store types are currently unsupported even when types are identical except for maxLength attribute...

if (innerColumn1.TypeMapping.StoreType != innerColumn2.TypeMapping.StoreType) will fail for nvarchar(100) on one side and nvarchar(max) on the other side...

If I have a nullable string field with maxLength of 100. I can't do a union to something that has different maxlength or default value since default value will be interpreted as nvarchar(max)

I understand that this check is there to prevent conversion from one type to another that may produce unexpected results but I don't think it is the case here since both ends are nvarchar. There is an issue(16298) that tries to solve this in a smart way, but I think it will take some time before it gets released...

I think in the meantime (a a patch) that EF should allow different nvarchar(x) to match where x can vary.

Here is what I am trying to do:

mydbsetA.Select(a => new MyDto { Id = a.Id, Name = a.Name }) //maxLength(100)=>nvarchar(100)
.Union(mydbsetB.Select(b => new MyDto {Id = b.Id, Name = null })) //default to nvarchar(max)
.Union(mydbsetC.Select(c => new MyDto {Id = c.Id, Name = c.Name })) //maxLength(25)=>nvarchar(25)

Further technical details

EF Core version: 3.1 preview 3 Database provider: Microsoft.EntityFrameworkCore.SqlServer Target framework: .NET Core 3.0 Operating system: Windows 7 IDE: Visual Studio 2019 16.3.10

yepeekai avatar Dec 02 '19 15:12 yepeekai

#16298 is about set operations over different entity types, whereas you are trying to perform a set operation over non-entity projections; the two are different problems, so we can use this issue to track this.

The tricky part of this is to understand database behavior around the differing facets. For example, in SQL Server a set operation over nvarchar(50) and nvarchar(100) seems to yield a type of nvarchar(200) (odd...), we'd have to investigate thoroughly and across databases to see exactly what should be done. Based on the results of that investigation, supporting this could be easy or hard, and therefore may or may not be appropriate for a 3.1 patch - but it's unlikely we'll get around to this anytime soon.

SELECT SQL_VARIANT_PROPERTY((SELECT CAST('foo' AS nvarchar(50)) UNION SELECT CAST('foo' AS nvarchar(100))), 'MaxLength'); -- Result is 200

Note that specifying store types explicitly (#4978) could help here, as you could manually up-cast one side to match the other, will add a note there.

roji avatar Dec 02 '19 16:12 roji

Of course, you can use client evaluation to temporarily work around this (by placing AsEnumerable appropriately).

roji avatar Dec 02 '19 16:12 roji

Thanks for all the explanations. Not sure why you closed it since it is not related to what I suggested and you said

we can use this issue to track this"

Unfortunately, client evaluation isn't something I want to do in this case because it is a paged source (I only retreive 25 records out of thousands). 4978 is a promising workaround but it is in the backlog :(

yepeekai avatar Dec 02 '19 17:12 yepeekai

One additional note: I don't know if it makes sense, but I think EF should use the type of the destination instead of the source in this kind of situation. I am selecting this to a Dto without any kind of length restriction.

yepeekai avatar Dec 02 '19 17:12 yepeekai

Sorry for closing, that was a mistake.

Depending on what exactly you're doing, you may be able to get away with adding limits (Take) on both sides of the set operation, and then pulling that and client-evaluating it - but that won't work if you have to apply the limit on the result of the set operation.

I don't know if it makes sense, but I think EF should use the type of the destination instead of the source in this kind of situation. I am selecting this to a Dto without any kind of length restriction.

I understand the logic, but things don't quite work like in the pipeline. The database has a specific, well-defined behavior when typing resulting columns of set operations, and it doesn't depend on what you're going to do with it; we need to understand and match that in EF Core. Also, it wouldn't be good for this to work only if you happen to select this to a DTO without a length restriction - a solution here would make it work in the general case.

roji avatar Dec 02 '19 17:12 roji

Related #15586

smitpatel avatar Dec 02 '19 17:12 smitpatel

@yepeekai I'm facing this issue too. You can use Convert.ToString(Name) to fool the logic and avoid that unnecessary exception ^^

ViRuSTriNiTy avatar Apr 08 '20 13:04 ViRuSTriNiTy

@ViRuSTriNiTy thanks, Convert.ToString worked in my case as well.

krajek avatar Jun 16 '20 08:06 krajek

@ViRuSTriNiTy you are my hero, this is awesome man, worked flawlessly to me too

AlbertoMonteiro avatar Aug 11 '20 02:08 AlbertoMonteiro

Thanks, @ViRuSTriNiTy !!

igorfarias30 avatar Aug 11 '20 02:08 igorfarias30

But have you seen the monster it creates on execution plan? image

rruntex avatar Oct 22 '20 07:10 rruntex

Thanks i had similar problem and though stack overflow i found link and finally able to get done trough any how with the suggesting above by convert.tostring thanks guys

bhupenparikh avatar Mar 10 '21 01:03 bhupenparikh

Implementation note: see specifically the case of a set operation over a column and a constant - the constant mapping should maybe be inferred from the column (see #24707).

roji avatar Apr 21 '21 15:04 roji

The same problem exists for Nullable DateTime fields. When you assign null or a field My work around was this.

NullableDate = f.Date

Doesn't work NullableDate = null // this is not equal for ef core

Workaround: NullableDate = f .Id != null ? null : f.Date, // pseudo logic to have the same data types for ef core evaluates to null

inforithmics avatar Dec 16 '21 09:12 inforithmics

For enum my workaround is:

.Select(guestPass => new PassTileInfo
    {
...
        GuestPassStatus = (PassStatus)(int)guestPass.Status,
...
    });

Barbanyaga avatar Dec 23 '21 12:12 Barbanyaga

I'm pretty sure the Convert.ToString works because it forces client evaluation, so when it performs the union, it's already evaluated both queries.

endlessxaura avatar Jan 25 '22 14:01 endlessxaura

Can you make an example of how and where to use the Convert.ToString() method?

eluchsinger avatar Jan 25 '22 19:01 eluchsinger

Re: 'The tricky part of this is to understand database behavior around the differing facets.' Why did you not just leave/reuse/migrate the logic from Entity Framework since v5.0 performed this query just fine? Please stop removing functionality that worked in previous versions. Fortunately I have only a few instances of this query to fix in EF core but I had hoped we delayed migrating to EF core long enough that it would be mature by now.

DreamU1952 avatar Mar 09 '22 18:03 DreamU1952

@roji

For example, in SQL Server a set operation over nvarchar(50) and nvarchar(100) seems to yield a type of nvarchar(200) (odd...)

SQL Server regularly reports the maximum byte-length for nvarchar, which is twice what you give as max-length.

select SQL_VARIANT_PROPERTY(CAST('foo' AS nvarchar(50)), 'MaxLength'); --yields 100
select SQL_VARIANT_PROPERTY(CAST('foo' AS nvarchar(100)), 'MaxLength'); --yields 200

-> https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=bf89ce29d587928cd242e98e7a51898f

So logic here is quite, what you would expect, but metadata for nvarchar is always *2.

As @DreamU1952 mentions this is a PITA. Providing an option (EF.Function?) to take a certain type to CAST/CONVERT to should already enable the user to solve this (albeit in a quite involved way).

TheConstructor avatar Jun 22 '22 10:06 TheConstructor

@TheConstructor sorry for not answering your comment earlier, that makes sense.

It may be safe to allow the same type with differing facets on both sides, and assume that the larger value wins; so allow nvarchar(100) and nvarchar(max), with the result being nvarchar(max). It may also be safe to do this for non-numeric facets (e.g. non-unicode + unicode = unicode, fixed length + non-fixed length = non-fixed length...). Of course, we'd need to test various scenarios (nvarchar, decimal, datetime) across databases to confirm this assumption.

That would still leave the case of different types on both sides, e.g. xml and nvarchar; these may either be implicitly convertible, or require an explicit cast. To know this we'd need to do https://github.com/dotnet/efcore/issues/15586, and I'm not sure it's a good idea for us to add an explicit cast ourselves, as opposed to the user specifying it (which would require EF.StoreType).

But handling the same-type-different-facets problem seems like it would take care of the majority case.

roji avatar Sep 13 '22 06:09 roji

Also see #14179 which is about same type but different facets (albeit in slightly different form)

smitpatel avatar Sep 13 '22 17:09 smitpatel

One more thought... Specifically for the case where there's a constant/parameter on one side of the set operation (https://github.com/dotnet/efcore/issues/19129#issuecomment-824148525), we should probably be inferring the type mapping from the other side like we do e.g. with binary operations. This doesn't require any type compatibility charts, and would probably cover quite a few user scenarios.

Opened #29081 to track this.

roji avatar Sep 14 '22 06:09 roji

I'm affected by this issue and have up-voted. I really think fixing this is key for operations like Union and Concat to be useful and powerful in EF Core. Thanks for the workarounds given.

One related question: in a DB-first scenario, what functional impact does specifying the max length on string properties have? I'm just wondering if I can fix most issues stemming from this in my solution by removing max length from the entity configurations, so that EF thinks all strings are nvarchar(max), even if not actually that in the DB.

stevendarby avatar Sep 20 '22 18:09 stevendarby

@stevendarby max length can be important for validation, i.e. making sure values don't exceed a certain length (but note that in SQL Server the length is in bytes, which may make it less useful for that). More importantly, SQL Server also doesn't allow indexes over nvarchar(max), which is a good reason to specify a facet. But at least in the PostgreSQL world, the facets indeed aren't that useful I'd say.

roji avatar Sep 20 '22 19:09 roji

@roji unless I misunderstood, those are reasons to specify lengths in SQL Server. Being DB-first and managing schema manually, I will still do that. But seems that not telling EF about the lengths means it won’t have this problem with Set operations.

stevendarby avatar Sep 20 '22 22:09 stevendarby

Oh I see, I didn't understand your question. Yeah, if you're not using EF to create your database, a reasonable workaround may be to let EF think that the columns are simply nvarchar(max).

roji avatar Sep 21 '22 08:09 roji

Is this caused by the same issue?

System.InvalidOperationException
  HResult=0x80131509
  Message=Unable to translate set operation when matching columns on both sides have different store types.
  Source=Microsoft.EntityFrameworkCore.Relational
  StackTrace:
   at Microsoft.EntityFrameworkCore.Query.SqlExpressions.SelectExpression.ApplySetOperation(SetOperationType setOperationType, SelectExpression select2, Boolean distinct)
   at Microsoft.EntityFrameworkCore.Query.SqlExpressions.SelectExpression.ApplyUnion(SelectExpression source2, Boolean distinct)
   at Microsoft.EntityFrameworkCore.Query.RelationalQueryableMethodTranslatingExpressionVisitor.TranslateUnion(ShapedQueryExpression source1, ShapedQueryExpression source2)
   at Microsoft.EntityFrameworkCore.Query.QueryableMethodTranslatingExpressionVisitor.VisitMethodCall(MethodCallExpression methodCallExpression)
   at Microsoft.EntityFrameworkCore.Query.RelationalQueryableMethodTranslatingExpressionVisitor.VisitMethodCall(MethodCallExpression methodCallExpression)
   at Microsoft.EntityFrameworkCore.Query.QueryableMethodTranslatingExpressionVisitor.VisitMethodCall(MethodCallExpression methodCallExpression)
   at Microsoft.EntityFrameworkCore.Query.RelationalQueryableMethodTranslatingExpressionVisitor.VisitMethodCall(MethodCallExpression methodCallExpression)
   at Microsoft.EntityFrameworkCore.Query.QueryableMethodTranslatingExpressionVisitor.VisitMethodCall(MethodCallExpression methodCallExpression)
   at Microsoft.EntityFrameworkCore.Query.RelationalQueryableMethodTranslatingExpressionVisitor.VisitMethodCall(MethodCallExpression methodCallExpression)
   at Microsoft.EntityFrameworkCore.Query.QueryableMethodTranslatingExpressionVisitor.VisitMethodCall(MethodCallExpression methodCallExpression)
   at Microsoft.EntityFrameworkCore.Query.RelationalQueryableMethodTranslatingExpressionVisitor.VisitMethodCall(MethodCallExpression methodCallExpression)
   at Microsoft.EntityFrameworkCore.Query.QueryCompilationContext.CreateQueryExecutor[TResult](Expression query)
   at Microsoft.EntityFrameworkCore.Storage.Database.CompileQuery[TResult](Expression query, Boolean async)
   at Microsoft.EntityFrameworkCore.Query.Internal.QueryCompiler.CompileQueryCore[TResult](IDatabase database, Expression query, IModel model, Boolean async)
   at Microsoft.EntityFrameworkCore.Query.Internal.QueryCompiler.<>c__DisplayClass12_0`1.<ExecuteAsync>b__0()
   at Microsoft.EntityFrameworkCore.Query.Internal.CompiledQueryCache.GetOrAddQuery[TResult](Object cacheKey, Func`1 compiler)
   at Microsoft.EntityFrameworkCore.Query.Internal.QueryCompiler.ExecuteAsync[TResult](Expression query, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Query.Internal.EntityQueryProvider.ExecuteAsync[TResult](Expression expression, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.EntityFrameworkQueryableExtensions.ExecuteAsync[TSource,TResult](MethodInfo operatorMethodInfo, IQueryable`1 source, Expression expression, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.EntityFrameworkQueryableExtensions.ExecuteAsync[TSource,TResult](MethodInfo operatorMethodInfo, IQueryable`1 source, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.EntityFrameworkQueryableExtensions.CountAsync[TSource](IQueryable`1 source, CancellationToken cancellationToken)
   at RailtraxCore.Pager.<ApplyFilterAsync>d__23`1.MoveNext() in D:\Users\jwood\source\repos\Railtrax\RailtraxCore\Pager.cs:line 117

Interestingly, I was able to get my issue work by running some string values through Convert.ToString(). I wish someone could explain what Convert.ToString() is doing that makes the difference. See my question on Stackoverflow.

SoftCircuits avatar Mar 29 '23 21:03 SoftCircuits

Is this caused by the same issue?

Looks like it.

I wish someone could explain what Convert.ToString() is doing that makes the difference.

Because EF Core loses track of the mapping, and so doesn't know that the two columns have different facets.

ajcvickers avatar Mar 30 '23 13:03 ajcvickers