efcore icon indicating copy to clipboard operation
efcore copied to clipboard

Union in collection subquery cannot be translated

Open InspiringCode opened this issue 1 year ago • 5 comments

When I run the following query in EF Core 8:

db.Users.Select(x => new AuthorizedUser
{
    Id = x.Id,
    DefaultLanguageCode = x.DefaultLanguageCode,
    AllPermissions = x
        .AssignedPermissions.Select(x => x.BusinessKey)
        .Union(x.Roles.SelectMany(x => x.Permissions, (_, p) => p.BusinessKey))
        .ToHashSet()
}).ToArray();

I get the error:

Unable to translate a collection subquery in a projection since either parent or the subquery doesn't project necessary information required to uniquely identify it and correctly generate results on the client side. This can happen when trying to correlate on keyless entity type. This can also happen for some cases of projection before 'Distinct' or some shapes of grouping key in case of 'GroupBy'. These should either contain all key properties of the entity that the operation is applied on, or only contain simple property access expressions.

User.AssignedPermissions and User.Roles are mapped as many-to-many relations. I tried several variants (including with Join, Distinct, explicit Select s etc), but all fail to translate. The Problem seems to be the Union operation (both sides of the union without the union seem to work fine).

Why is this?

InspiringCode avatar Oct 08 '24 11:10 InspiringCode

note to self: related to https://github.com/dotnet/efcore/issues/33410

maumar avatar Oct 09 '24 06:10 maumar

standalone repro:

    [ConditionalFact]
    public async Task Repro34849()
    {
        using (var ctx = new MyContext())
        {
            await ctx.Database.EnsureDeletedAsync();
            await ctx.Database.EnsureCreatedAsync();
        }

        using (var ctx = new MyContext())
        {
            var query = await ctx.Users.Select(x => new AuthorizedUser
            {
                Id = x.Id,
                DefaultLanguageCode = x.DefaultLanguageCode,
                AllPermissions = x.AssignedPermissions
                    .Select(x => x.BusinessKey)
                    .Union(x.Roles.SelectMany(x => x.Permissions, (_, p) => p.BusinessKey))
                    .ToHashSet()
            }).ToListAsync();
        }
    }

    public class MyContext : DbContext
    {
        public DbSet<User> Users { get; set; }

        protected override void OnModelCreating(ModelBuilder modelBuilder)
        {
        }

        protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        {
            optionsBuilder.UseSqlServer(@"Server=(localdb)\mssqllocaldb;Database=Repro34849;Trusted_Connection=True;MultipleActiveResultSets=true");
        }
    }

    public class User
    {
        public int Id { get; set; }
        public string DefaultLanguageCode { get; set; }
        public List<Permission> AssignedPermissions { get; set; }
        public List<Role> Roles { get; set; }
    }


    public class Permission
    {
        public int Id { get; set; }
        public int BusinessKey { get; set; }
    }

    public class Role
    {
        public int Id { get; set; }
        public List<Permission> Permissions { get; set; }

    }

    public class AuthorizedUser
    {
        public int Id { get; set; }
        public string DefaultLanguageCode { get; set; }
        public HashSet<int> AllPermissions { get; set; }
    }

maumar avatar Oct 10 '24 01:10 maumar

problem is how we handle Union (SelectExpression -> ApplySetOperation). We try to union two queries:

SELECT p5.BusinessKey
FROM Permission AS p5
WHERE (u.Id != NULL) && (u.Id == p5.UserId)
SELECT p6.BusinessKey
FROM Role AS r2
INNER JOIN Permission AS p6 ON r2.Id == p6.RoleId
WHERE (u.Id != NULL) && (u.Id == r2.UserId)

those two queries on it's own have different identifiers (things we use to bucket the results accordingly, which are usually all the keys of all the entities). First one uses p5.Id, second one uses r.Id and p6.Id. Normally for set operations, identifiers on both sides must match (as we going to be combining them). Since they don't, we give up and mark the expression as not identifiable.

However, Union is a bit different, it removes the duplicates, so we can use the entire projection as identifier if need be. (we would still proper identifiers if they are available to avoid unnecessary projections, but we can fall back to the projection for case like this where proper identifiers don't work).

maumar avatar Oct 10 '24 01:10 maumar

@InspiringCode a hack/workaround you can use is to make both sides of the union having a matching shapes (in terms of number of entities that they use). You can do that by adding a superfluous left join to the first part. It doesn't matter what entity you join with (but it needs to be an entity, and needs to have the same number of primary key properties as Role) and by doing left join that never matches it's condition you don't add any data (which would get wiped out by DISTINCT anyway). It's quite hacky, but should get the job done.

            var query2 = await ctx.Users.Select(x => new AuthorizedUser
            {
                Id = x.Id,
                DefaultLanguageCode = x.DefaultLanguageCode,
                AllPermissions = (from xx in x.AssignedPermissions
                                 join dummy in ctx.Set<Foo>() on true equals false into grouping
                                 from dummy in grouping.DefaultIfEmpty()
                                 select xx).Select(x => x.BusinessKey)
                    .Union(x.Roles.SelectMany(x => x.Permissions, (_, p) => p.BusinessKey))
                    .ToHashSet()
            }).ToListAsync();

maumar avatar Oct 10 '24 01:10 maumar

interesting cases to test out:

  • scalar
  • multiple scalars,
  • non scalar (some function) - identifiers are columns and in places like Distinct we give up if projection contains a non-column. However in case of set ops, the result always ends up being a column, even if underlying expression is something else
  • entity - we should use it's PK for identifiers
  • complex type - use all properties, but this can only work for UNION (i.e. distinct)
  • keyless entity (?) - same as above
  • entity with a complex type - we still just need a PK (no need to add all complex property elements to the identifiers)
  • entity with composite key (just to make sure we look at all the PK columns)
  • complex projections (e.g. entity + a complex type) - those should be processed separately, i.e. use PK for the entity and all complex type properties, again this only works for UNION,
  • projection of scalars which contains the PK properties (we can use them)
  • same as above but for entity with composite key, only contains part of the key (shouldn't work, unless UNION)

maumar avatar Oct 10 '24 22:10 maumar