efcore icon indicating copy to clipboard operation
efcore copied to clipboard

JOIN instead of CROSS APPLY with SelectMany-nested Select

Open amyboose opened this issue 9 months ago • 1 comments

What problem are you trying to solve?

My code example:

using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Logging;
using Npgsql;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

namespace EfInvalidSql;
public class Program
{
    public static async Task Main(params string[] args)
    {
        MyContext context = new();
        context.Database.EnsureDeleted();
        context.Database.EnsureCreated();

        var marks = await context.Set<User>()
            .SelectMany(u => u.Marks.Select(m => new
            {
                UserId = u.Id,
                MarkId =m.Id,
                Value = m.Value
            }))
            .ToListAsync();
    }
}


public class User
{
    public int Id { get; set; }
    public string Name { get; set; }
    public List<UserMark> Marks { get; set; }
}

public class UserMark
{
    public int Id { get; set; }
    public int Value { get; set; }
}

public class MyContext : DbContext
{
    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        NpgsqlDataSourceBuilder dataSourceBuilder = new(@"Host=localhost;Port=7435;Database=testdb;Username=admin;Password=testpass");

        NpgsqlDataSource dataSource = dataSourceBuilder.Build();

        //optionsBuilder
        //    .UseSqlServer(@"Server=(localdb)\mssqllocaldb;Password=somepass;Database=helloappdb;Trusted_Connection=True;")
        //    .LogTo(Console.WriteLine, LogLevel.Information)
        //    .EnableSensitiveDataLogging();

        optionsBuilder
            .UseNpgsql(dataSource)
            .LogTo(Console.WriteLine, LogLevel.Information)
            .EnableSensitiveDataLogging();
    }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<User>();
        modelBuilder.Entity<UserMark>();
    }
}

SQL query (MS SQL):

SELECT [u1].[UserId], [u1].[MarkId], [u1].[Value]
      FROM [User] AS [u]
      CROSS APPLY (
          SELECT [u].[Id] AS [UserId], [u0].[Id] AS [MarkId], [u0].[Value]
          FROM [UserMark] AS [u0]
          WHERE [u].[Id] = [u0].[UserId]
      ) AS [u1]

SQL query (Postgres):

SELECT u1."UserId", u1."MarkId", u1."Value"
  FROM "User" AS u
  JOIN LATERAL (
      SELECT u."Id" AS "UserId", u0."Id" AS "MarkId", u0."Value"
      FROM "UserMark" AS u0
      WHERE u."Id" = u0."UserId"
  ) AS u1 ON TRUE

Describe the solution you'd like

SQL can be simplified. For example, for Postgres:

select u."Id" "UserId", m."Id" "MarkId", m."Value" "Value"
from "User" u
join "UserMark" m on m."UserId" = u."Id"

I use EF Core 9.0.3

amyboose avatar Apr 04 '25 18:04 amyboose

Thanks @amyboose that indeed makes sense - EF SQL quailty in some areas isn't as good as it should be. We already have some issues tracking improving SQL around joins (e.g. #17936), but I'll keep this one specifically for the SelectMany case.

Note that if you use the overload of SelectMany with a result selector, you get the better SQL (so definitely do that):

var marks = await context.Users
    .SelectMany(u => u.Marks, (u, m) => new
    {
        UserId = u.Id,
        MarkId = m.Id,
        Value = m.Value
    })
    .ToListAsync();

SQL:

SELECT [u0].[Id], [u0].[UserId], [u0].[Value]
FROM [Users] AS [u]
INNER JOIN [UserMark] AS [u0] ON [u].[Id] = [u0].[UserId]

The same happens if the projection to an anonymous type is removed. So this is specifically an issue with embedding a Select within the first (non-result) selector of SelectMany.

roji avatar Apr 04 '25 19:04 roji