efcore
efcore copied to clipboard
Generate table joins instead of subquery joins
For queries with includes, we currently generate joins with a subquery:
SELECT [b].[Id], [b].[Name], [b].[UserId], [t].[Id], [t].[BlogId], [t].[Description], [t].[UserId], [t].[Id0], [t].[Created], [t].[Hash], [t].[IsDeleted], [t].[Modified], [t].[PostId]
FROM [Blog] AS [b]
LEFT JOIN (
SELECT [p].[Id], [p].[BlogId], [p].[Description], [p].[UserId], [p0].[Id] AS [Id0], [p0].[Created], [p0].[Hash], [p0].[IsDeleted], [p0].[Modified], [p0].[PostId]
FROM [Post] AS [p]
LEFT JOIN [PostInstance] AS [p0] ON [p].[Id] = [p0].[PostId]
) AS [t] ON [b].[Id] = [t].[BlogId]
ORDER BY [b].[Id], [t].[Id], [t].[Id0]
We could simplify this to:
SELECT [b].[Id], [b].[Name], [b].[UserId], [t].[Id], [t].[BlogId], [t].[Description], [t].[UserId], [t].[Id0], [t].[Created], [t].[Hash], [t].[IsDeleted], [t].[Modified], [t].[PostId]
FROM [Blog] AS [b]
LEFT JOIN [Post] AS [p] ON [b].[Id] = [p].[BlogId]
LEFT JOIN [PostInstance] AS [p0] ON [p].[Id] = [p0].[PostId]
ORDER BY [b].[Id], [p].[Id], [t].[Id0]
We should measure the execution perf difference between the above two. Even if there is no (significant) difference, we could still decide to do this for SQL simplicity.
Originally raised in https://github.com/aspnet/EntityFrameworkCore/issues/17455.
Is there any way for a provider writer to override this?
I'm developing an EF 2.2 provider for an older database, and it doesn't support subqueries in a join clause at all. So currently the generated SQL is invalid.
In my case, I'm just executing the git BuiltInDataTypesBase test:
var entity = context
.Set<StringKeyDataType>()
.Include(e => e.Dependents)
.Where(e => e.Id == "Gumball!")
.ToList().Single();
That generates this SQL statement:
SELECT "e.Dependents"."Id", "e.Dependents"."StringKeyDataTypeId"
FROM "StringForeignKeyDataType" "e.Dependents"
INNER JOIN(
SELECT "e0"."Id"
FROM "StringKeyDataType" "e0"
WHERE "e0"."Id"=N'Gumball!'
) AS "t" ON "e.Dependents"."StringKeyDataTypeId"="t"."Id"
ORDER BY "t"."Id"
However it is invalid for the particular DB vendor, and it must instead be:
SELECT "e.Dependents"."Id", "e.Dependents"."StringKeyDataTypeId"
FROM "StringForeignKeyDataType" "e.Dependents"
INNER JOIN ("StringKeyDataType" "t")
ON "e.Dependents"."StringKeyDataTypeId"= "t"."Id"
WHERE "t"."Id"=N'Gumball!'
ORDER BY "t"."Id"
I've been digging into the code, and it's hard to find much information on how to change the query generation engine at that level.
Should I open a separate question for this?
@Gwindalmir your LINQ query doesn't produce a subquery for me, either on 2.2 and on 3.1:
Repro for 2.2
class Program
{
static void Main(string[] args)
{
using var ctx = new BlogContext();
ctx.Database.EnsureDeleted();
ctx.Database.EnsureCreated();
var results = ctx
.Set<StringKeyDataType>()
.Include(e => e.Dependents)
.Where(e => e.Id == "Gumball!")
.ToList();
}
}
public class BlogContext : DbContext
{
public DbSet<StringKeyDataType> StringKeyDataTypes { get; set; }
#pragma warning disable 618
public static readonly LoggerFactory ContextLoggerFactory
= new LoggerFactory(new[] { new ConsoleLoggerProvider((_, __) => true, true) });
#pragma warning restore 618
protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
=> optionsBuilder
.UseSqlServer(@"Server=localhost;Database=test;User=SA;Password=Abcd5678;Connect Timeout=60;ConnectRetryCount=0")
.EnableSensitiveDataLogging()
.UseLoggerFactory(ContextLoggerFactory);
}
public class StringKeyDataType
{
public string Id { get; set; }
public List<Dependent> Dependents { get; set; }
}
public class Dependent
{
public string Id { get; set; }
}
Repro for 3.1
class Program
{
static void Main(string[] args)
{
using var ctx = new BlogContext();
ctx.Database.EnsureDeleted();
ctx.Database.EnsureCreated();
var results = ctx
.Set<StringKeyDataType>()
.Include(e => e.Dependents)
.Where(e => e.Id == "Gumball!")
.ToList();
}
}
public class BlogContext : DbContext
{
public DbSet<StringKeyDataType> StringKeyDataTypes { get; set; }
static ILoggerFactory ContextLoggerFactory
=> LoggerFactory.Create(b => b.AddConsole().AddFilter("", LogLevel.Information));
protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
=> optionsBuilder
.UseSqlServer(@"Server=localhost;Database=test;User=SA;Password=Abcd5678;Connect Timeout=60;ConnectRetryCount=0")
.EnableSensitiveDataLogging()
.UseLoggerFactory(ContextLoggerFactory);
}
public class StringKeyDataType
{
public string Id { get; set; }
public List<Dependent> Dependents { get; set; }
}
public class Dependent
{
public string Id { get; set; }
}
Out of curiosity, which database are you trying to develop for? This issue is about removing a subquery join in a very particular case, but there are quite a few others where doing so isn't possible. Subquery joins are a standard SQL feature, and a database which doesn't support them is likely to have many issues as an EF Core relational provider...
Finally, note that EF Core 2.2 is no longer supported - 2.1 and 3.1 are the current long-term support versions. Any new development should probably happen against 3.1.
Just to answer how to do it, add a custom implementation IQueryTranslationPostprocessor deriving from RelationalQueryTranslationPostprocessor and replace ShapedQueryExpression.QueryExpression which would be a SelectExpression with a different SelectExpression to generate same result without subquery joins. If you find any lacking APIs to make required change, then another option is to provider custom IQuerySqlGenerator which will just simplify subquery join to table join when printing it out to DbCommand text.
Thanks, at the time I started, 3.1 wasn't released, and supporting .NET Framework is a requirement, so I went with 2.2.
I'm not sure why you don't see it, as the SQLite driver included in this source constructs the same query. I downloaded the release/2.2 tag as my reference point.
As for the DB in question, I'm not sure I should reference it, as I work for the company that makes it. I will say it supports primarily SQL-92 standard, with a few SQL-99 additions.
@Gwindalmir the best way would be to open a new issue and include a short, runnable code sample with SQLite that shows it happening.
If you're still in development, I'd strongly recommend considering switching to 3.1 - it's the LTS version for years to come, whereas 2.2 is already out of support.
@Gwindalmir I don't see the subquery with a single include or ThenInclude, using SQLite. It took two ThenIncludes for me to generate the subquery (see the example in issue #19418 linked above). That was with .Net Core 3.1.
@Gwindalmir I don't see the subquery with a single include or ThenInclude, using SQLite. It took two ThenIncludes for me to generate the subquery (see the example in issue #19418 linked above). That was with .Net Core 3.1.
I'm going to migrate to 3.1, and test again. If the issue is resolved there, then that's great. If not, I'll open a new issue here. Thanks for the help everyone!
Just as a follow-up, in case anyone else has the same problem: Upgrading to EF 3.1 solved the issue!
Good to hear, thanks @Gwindalmir.
Any updates on an ETA for the original issue in this thread to be resolved? :)
@Webreaper no update at the moment - this issue is "consider-for-next-release", which means it's a stretch goal for 5.0. While it's considered important, we don't think it's as important as the other issues have have triage into the 5.0 milestone (but it may still get done).
Totally understand. Thanks for the update! Looking forward to .Net 5!
Errm, looking forward to this in .Net 6? ;)
This is a 6-monthly reminder - my queries are taking 950ms when they could be taking under 200ms due to having to workaround this bug. Any chance of a fix in .Net 6 previews 6-10?
@Webreaper this is still in the plan for EF Core 6.0, I do hope we'll manage to get it in.
Great! Thanks!
Will this cover https://github.com/dotnet/efcore/issues/20758#issue-607360242 or is this a more specific case? (Note the title of that issue a bit wrong, the top level projection still selects the right columns, the sub queries just make it appear like it’s selected more)
Will this cover #20758 (comment)
I doubt it. That looks like something very different.
I disagree that they’re very different. Both are fundamentally about the potential to generate table joins instead of sub query joins. I would happily accept they have different root causes that require different fixes, however I still think it’s a valid question for an EF Core team member.
Edit: this item https://github.com/dotnet/efcore/issues/21082 seems like a dupe of the one I linked to, and was identified as a possible dupe of this one.
Oh, I see, I thought it was just about the extra columns, but I see that the sub-query is doing a full select and then a join, rather than a table join, so yes, it could be similar/related.
#20758 is different issue from this one. Subquery join vs table join is only equivalent without side effects when subquery doesn't have any additional operators. If you look at the example above then subquery has just join so eventually there will be join between all the tables.
When subquery has a where predicate applied like in global filters then both kind of joins are not same and can have different characteristics. If the filter is reducing the size of the right source by large amount then subquery join may end up working faster. Perf of such query needs to be studied differently from this issue.
Regardless of joins, issue #20758 also captures the global query filter aspect of it. If all your entities have soft-delete filter then if they are consistent graph with both filter values then, filter would need to be applied only one of the table, rest will automatically filtered out by joins.
Based on the tweets and RC1 issue, I'm guessing this is gonna roll over to 7 now?
@Webreaper this has already been punted from the 6.0 release (see the milestone).
If you could post actual perf numbers showing the difference between the two queries, that could help push forward the priority of this - we haven't yet had time to properly investigate that.
Ah, missed that, thanks.
I posted some comparative numbers in the issue I raised about this before I found it was a dupe of this one: https://github.com/dotnet/efcore/issues/19418
With 500,000 entity rows rows and a ThenInclude onto a table with 1.2m entries, the unfiltered select pulls in all 1.2 million rows before filtering them, so the query takes around around 3 seconds (on my M1 MacBook pro). Manually adjusting the SQL to use a proper table join, drops the query time to less than 300ms. So it's an order of magnitude faster when table joins are used.
Since I raised that issue, the DB I'm using now has over 2 million rows in the ImageTags table, which means that the query is likely to take 3-5s to run. With the table-joins issue fixed, this should remain constant at around 300-350ms or less, depending on indexes. So it's quite a big difference.
Please let me know if you need any more details.
@Webreaper looking at #19418 again, you seem to be comparing the EF-generated single query to split query (which isn't relevant here), as well as to a query where there's no join with the Tags table. So in the context of this issue, we don't have the exact same query, once written as a join of a subquery and once as a table join... Ideally we'd have the two side-by-side as runnable code samples, with clearly different runtime numbers.
The important thing is to understand whether this is something that really does affect perf, or whether (most) databases can optimize this in any case, in which case it's just a SQL simplification issue (and the priority is lower).
Okay, so here's the query that gets generated by EF today:
SELECT "b"."ImageId", "b"."DateAdded", "i"."ImageId", "f"."FolderId"
FROM "BasketEntries" AS "b"
INNER JOIN "Images" AS "i" ON "b"."ImageId" = "i"."ImageId"
INNER JOIN "Folders" AS "f" ON "i"."FolderId" = "f"."FolderId"
LEFT JOIN (
SELECT "i0"."ImageId", "i0"."TagId", "t"."TagId" AS "TagId0", "t"."Keyword"
FROM "ImageTags" AS "i0"
INNER JOIN "Tags" AS "t" ON "i0"."TagId" = "t"."TagId"
) AS "t0" ON "i"."ImageId" = "t0"."ImageId"
ORDER BY "b"."ImageId", "i"."ImageId", "f"."FolderId", "t0"."ImageId", "t0"."TagId", "t0"."TagId0"
The query below is what should, IMO, be generated by EFCore (this one includes the 'Tags' join that was missing previously):
SELECT "b"."ImageId", "b"."DateAdded", "i"."ImageId", "f"."FolderId"
FROM "BasketEntries" AS "b"
INNER JOIN "Images" AS "i" ON "b"."ImageId" = "i"."ImageId"
INNER JOIN "Folders" AS "f" ON "i"."FolderId" = "f"."FolderId"
LEFT JOIN "ImageTags" AS "i0" ON "i"."ImageId" = "i0"."ImageId"
LEFT JOIN "Tags" as "t" on "i0".tagId = t.TagID
ORDER BY "b"."ImageId", "i"."ImageId", "f"."FolderId", "i0"."ImageId", "i0"."TagId"
If I run the first query, it takes over 10 seconds to execute. The second query returns instantly (i.e., significantly sub-second). They're doing the same thing, and using the same set of data, and return identical results. That seems like a pretty significant performance issue to me.
It seems pretty clear that if you left-join on an unfiltered select statement that returns all rows from another table with millions of rows, only to then immediately filter out 99.999% of the rows with the as/on clause of the join, that it's going to be massively slower than it needs to be.
I'm currently using the split-query approach to work around this bug, but that's still slower than the 'proper' table-join query (i.e., it requires 2 x 200ms instead of 1 x 200ms because there's two queries being run). I'd really like to remove that hack, but I can't because a user searching for an image in the app would have to wait over 10s for search results to come back.
Hopefully the performance numbers I've quoted will convince you. I'd hope you guys can knock up a sample and repro this pretty trivially, and it seems like the left-join-subselect SQL above is glaringly obviously wrong. But if the only way it's going to get prioritised is for me to generate a sample app, I'll try and find the time (but I really hope you can just fix it without me needing to do that :)).
@Webreaper note that the first query has an inner join on Tags, whereas the latter has a left join. I'm not saying that makes a huge difference, but it's better to be sure about these things (BTW this is Sqlite right? Any similar experience with another database, just in case this is a Sqlite thing?). A database schema with data for this would be ideal to reproduce.
But I generally agree, and yeah, at some point for EF 7 I'll definitely find the time to investigate this properly across database...
I haven't tried with another DB; I started adding support for Postgres to play with that, but haven't got it working properly yet. I guess I don't really care if another DB is smart enough to optimise this out, as I'm using Sqlite. ;)
If it helps to understand what's going on, here's my DbContext: https://github.com/Webreaper/Damselfly/blob/master/Damselfly.Core/Models/ImageContext.cs
I will try and see if I can pull out the salient points from the data model, knock up some test data, and stick a test app into github at some point, but it might not be for a couple of weeks. I know that would make it much easier for you guys to delve into it....
Thanks for your efforts @Webreaper, much appreciated!
Okay, finally put this repro together. https://github.com/Webreaper/EFCore6TableJoinBug
I've tried to document it as clearly as possible in the README, but if you have any questions at all about what my code is doing or why, please comment here (or raise an issue on the repo, or email me at [email protected]).