EntityFramework.Docs icon indicating copy to clipboard operation
EntityFramework.Docs copied to clipboard

Nested projections via AsQueryable...ToList and Expression.Invoke

Open syndicatedshannon opened this issue 4 years ago • 39 comments
trafficstars

It appears that EF Core 5 generates correctly-related (server side) SQL queries in the following two nested projections:

public static IEnumerable<BookViewModel> ProjectBooks(IQueryable<WorkOrderResource> books)
{
    Expression<Func<AuthorViewModel, AuthorStorageModel>> authorProjection = l => new AuthorViewModel { };
    return books.Select(b => new BookViewModel
    {
        Author = Expression.Invoke(authorProjection, b.PrimaryAuthor),
        Authors = b.Authors.AsQueryable().Select(authorProjection).ToList()
    };
}

Why does this work? Would you please point me to related documentation?

This was not intuitive to me. My expectation from past work with EF would be that both .Compile() and .ToList() would break EF Expressions.

I would especially like to confirm this is a supported feature. I was unable to find unit tests for this in the repo.

Previously, to address this need, walking ExpressionVisitor with code such as https://github.com/scottksmith95/LINQKit was required.

EF Core version: 5.0 Database provider: Oracle Target framework: .NET 5.0 Operating system: Win10 IDE: Visual Studio 2019 16.3

syndicatedshannon avatar May 27 '21 17:05 syndicatedshannon

As exception message indicates that query contains result of type IQueryable<QueryEntry> which is not allowed in final result.

The first 2 scenario you mentioned, first one does convert it to list by using ToList method and in second one navigatedEntity is not queryable type so final result is of enumerable type.

smitpatel avatar May 27 '21 20:05 smitpatel

My question must have been poorly worded. I'm not concerned about the Exception; I'm only looking for documentation on this feature.

I will edit the question above to remove discussion of the Exception, and move it below.

syndicatedshannon avatar May 27 '21 21:05 syndicatedshannon

Note that if you remove the "ToList()" in the example above, it no longer works, and EFCore throws an exception, recommending the expression be updated. Therefore, it appears this is an intended feature, but I can't find it documented anywhere.

System.InvalidOperationException: The query contains a projection '

   s => s.Entries
      .AsQueryable()
      .Select(q => new QueueEntry{})

' of type 'IQueryable<QueueEntry>'. Collections in the final projection must be an 'IEnumerable<T>' type such as 'List<T>'. Consider using 'ToList' or some other mechanism to convert the 'IQueryable<T>' or 'IOrderedEnumerable<T>' into an 'IEnumerable<T>'.

syndicatedshannon avatar May 27 '21 21:05 syndicatedshannon

I've made a number of edits to the question that I hope clarify it. Would you please review before closing this question as resolved?

syndicatedshannon avatar May 27 '21 23:05 syndicatedshannon

By default EF Core will try to translate every query it can. So there is no documentation for everything which is translated. The only error is when materializing IQueryable as indicated in error message.

  • The Author part works because it is single result.
  • The Authors part work when it has ToList since you converted the source to Queryable using AsQueryable. Note that it is not about what is type of enumerable in the subquery exactly, by default when materializing enumerable EF Core always generates a list or concrete type of collection navigation when information is present. The issue happens without ToList that type of Authors in BookViewModel becomes IQueryable and we cannot materialize IQueryable objects.
  • Compile.Invoke works for everything as long as the object being compiled is type of Expression<Func<>>

At the end of day it is all about what expression tree EF gets and what object it has to create. objects of IQueryable types are not possible to create and it throws exception with providing guidance what to do.

smitpatel avatar May 28 '21 00:05 smitpatel

Thank you, that does help clarify some things. I'd like to clarify "translate every query it can" if you don't mind.

If I understand you correctly, Expression<Func<>> containing IQueryable.Select.ToList and Compile.Invoke is working by design, and it is safe/supported for me to replace my usage of LinqKit with this approach today? I think I hear you saying it's OK to do this, and my code won't break tomorrow.

Also I know it's a broad statement, but I think it would really be helpful if the EFCore documentation contained a topic with examples on best practices for code/expression reuse and nested projections.

From my perspective as a developer, I wonder how I would have discovered EF Core supports this today, given it didn't work in EF or early EF Core versions and I don't believe it was announced in release notes? I wonder if I'm still missing other projection tricks to clean up my code that aren't just a question of software development experience, but rather "what EF Core can currently translate".

syndicatedshannon avatar May 28 '21 00:05 syndicatedshannon

LINQ is an infinite problem space. Unlimited numbers of queries can be written by combing various different LINQ operators hence it makes impossible for us to define exactly what is supported. e.g. A normal Join operation is supported but you cannot use a client side collection (rather than a server correlated table) in join or you cannot use a client side function in join key. Which makes it very hard to define list of supported things compared to list of unsupported things. So if you write a query and it works then it is supported. When things don't work, we try to do one of the following

  • Throw a useful error message indicating what users should do to remedy it. Above exception message is one of those kind where user can change their query to resolve the problem.
  • Throw an error message with details why certain things cannot be translated and what is reason behind it. e.g. GroupJoin operator without SelectMany flattening. It doesn't have a server equivalent and there is no easy remedy. User would need to think how do they want to rewrite query to get results they want without using GroupJoin. We have docs for this here https://docs.microsoft.com/en-us/ef/core/querying/complex-query-operators#groupjoin
  • Throw error message for features which is not implemented yet but we plan to support it in future. The error message either provide reference to tracking issue or says not supported yet.
  • A hard to interpret error message when basically there is a bug in code base.

For things where user needs to rewrite query but is not trivial we do link to docs in exception message. For not yet implemented feature, tracking issues kinda serves as docs page.

So if you write a query and it works then it is supported and it will continue to be supported in future versions (especially if you are using version 3.1 or higher).

For best practices, we have docs https://docs.microsoft.com/en-us/ef/core/performance/efficient-querying which talks about performance specifically. Apart from that how to write query is personal preference of developer. As a query engine, what we see is an expression tree. How it is created doesn't matter to query engine so whichever way it is easy for the user to do it they are free to do so.

smitpatel avatar May 28 '21 20:05 smitpatel

Thank you. Based on your answer "if you write a query and it works then it is supported and it will continue to be supported in future versions." Because this works today it is supported.

Based on this I agree that EF Core itself is an infinite problem space by design, and this strategy is probably what led to my confusion.

I expected that behind the scenes the EFCore team defined specific supported scenarios in order to create a finite problem space, and I thought I would find that Compile...Invoke was added around version 3 but simply not added to documentation.

Still, I would love the specific topic of nested expressions to be mentioned in documentation. I think the means to do this are non-intuitive and I bet many developers would find it helpful to discover there is a way to do it. Per your exempli gratia regarding client side collection being supported or not, that is included in documentation, largely because it violates principle of least surprise... not because of an EF Core flaw, but just because it's not intuitive.

Regardless, this does answer my question, thank you so much.

syndicatedshannon avatar May 29 '21 02:05 syndicatedshannon

Please kindly consider my request for documentation on the topic.

syndicatedshannon avatar May 29 '21 02:05 syndicatedshannon

Yeah, better documentation would be useful. I understand it's really hard to document everything possible, but anyway.

I was surprised to discover that I'm able to use methods invocations in projections in some cases in EF Core 5. E.g.

    return DbContext.Books.Select(b => new BookViewModel
    {
        Author = MyNonExpressionMethod(b.Author.Name),
        Title = b.Title
    };

(It simply retrieves b.Author.Name from DB, and the executes MyNonExpressionMethod on each value after actual DB call).

But I wasn't able to find any docs or even release notes when that was done.

Dreamescaper avatar Jun 03 '21 18:06 Dreamescaper

@Dreamescaper - That is not invocation in pure expression tree terms since there is not Invoke method. EF Core has allowed client eval in top level projection since very first version. We have extensive documentation regarding that https://docs.microsoft.com/en-us/ef/core/querying/client-eval

smitpatel avatar Jun 03 '21 20:06 smitpatel

cc: @roji

AndriySvyryd avatar Jun 04 '21 17:06 AndriySvyryd

Just for my 2 cents, I'm generally skeptical of documenting which constructs are translatable and which aren't, for the same reason @smitpatel detailed above in https://github.com/dotnet/efcore/issues/24991#issuecomment-850654091... I generally believe that when a query cannot be translated, our exception message should be good enough to convey what the problem is, and point the user in the right direction.

For example, if a user attempts to project IQueryable out, we currently throw an exception with the following message:

The query contains a projection 'b => b.Posts
    .AsQueryable()' of type 'IQueryable<Post>'. Collections in the final projection must be an 'IEnumerable<T>' type such as 'List<T>'. Consider using 'ToList' or some other mechanism to convert the 'IQueryable<T>' or 'IOrderedEnumerable<T>' into an 'IEnumerable<T>'.

So it seems like all the relevant information is being provided to the user. Adding a documentation page in addition to that doesn't seem very valuable; after all, users are likely to be interested in this topic only if they try to project an IQueryable - I'm not sure who it would help to have a long list of patterns which we can't translate.

roji avatar Jun 04 '21 18:06 roji

@roji If your comment is regarding my request for documentation, I think my request may have been misunderstood. I'm asking to document something that does work, not something that doesn't.

I agree that in the case of AsQueryable, it wouldn't be helpful to document that it doesn't work without calling ToList(). The issue is I would never have considered calling .AsQueryable() in the first place, because on most occasions that just misleading. It's just not intuitive.

Even moreso for .Compile...Invoke. Who would think that to get an expression to run at the database layer I would compile it? It's just not something a developer would expect, in my opinion.

So I'm suggesting that a section be added to documentation that suggests some working patterns that can be used to accomplish Expression nesting, perhaps covering examples in both projections and predicates.

Hope that clarified, please feel free to follow up with me here or directly.

syndicatedshannon avatar Jun 04 '21 18:06 syndicatedshannon

I believe both the Invoke...Compile and AsQueryable scenarios are capabilities that were intentionally added after the first release. I believe they took some effort to implement, and were added/merged because they are valuable features.

I haven't looked at all the history, but I do know that one feature was accidentally regressed in some version, and then later fixed as a defect. But I think it's still a feature.

It would be helpful to users of the library to know the feature exists.

syndicatedshannon avatar Jun 04 '21 18:06 syndicatedshannon

OK, I do seem to have misunderstood.

Even for positive examples, it's hard to see exactly what kind of documentation we'd provide, e.g. for expression nesting, given the infinite nature of LINQ. It's not that I'm fundamentally opposed to it or anything, but there's a good chance we'd end up with a long list of "this translates to that" which doesn't seem very helpful to anyone. That's why our recommendation is to understand several basic facts about translation (e.g. client evaluation is supported (almost) only in the top-most projection), and beyond that people can expressing what they want. Either it's supported and it works, or you get an informative exception telling you why.

If you still feel dedicated docs on working patterns is useful, maybe some concrete examples would be helpful.

roji avatar Jun 04 '21 18:06 roji

Re Compile specifically - there's nothing here that's actually a feature of EF as far as I can tell... You can put whatever .NET code you want in the top-most projection (last Select), and EF will simply client-evalute it. The fact that you chose to Compile an expression isn't very important here - you could just as well have directly placed a Func<...> instead - without Compile - and that would have worked.

roji avatar Jun 04 '21 18:06 roji

That's interesting, I got my Compile expression translated to SQL using an Oracle provider. If I'm mistaken or it's provider-dependent, I can totally see why you wouldn't have understood my questions. I'll be happy to verify.

That's what I meant about being used to using LinqKit (3rd party) to call an Invoke extension directly, which then expands the expression tree or compiles as appropriate.

syndicatedshannon avatar Jun 04 '21 18:06 syndicatedshannon

That's interesting, I got my Compile expression translated to SQL using an Oracle provider.

So to be on the same page - there's a big difference between saying a query executed successfully, and saying that the query got translated to SQL (in its entirety). I very much doubt that the Oracle provider translated .NET's Compile invocation to SQL - what possible SQL construct could exist for that? So I'm guessing that you mean that your query executed successfully, which simply means that Compile was client-evaluated by EF Core, in .NET (just like anything else could be).

This is documented in our client vs. server evaluation docs, hopefully that makes things clear.

roji avatar Jun 04 '21 18:06 roji

I appreciate that @roji . No, I don't mean executed successfully; I mean the nested projection operations appears in SQL terms. I'm certainly not offended. It's commonly poorly understood, appreciate you making sure I understand.

Interesting related story, when I saw the Compile nested in an Expression, I at first had rejected a code review... until the developer demonstrated that specific portion of the nested projection was being translated to SQL.

I'll attempt to reproduce that in the next day or two and provide details.

syndicatedshannon avatar Jun 04 '21 18:06 syndicatedshannon

"what possible SQL construct could exist for that"

And that brings me to the crux of my request for documentation. It would be similar to the process for IQueryable. What could convert a List to an IQueryable? Nothing in this context, but the magic would be in the tree taken as a whole. Compile...Invoke absolutely COULD be correctly interpreted.

syndicatedshannon avatar Jun 04 '21 18:06 syndicatedshannon

I hear you saying you don't think it is. If it's not, then I also agree it doesn't merit documentation.

However, I could ask the same question you just did about AsQueryable. How could that possibly work? It can, but it's not intuitive. Therefore, documentation :D

syndicatedshannon avatar Jun 04 '21 19:06 syndicatedshannon

Th history of Compile().Invoke() structure. When you put Compile...Invoke inside a lambda expression in an expression tree world, it is never compiled, rather the Compile method call is introduced in expression tree along with an InvocationExpression. This is all outside of bounds of EF Core and how compiler generates expression tree. In previous versions of EF Core when it used Remotion.Linq library. An Invoke when inside expression tree can be simplified by transforming expression tree if applied on a lambda expression. Remotion.Linq did this so then we ended up with feature in EF Core which was not planned or thought about. When we decided to remove Remotion.Linq, it caused regression as reported in https://github.com/dotnet/efcore/issues/17791. While it was not about Compile/Invoke pattern the simplification of InvocationExpression. Compile/Invoke still client evals since there is no special processing for Compile in EF Core.

The query in OP can be re-written this way without using Compile/Invoke

public static IEnumerable<BookViewModel> ProjectBooks(IQueryable<WorkOrderResource> books)
{
    Expression<Func<AuthorViewModel, AuthorStorageModel>> authorProjection = l => new AuthorViewModel { };
    return books.Select(b => new BookViewModel
    {
        Author = b.Authors.AsQueryable().Select(authorProjection).First(),
        Authors = b.Authors.AsQueryable().Select(authorProjection).ToList()
    };
}

So again we are back to the original point, if it works and you like to write it that way it works. There is no specific value in this case to guide users into that direction by documenting it since it does not have any merit (only demerit of client eval)

For AsQueryable is supported for the specific scenario where it is used in above case. You want to apply Select which takes Expression<Func<>> but that overload of Select only exist on Queryables and a navigation property in C# is not a queryable but would eventually convert to a queryable when EF Core expands navigation. It is a necessary method to call to build expression trees that way then writing LINQ directly. Composing trees above way is advanced scenario and we haven't seen any particular pain point or feedback from users that they got stuck and answer to it was using AsQueryable.

smitpatel avatar Jun 04 '21 19:06 smitpatel

Looking at code for Invocation simplification, we don't do anything for Compile in our code base, it may be still client evaluated as @roji pointed out and I would be careful of using it if that is the case.

smitpatel avatar Jun 04 '21 19:06 smitpatel

I'm confused now given the discussion of Remotion.Linq whether we anticipate this will work or not. LOL

syndicatedshannon avatar Jun 04 '21 19:06 syndicatedshannon

I do think there is a lot of intrinsic value in it working. I specifically included scottksmith95/LINQKit project previously because I feel it's mandatory. I don't see other clear patterns to isolate predicate and projection business logic to single truths.

syndicatedshannon avatar Jun 04 '21 19:06 syndicatedshannon

@syndicatedshannon - What specific value Compile/Invoke provide rather than using it in lambda directly as I posted in my comment above? You are still passing the lambda from your business logic so it does isolate the lambda.

smitpatel avatar Jun 04 '21 19:06 smitpatel

OK, I did a quick test. Consider the following code (full code sample below):

Expression<Func<Blog, string>> projection = l => l.Name;

_ = ctx.Blogs
    .Select(b => projection.Compile().Invoke(b))
    .ToList();

If projection were server-evaluated, that means we'd expect to see only the Name column projected out of the database. However, running this projects both columns out:

SELECT [b].[Id], [b].[Name]
FROM [Blogs] AS [b]

From this I infer that this is pure, simple client-evaluation - EF Core doesn't know about the Compile method, so it client-evals it. If you're convinced you're seeing anything else, a real code sample and the resulting SQL would be great.

BTW note that if this is really client-evaluating, that means you're compiling a delegate for each projection, which is quite horrible for perf.

Full code sample
await using var ctx = new BlogContext();
await ctx.Database.EnsureDeletedAsync();
await ctx.Database.EnsureCreatedAsync();

Expression<Func<Blog, string>> authorProjection = l => l.Name;

_ = ctx.Blogs
    .Select(b => authorProjection.Compile().Invoke(b))
    .ToList();

public class BlogContext : DbContext
{
    public DbSet<Blog> Blogs { get; set; }

    static ILoggerFactory ContextLoggerFactory
        => LoggerFactory.Create(b => b.AddConsole().AddFilter("", LogLevel.Information));

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder
            .UseSqlServer(@"Server=localhost;Database=test;User=SA;Password=Abcd5678;Connect Timeout=60;ConnectRetryCount=0")
            .EnableSensitiveDataLogging()
            .UseLoggerFactory(ContextLoggerFactory);
}

public class Blog
{
    public int Id { get; set; }
    public string Name { get; set; }
}

roji avatar Jun 04 '21 19:06 roji

@roji "which is quite horrible for perf" Understand. Thus the initially-rejected code review in my story above.

syndicatedshannon avatar Jun 04 '21 19:06 syndicatedshannon

@syndicatedshannon so please try to get together an actual code sample around - I'm guessing there's some sort of confusion here somewhere, but I'd be happy to be proven wrong.

roji avatar Jun 04 '21 19:06 roji