efcore
efcore copied to clipboard
Parameterized Offset/Fetch makes query slower in batch mode
scenario
Recently I created an column store index to improve the performance. It was fine if I directly specify the input of Offset/Fetch. However, I found that when I executed the sql generated by EF which is parameterized, the performance got almost 10 times slower than the former. It could be a disaster if there are extra entities to join.
I wonder if there is an solution for my scenario or maybe is that an opportunity for EF? The Skip()/Take() only accept the int parameter, so I guess it is relatively safe and might be unnecessary to parameterize the input for Offset/Fetch.
samples
Parameterized Offset/Fetch with 1 row
Non-parameterized Offset/Fetch with 1 row
Non-parameterized Offset/Fetch with 900 row
version information
EF Core version: 3.1 Database provider: SQL Server in Azure
@Singway thanks for this information.
Interestingly, the costs below seem identical for parameterized vs. constant, so even if the execution plan differs, I'm wondering if there's an actual difference. In other words, it makes sense that at planning time, the planner has more information in the constant case, since it knows up-front how many rows will be returned and doesn't require the parameter values of a specific query; but whether that actually impacts query performance is another question.
Any chance you could benchmark the two?
FYI the reason we parameterize Skip/Take is that if we don't, every differing value would cause a completely different query plan to be generated (so Skip(1) and Skip(2) would result in two separate plans); that can create its own performance problems.
@roji thank you for your kind reply. Sorry maybe I didn't demonstrate it clearly. It actually had an actual difference. Let's focus on the first and the second plan, we would end up finding that the elapses respectively were 1.506s and 0.167s with parameter and constant.
And you are right the SQL Server optimizer utilize those up-front info like statistics to adjust the execution plan.
I realized the reason why EF needs to parameterize those input but it dose impact the performance sometimes(at least in my case). I'm wondering if there will be a config that we could to decide whether to use the parameter.
@Singway thanks for the additional info.
I'd like to investigate this, but we're currently heads down stabilizing the 7.0 release. Could I ask you to put together a minimal repro that shows the above happening, i.e. a simple table schema, some data, and the two queries with differing performance characteristics?
@Singway thanks for the additional info.
I'd like to investigate this, but we're currently heads down stabilizing the 7.0 release. Could I ask you to put together a minimal repro that shows the above happening, i.e. a simple table schema, some data, and the two queries with differing performance characteristics?
Sure. I've push the sample to a Repro. You could import the data with Backup.bacpac in the Folder Backup and then execute the script in the folder Script. The schema is not same as the screenshot I posted at first but it do the same thing which is extracting the clustered key with opertor Offset/Fetch. Hip and thx.
/cc @roji
@Singway I just tried to access the repro link you posted above, but I'm getting a 404. Can you please check and post the right link?
Note suggestion in #31552 to re-introduce overloads of Skip/Take accepting a lambda, so that we can differentiate between parameters and constants. This is similar to how EF6 worked apparently, except that a non-lambda call to Skip/Take was interpreted as a constant rather than as a parameter.
Constants vs Parameters is not just about Skip/Take. Also a constant in Where() clause can make a big difference. In my case 2s vs Timeout (30s)
@Liero Where accepts a a lambda expression, so EF sees an expression tree. This means that you can yourself control whether to use a lambda of a parameter:
// Constant:
_ = context.Blogs.Where(b => b.Name == "foo");
// Parameter
var name = "foo";
_ = context.Blogs.Where(b => b.Name == name);
Skip and Take, on the other hand, accept a value rather than a lambda, so there's no expression tree and no distinction between constant and parameter as above. This is why this problem is specific to Skip and Take (and Contains, incidentally).
Yeah, but that is not sufficient.
I want this
var name = "foo";
_ = context.Blogs.Where(b => b.Name == name);
to be translated as constant.
I currently have to write:
_ = name switch
{
"foo" => context.Blogs.Where(b => b.Name == "foo");
"bar" => context.Blogs.Where(b => b.Name == "bar");
...
}
I would prefer
context.Blogs.Where(b => b.Name == EF.Constant(name));
@Liero you can also use the Expression builder API to dynamically construct the expression tree predicate passed to Where (i.e. use Expression.Constant); that should be an easier and much more robust way to integrate constant nodes in a part of the query.
However, we can use #31552 to track EF.Constant (or similar) as a nicer way to produce constant nodes without doing the above; that would be quite a specialized (and somewhat obscure) query feature, but I guess it makes sense.
Note that something like EF.Constant would not be usable with Skip/Take as they exist today, since as I wrote above, these accept an int parameter and not an expression tree; that means that if you put EF.Constant inside Skip/Take at the top-level query, it would simply get immediately evaluated and the result passed to Skip/Take. There's simply no way to make a constant vs. parameter distinction for Skip/Take (at least not when these methods are you used in the top-level, rather than nested within a lambda within the querY).
Just to summarize the different possibilities here:
- An EF.Constant() mechanism (#31552) to force constantization doesn't work, since Skip/Take don't accept a lambda and would there just evaluate it (see comment just above).
- We could add queryable-only overloads of Skip and Take that accept lambdas, and translate those. Note that in theory we'd want to do that for all operators that don't accept a lambda, e.g. Contains, ElementAt... (though are less likely to be used at the top-level etc.).
- We could add a general-purpose
EF.Query()
mechanism, which accepts a lambda and allows specifying arbitrary queries within. This would simply be a mechanism to express a query in a nested expression-tree context, so that the parameter/constant distinction would be preserved. This may be useful for other scenarios as well (not sure if we have an issue tracking it). - We have plans for precompiled queries/AOT support, where we statically analyzer user code to identify LINQ queries, convert these from Roslyn syntax/semantic trees to LINQ, and then perform compilation at that point. Since we have access to the original Roslyn representation, we can also know whether something is a constant or parameter.