
Performance degraded when migrated from EF2 to EF6

Open EvgenyMuryshkin opened this issue 3 years ago • 23 comments

File a bug

Performance degraded when migrated from EF2 to EF6

Include your code

Please see repository for test cases and database setup, + test cases for AsSplitQuery and EFPlus Optimized query

https://github.com/EvgenyMuryshkin/EFCorePerf

Include provider and version information

EF Core version: 6 Database provider: (Microsoft.EntityFrameworkCore.SqlServer) Target framework: (.NET 6.0) Operating system: Windows 10 Pro IDE: (Visual Studio 2022)

EvgenyMuryshkin avatar May 03 '22 11:05 EvgenyMuryshkin

@EvgenyMuryshkin I'd be happy to take a deeper look, but I noticed you've implemented your benchmark as unit tests, without any consideration to warm-up, how to determine iteration counts, or various other benchmarking considerations. For example, the first test that happens to run will perform all the cold-start work, and appear to work much slower than the second test (this is because of the lack of warmup).

I highly recommend doing your benchmarks with BenchmarkDotNet, which takes care of all these problems.

roji avatar May 03 '22 14:05 roji

@roji I have updated repository with benchmarks and test results, please have a look.

This is kind of the key difference. I found that in EF2 the first DB call is also slow (EF warm-up, SQL execution plan), but then it works fine for subsequent requests with different query parameters.

In EF6, each call is slow; a call is fast only for a query with the same parameters.

Thanks, Regards, Evgeny

EvgenyMuryshkin avatar May 03 '22 23:05 EvgenyMuryshkin

@EvgenyMuryshkin thanks for making the change to BenchmarkDotNet.

Looking at the benchmarks, you seem to be doing a large number of collection joins using a single query - this causes the so-called "cartesian explosion" problem, and it's expected for this to run slowly. We typically recommend switching to split query for this kind of scenario (see this section in our docs).
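As a rough illustration of the cartesian explosion mentioned above, the following Python sketch uses the standard-library `sqlite3` module and a made-up schema (`orders`, `items`, `notes` are hypothetical tables, not the ones in the linked repository). It compares the number of rows returned by one query that joins two sibling collections against the total returned by loading each collection separately. The SQL is hand-written for the illustration and is not the SQL that EF Core generates.

```python
import sqlite3

def build_db(n_orders=3, n_items=4, n_notes=5):
    """Create an in-memory database where every order has two child collections."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (id INTEGER PRIMARY KEY);
        CREATE TABLE items  (id INTEGER PRIMARY KEY, order_id INTEGER);
        CREATE TABLE notes  (id INTEGER PRIMARY KEY, order_id INTEGER);
    """)
    for o in range(1, n_orders + 1):
        conn.execute("INSERT INTO orders (id) VALUES (?)", (o,))
        conn.executemany("INSERT INTO items (order_id) VALUES (?)", [(o,)] * n_items)
        conn.executemany("INSERT INTO notes (order_id) VALUES (?)", [(o,)] * n_notes)
    return conn

def single_query_rows(conn):
    """One query joining both collections: rows multiply per parent."""
    rows = conn.execute("""
        SELECT o.id, i.id, n.id
        FROM orders o
        LEFT JOIN items i ON i.order_id = o.id
        LEFT JOIN notes n ON n.order_id = o.id
    """).fetchall()
    return len(rows)

def split_query_rows(conn):
    """One query for the parents plus one per collection: rows add up."""
    queries = (
        "SELECT id FROM orders",
        "SELECT o.id, i.id FROM orders o JOIN items i ON i.order_id = o.id",
        "SELECT o.id, n.id FROM orders o JOIN notes n ON n.order_id = o.id",
    )
    return sum(len(conn.execute(sql).fetchall()) for sql in queries)

if __name__ == "__main__":
    conn = build_db()
    print(single_query_rows(conn))  # 3 orders * 4 items * 5 notes = 60
    print(split_query_rows(conn))   # 3 + 12 + 15 = 30
```

With two collections the gap is modest, but each additional sibling collection multiplies the single-query row count while only adding to the split-query total.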

Now, I see that in your summary you address split query, but say that it "snailed along". The SQL query right below doesn't seem to be a split query though, and I can't see any actual benchmark results for that - can you please update your benchmark code and results to use split query, and post the SQL outputted from it?

roji avatar May 04 '22 08:05 roji

@roji ef6 split query test runs for 13 minutes, compared to 7 seconds for ef2. Do you really need a benchmark for this? I did not paste all split queries, only the one that is causing the problem.

I understand cartesian, ef2 was able to handle it without problems. As I remember, the problem first appeared in ef3, but I held off upgrading for as long as I could. Now that .net 3 is running out of support, we are forced to upgrade and the ef issue is still there.

@roji I updated readme with full AsSplitQuery log (6 large queries in total for AsSplitQuery)

Thanks, Regards, Evgeny

EvgenyMuryshkin avatar May 04 '22 08:05 EvgenyMuryshkin

> Do you really need benchmark for this?

Well, split query is what you're supposed to be using in EF Core 3+ when many collection includes are present - so it makes sense to benchmark that.

> I understand cartesian, ef2 was able to handle it without problems. As I remember, problem first appeared in ef3, but I hold upgrade for as long as I could. Now as .net 3 is running out of support, we forced to upgrade and ef issue is still there.

EF Core 2 did not perform single query (JOINs) for collection includes, it performed a form of split query. Single query was introduced in EF Core 3.0.

@smitpatel can you take a look here? IIRC the EF Core 3+ split query isn't identical to what we were doing before 3, maybe that difference is causing a comparative slowness here?

roji avatar May 04 '22 09:05 roji

@roji sure, will create benchmarks for split and optimized.

Thanks, Regards, Evgeny

EvgenyMuryshkin avatar May 04 '22 09:05 EvgenyMuryshkin

@roji I have updated readme with benchmarks for AsSplitQuery.

Cannot reproduce stall in benchmark, only managed to see that during unit test run, unfortunately - effectively ~8 seconds per query (see section below EF6 benchmark results).

https://github.com/EvgenyMuryshkin/EFCorePerf/blob/main/AsSplitQuery.stall.txt

Thanks, Regards, Evgeny

EvgenyMuryshkin avatar May 04 '22 11:05 EvgenyMuryshkin

@EvgenyMuryshkin are you saying that in the benchmark, the EF Core 6 split query performance is completely fine (comparable to what it was in EF Core 2)? If so, there may be some issue with the way your tests are set up (or in your actual application), or some interference between them that explains the slowdown. I'd advise concentrating on reproducing the perf issue in the benchmark - that may help you find the actual issue in your application.

roji avatar May 04 '22 12:05 roji

@roji it is still a lot slower than ef2, 12ms vs 260ms

EvgenyMuryshkin avatar May 04 '22 12:05 EvgenyMuryshkin

OK, thanks. Probably good for @smitpatel to take a look at the generated SQLs.

roji avatar May 04 '22 18:05 roji

Looking at the query ShippingUnitsWithComposites, there are 5 collections in the query. Looking at the logs in the readme file, EF6 split query generates 6 queries, which is expected. But EF2 only generated 2 queries (even with split mode). I'm still not sure we are measuring the same thing here. If the performance issue is there, then it would be easier to trim this down to much smaller code rather than having 20+ includes. A smaller number of includes should still show the difference.

smitpatel avatar May 05 '22 00:05 smitpatel

@smitpatel Query #4 in EF6 Split query is the most time and resource consuming from SQL profiling tool. I have added markers into readme

EvgenyMuryshkin avatar May 05 '22 01:05 EvgenyMuryshkin

@smitpatel I got the idea that the query is complex. The question here really is: is there something that can be done from the EF side to match the performance of EF2? Or will it stay like this, and we have to try to find workarounds apart from using AsSplitQuery?

Thanks Regards, Evgeny

EvgenyMuryshkin avatar May 05 '22 01:05 EvgenyMuryshkin

@smitpatel maybe I can somehow replicate the EF2 split logic? I tried to add multiple AsSingleQuery() and AsSplitQuery() calls into the same WithComposites, but it seems to pick up only the last modifier and apply it to the whole query

EvgenyMuryshkin avatar May 05 '22 01:05 EvgenyMuryshkin

That doesn't address my observation above. I am not sure we are comparing exactly the same query between EF2 and EF6 here. In that case, no, there is no way for an EF6 query to behave like some different EF2 query.

smitpatel avatar May 05 '22 16:05 smitpatel

@smitpatel The LINQ queries are the same and the generated SQL queries are different, which affects overall performance. How can we proceed from here?

EvgenyMuryshkin avatar May 05 '22 21:05 EvgenyMuryshkin

If the generated query count is not the same, then the queries are not the same. Comparing the perf of 2 DbCommands vs 6 DbCommands, the latter will certainly be likely to cost more. The way split query is implemented in EF3+, it issues the same number of commands as EF2. The only difference is in the generated SQL, which is an intentional change to allow using this code path for more kinds of queries (specifically queries with Distinct/Skip/Take, which couldn't use split query mode in EF2).

The path forward from here: get a single LINQ query which generates the same number of SQL queries in both versions, then you will be able to inspect the difference between the generated SQL (the intentional change I mentioned above). That will give you an opportunity to understand whether the generated SQL queries are intrinsically slower or whether it is something EF Core does with the results that is slowing things down.

Even though the repro code uses BDN, the amount of code is still too large to pinpoint whether the queries being run and the results being generated are the same. You need to trim the repro code down to a minimal amount for us to investigate effectively.

smitpatel avatar May 05 '22 22:05 smitpatel

@smitpatel I don't think I can give more on this. I spent the last two weeks trying to pin down the problem. EF6 works fine on two-table joins, and it works reasonably on randomly seeded data in the tables for that schema.

But when it comes to production, it is just slow, with or without the split query modifier. So I had to pack the whole prod database as a test case.

The same LINQ query produces SQL with very different performance between versions, so it looks like we need to change the schema then.

Thanks Regards, Evgeny

EvgenyMuryshkin avatar May 05 '22 23:05 EvgenyMuryshkin

@smitpatel I might have an opportunity to get back to this in a couple of weeks; we just ran out of time. I will try to find a query that produces similar SQL

EvgenyMuryshkin avatar May 05 '22 23:05 EvgenyMuryshkin

@smitpatel I have added benchmarks that incrementally increase the complexity of this test query.

https://github.com/EvgenyMuryshkin/EFCorePerf/blob/main/query.md

Performance is comparable up to query 48, then EF6 falls behind.

EvgenyMuryshkin avatar May 06 '22 05:05 EvgenyMuryshkin

@smitpatel I don't know what I am looking for.

EF6 produced a completely different join pattern than EF2; how can I create LINQ that produces the same SQL queries?

Please have a look at the single-include SQL comparison.

https://github.com/EvgenyMuryshkin/EFCorePerf/blob/main/diff.md

EF6 Split query looks similar to EF2, is that what you are after?

EvgenyMuryshkin avatar May 10 '22 05:05 EvgenyMuryshkin

How does the perf of the EF2 query compare with that of the EF6 split query?

smitpatel avatar May 10 '22 05:05 smitpatel

@smitpatel just single include, EF6 is faster (at the bottom - Distilled section), but as data gets added, EF6 falls behind, especially split query. See here https://github.com/EvgenyMuryshkin/EFCorePerf/blob/main/query.md ShippingUnitsWithComposites31AsSplitQuery and ShippingUnitsWithComposites32AsSplitQuery

At this time References are being included to query 32

EvgenyMuryshkin avatar May 10 '22 05:05 EvgenyMuryshkin