JOIN instead of CROSS APPLY in generated query in SQL Server
EF Core Preview 5 would generate CROSS APPLY from a LINQ query like this:
from navObject in Context.NavObjects
join vessel in Context.Vessels on navObject.VesselId equals vessel.VesselId
from passage in Context.Passages
    .Where(x => x.VesselId == navObject.VesselId && x.ActualDepartureTime.Value <= fromTime)
    .OrderByDescending(x => x.ActualDepartureTime)
    .Take(1)
    .DefaultIfEmpty()
The generated query would be:
SELECT ... FROM [NavObject] AS [no]
INNER JOIN [Vessel] AS [vessel] ON [no].[ObjectId] = [vessel].[ObjectId]
CROSS APPLY (
    SELECT TOP(1) [x].*
    FROM [Passage] AS [x]
    WHERE ([x].[ObjectId] = [no].[ObjectId]) AND ([x].[ActualDepartureTime] <= @__fromTime_1)
    ORDER BY [x].[ActualDepartureTime] DESC
) AS [t]
In RC1 the query contains JOINs over SELECTs from SELECTs, which cause bad performance and timeouts:
SELECT ... FROM [NavObject] AS [n]
INNER JOIN [Vessel] AS [v] ON [n].[ObjectId] = [v].[ObjectId]
INNER JOIN (
    SELECT [t].....
    FROM (
        SELECT [p]...., ROW_NUMBER() OVER(PARTITION BY [p].[ObjectId] ORDER BY [p].[ActualDepartureTime] DESC) AS [row]
        FROM [Passage] AS [p]
        WHERE ([p].[ActualDepartureTime] <= @__fromTime_1)
    ) AS [t]
    WHERE [t].[row] <= 1
) AS [t0] ON [n].[ObjectId] = [t0].[ObjectId]
As you can see, the Preview 5 query is straightforward and efficient, while the RC1 query is not. Please fix this query generation pattern.
Further technical details
EF Core version: 3.0 RC1 (versus 3.0 Preview 5)
Database provider: Microsoft.EntityFrameworkCore.SqlServer
Target framework: .NET Core 3.0
Operating system: Windows 10
IDE: Visual Studio 2019 16.2.5
@roji - I believe you investigated some perf work around ROW_NUMBER. Can you explain why generating ROW_NUMBER is more efficient than CROSS APPLY?
I'll take a look at this soon.
Here's some very basic research I did back in June on window functions vs. the PostgreSQL lateral join (which is equivalent to SQL Server CROSS APPLY in this context), showing window functions to be superior. I'll investigate the scenario above more deeply to understand the exact differences etc.
See also this article referenced by @divega at the time: https://blog.jooq.org/2017/09/22/how-to-write-efficient-top-n-queries-in-sql/
Here is a comparison between lateral join and window function, for getting the first row. Window functions win hands down (look at the second cost number on the outermost plan node). In theory the plan could vary due to table size (I only had very few rows), but I have no time to go deeper.
For info on reading PostgreSQL EXPLAIN results: https://www.postgresql.org/docs/current/using-explain.html
Lateral join
test=# EXPLAIN SELECT o.*
test-# FROM customers AS c,
test-# LATERAL (
test(#     SELECT *
test(#     FROM orders
test(#     WHERE customer_id = c.id
test(#     ORDER BY price
test(#     LIMIT 1
test(# ) as o;
                                QUERY PLAN
---------------------------------------------------------------------------
 Nested Loop  (cost=35.55..45199.77 rows=1270 width=12)
   ->  Seq Scan on customers c  (cost=0.00..22.70 rows=1270 width=4)
   ->  Limit  (cost=35.55..35.55 rows=1 width=12)
         ->  Sort  (cost=35.55..35.57 rows=10 width=12)
               Sort Key: orders.price
               ->  Seq Scan on orders  (cost=0.00..35.50 rows=10 width=12)
                     Filter: (customer_id = c.id)
Window function
test=# EXPLAIN SELECT o.*
test-# FROM customers AS c
test-# JOIN (
test(#     SELECT *, rank() OVER (PARTITION BY o.customer_id ORDER BY o.id)
test(#     FROM orders AS o
test(# ) AS o ON o.customer_id = c.id AND o.rank = 1;
                                         QUERY PLAN
---------------------------------------------------------------------------------------------
 Hash Join  (cost=208.97..236.53 rows=10 width=20)
   Hash Cond: (c.id = o.customer_id)
   ->  Seq Scan on customers c  (cost=0.00..22.70 rows=1270 width=4)
   ->  Hash  (cost=208.84..208.84 rows=10 width=20)
         ->  Subquery Scan on o  (cost=142.54..208.84 rows=10 width=20)
               Filter: (o.rank = 1)
               ->  WindowAgg  (cost=142.54..183.34 rows=2040 width=20)
                     ->  Sort  (cost=142.54..147.64 rows=2040 width=12)
                           Sort Key: o_1.customer_id, o_1.id
                           ->  Seq Scan on orders o_1  (cost=0.00..30.40 rows=2040 width=12)
In our case with MS SQL, the latter query (SELECTs with JOINs over SELECTs) could not complete at all. Either PostgreSQL and MS SQL optimize such queries in very different ways, or the optimizations heavily depend on other database specifics, such as indexes.
Thanks @dmitry-slabko, I'll do a more in-depth investigation and comparison of the two databases in the next few days.
@roji - I can send you the two complete queries, as they are a bit larger than what I initially posted, along with their execution plans.
That could definitely help - along with the actual schema please. The actual data could also be relevant, or at least the number of rows - plans can sometimes be different based on table size etc.
This is the full LINQ query:
from navObject in Context.NavObjects
join vessel in Context.Vessels on navObject.VesselId equals vessel.VesselId
from passage in Context.Passages
    .Where(x => x.VesselId == navObject.VesselId && x.ActualDepartureTime.HasValue && x.ActualDepartureTime.Value <= fromTime)
    .OrderByDescending(x => x.ActualDepartureTime)
    .Take(1)
    .DefaultIfEmpty()
from simulationPoint in Context.SimulationDetails
    .Where(s => !isPresent && s.PassageId == passage.PassageId && s.Time <= fromTime)
    .OrderByDescending(s => s.Time)
    .Take(1)
    .DefaultIfEmpty()
from position in Context.Positions
    .Where(p => p.VesselId == navObject.VesselId && p.Time <= fromTime)
    .OrderByDescending(p => p.Time)
    .Take(1)
    .DefaultIfEmpty()
from trackPoint in Context.TrackPoints
    .Where(t => t.VesselId == navObject.VesselId && t.Time <= fromTime)
    .OrderByDescending(t => t.Time)
    .Take(1)
    .DefaultIfEmpty()
where simulationPoint != null || position != null || trackPoint != null
I attached both generated queries, their execution plans, and schema files for involved tables. plans-and-queries.zip
Thanks. As that query uses the same construct multiple times, if there's a perf issue it would indeed be magnified here. I'll dive into this soon.
@dmitry-slabko I'm exploring these two options and seeing some interesting phenomena with window functions and cross apply.
However, to complete the picture, could you please also submit your EF Core model? A simple, self-contained console program would be ideal, including the model and the problematic query.
Note also that the tables.sql you provided contains some issues (e.g. references to [dbo].[ObjectType] which does not exist)
@ajcvickers - I propose punting this issue.
I'd like to at least complete the investigation, even though I agree there's little chance we'll actually change anything for 3.1.
Ok, some more input on this problem. Here is the LINQ:
from vessel in Context.Vessels.Where(...)
from position in Context.Positions
    .Where(t => t.VesselId == vessel.VesselId && t.Time <= fromTime)
    .OrderByDescending(s => s.Time)
    .Take(1)
    .DefaultIfEmpty()
select new LocationPoint ...
The intent is to get the latest point for each vessel ID. In 3.0 Preview 5 this would generate SQL like this:
SELECT ... FROM [Vessel] AS [v]
CROSS APPLY (
    SELECT [t3].*
    FROM (
        SELECT NULL AS [empty]
    ) AS [empty1]
    LEFT JOIN (
        SELECT TOP(1) [p0].*
        FROM [Position] AS [p0]
        WHERE ([p0].[ObjectId] = [v].[ObjectId]) AND ([p0].[Time] <= @__fromTime_4)
        ORDER BY [p0].[Time] DESC
    ) AS [t3] ON 1 = 1
) AS [t4]
The subquery to retrieve data from Position is effectively filtered.
Now, in every build after Preview 5 and up to the 3.1 release, the query looks like this:
SELECT ... FROM [Vessel] AS [v]
LEFT JOIN (
    SELECT ...
    FROM (
        SELECT ..., ROW_NUMBER() OVER(PARTITION BY [p].[ObjectId] ORDER BY [p].[Time] DESC) AS [row]
        FROM [Position] AS [p]
        WHERE [p].[Time] <= @__fromTime_1
    ) AS [t]
    WHERE [t].[row] <= 1
) AS [t0] ON [v].[ObjectId] = [t0].[ObjectId]
And this is the problem: the inner subquery retrieves all rows from the Position table, which in our case is 16+ million rows and may be far more for some other customers. Moreover, the subquery is executed for each row in the outer query. So it appears that the use of partitioned queries for MS SQL was based on wrong assumptions, as this pattern generates queries that do not perform well even on small data sets, while on large data sets they simply kill the reader.
I cannot say how this pattern behaves on other servers, such as PostgreSQL and Oracle, but for MS SQL it is not applicable. I would highly recommend changing the query generation pattern for such LINQ expressions back to what it was up until 3.0 Preview 5.
@smitpatel just a quick guess, but shouldn't the WHERE [t].[row] <= 1 in the second query be inside the inner-most subquery? Not sure this would have an impact but seems a bit more correct regardless?
In any case I'll try to find time to look into this, although it won't be right away.
I tried to execute the innermost query, and its performance issue comes from this part:
ROW_NUMBER() OVER(PARTITION BY [t].[VesselId] ORDER BY [t].[EventDateTimeUTC] DESC) AS [row]
Most likely the bad performance here is due to the large row count. So moving the check on t.row <= 1 will not help, as the partitioning itself is the culprit.
One more note: the table even has an index on VesselId + EventDateTimeUTC columns, yet it does not help.
just a quick guess, but shouldn't the WHERE [t].[row] <= 1 in the second query be inside the inner-most subquery? Not sure this would have an impact but seems a bit more correct regardless?
Nope. The innermost subquery computes the value of row in its projection. Any filter on such a computed column has to be at an outer level; WHERE cannot reference a computed column from the same projection.
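To illustrate with a minimal sketch (mirroring the table and column names from the generated SQL above):
-- Invalid: WHERE is evaluated before the SELECT projection, so neither the
-- [row] alias nor the window function itself may be referenced here.
SELECT [p].[VesselId], [p].[Time],
    ROW_NUMBER() OVER(PARTITION BY [p].[VesselId] ORDER BY [p].[Time] DESC) AS [row]
FROM [Position] AS [p]
WHERE [row] <= 1 -- error: invalid column name 'row'

-- Valid: compute the row number in an inner query, then filter at the outer level.
SELECT [t].[VesselId], [t].[Time]
FROM (
    SELECT [p].[VesselId], [p].[Time],
        ROW_NUMBER() OVER(PARTITION BY [p].[VesselId] ORDER BY [p].[Time] DESC) AS [row]
    FROM [Position] AS [p]
) AS [t]
WHERE [t].[row] <= 1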
Ok, yet another note :) I created a composite index on VesselId ASC and EventDateTimeUTC DESC, and with this index the query finally got to behave. However, I do not think this is a proper solution - we cannot have indexes for all possible querying needs.
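For reference, the index described would look roughly like this (the table name is not given in the thread, so [Position] is assumed here):
-- Composite index matching the PARTITION BY / ORDER BY of the generated query;
-- the descending second key lets the engine read each partition newest-first.
CREATE NONCLUSTERED INDEX [IX_Position_VesselId_EventDateTimeUTC]
    ON [Position] ([VesselId] ASC, [EventDateTimeUTC] DESC);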
Note from triage: putting this in 5.0 to test the perf here. This doesn't mean we will necessarily change anything for the 5.0 release.
The main difference between the queries as they are now (with OVER ... PARTITION BY) and as they were in Preview 5 (with CROSS APPLY) is that now they operate over the whole table, thus giving bad performance on large data sets regardless of how good the indexes are, whereas before, the CROSS APPLY section had a limiting condition that forced the query to work only on a limited row set, thus giving much better performance. This is why the current pattern degrades performance to a degree where it must be considered a bug.
The main difference between the queries as they are now (with OVER ... PARTITION BY) and as they were in Preview 5 (with CROSS APPLY) is that now they operate over the whole table
That's not quite true - there are WHERE clauses there that restrict the scan, just like there are in the CROSS APPLY case. The fact that the WHERE clause is in an outer query in principle doesn't have to make a difference here. I also ran some preliminary perf tests before and the perf characteristics are not as simple as they seem.
I'm not denying this needs to be investigated (we do have this in the 5.0.0 milestone) - I definitely intend to get around to it and understand the full picture.
WHERE ([p0].[ObjectId] = [v].[ObjectId]) AND ([p0].[Time] <= @__fromTime_4) will produce a smaller result set than WHERE [p].[Time] <= @__fromTime_1, and this is critical for a subquery expression.
Well, basically I see only two options for us to work around this problem, as it is a show-stopper: either create a stored procedure with a query that works well, or fork and fix the query generator for our needs... We cannot wait until 5.0 (possibly) solves this.
@dmitry-lipetsk databases don't necessarily evaluate an entire subquery, extract the result, and then apply filters; the planner may have knowledge about WHERE clauses from outside the subquery. It's especially risky to assume anything like this across databases.
I'd consider defining the appropriate index(es) or using raw SQL before trying to change the query generator. Unfortunately there's little chance I'll be able to investigate this in the very near future, as there are many other things in the pipe.
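For example, a hand-written query in the Preview 5 shape could be run through FromSqlRaw as an interim workaround. This is only a sketch, reusing the table and column names quoted earlier; the parameter name is illustrative:
-- Sketch: hand-written CROSS APPLY in the Preview 5 shape,
-- keeping the correlated filter inside the subquery.
SELECT [v].*, [t].*
FROM [Vessel] AS [v]
CROSS APPLY (
    SELECT TOP(1) [p].*
    FROM [Position] AS [p]
    WHERE [p].[ObjectId] = [v].[ObjectId] AND [p].[Time] <= @fromTime
    ORDER BY [p].[Time] DESC
) AS [t]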
@dmitry-slabko I also ran into the apply issue, back in EF Core 2.0: I needed to apply multiple subqueries to the main query and then keep composing over the result, and in complicated query chains EF Core 2 did a lot of client-side evaluation.
After exploring the source code, I wrote the following extension:
// Result holder for the joined pair; its definition was not shown in the
// original post, so this shape is inferred from the usage below.
public class JoinResult<TOuter, TInner>
{
    public TOuter Outer { get; set; }
    public TInner Inner { get; set; }
}

public static IQueryable<JoinResult<TOuter, TInner>> ApplySubQuery<TOuter, TInner>(this IQueryable<TOuter> outerQuery, Expression<Func<TOuter, IEnumerable<TInner>>> collectionSelector)
{
    var joinedQuery = outerQuery.SelectMany(collectionSelector, (o, i) => new JoinResult<TOuter, TInner>
    {
        Outer = o,
        Inner = i
    });
    return joinedQuery;
}
LINQ examples:
var resultUsingCrossApply = await _dbContext
    .Patients
    .ApplySubQuery(x => _dbContext.Appointments.Where(y => y.PatientId == x.Id).OrderByDescending(y => y.ConfirmationDate).Take(1))
    .Select(x => new
    {
        x.Outer.FullName,
        x.Inner.ConfirmationDate,
        x.Inner.AppointmentType.Name
    })
    .OrderBy(x => x.FullName)
    .Take(100)
    .ToListAsync();

var resultUsingPseudoOuterApply = await _dbContext
    .Patients
    .OrderBy(x => x.LastName)
    .Take(100)
    .ApplySubQuery(x => _dbContext.Appointments.Where(y => y.PatientId == x.Id).OrderByDescending(y => y.ConfirmationDate).Take(1).DefaultIfEmpty())
    .Select(x => new
    {
        x.Outer.FullName,
        x.Inner.ConfirmationDate,
        x.Inner.AppointmentType.Name
    })
    .OrderBy(x => x.FullName)
    .Take(100)
    .ToListAsync();
Produced SQL:
exec sp_executesql N'SELECT TOP(@__p_0) [x].[FullName], [t].[ConfirmationDate], [i.AppointmentType].[Name]
FROM [dbo].[Patients] AS [x]
CROSS APPLY (
    SELECT TOP(1) [y].*
    FROM [dbo].[Appointments] AS [y]
    WHERE [y].[PatientId] = [x].[Id]
    ORDER BY [y].[ConfirmationDate] DESC
) AS [t]
INNER JOIN [dbo].[AppointmentTypes] AS [i.AppointmentType] ON [t].[AppointmentTypeId] = [i.AppointmentType].[Id]
ORDER BY [x].[FullName]',N'@__p_0 int',@__p_0=100

exec sp_executesql N'SELECT TOP(@__p_1) [t].[FullName], [t1].[ConfirmationDate], [i.AppointmentType].[Name]
FROM (
    SELECT TOP(@__p_0) [x].*
    FROM [dbo].[Patients] AS [x]
    ORDER BY [x].[LastName]
) AS [t]
CROSS APPLY (
    SELECT [t0].*
    FROM (
        SELECT NULL AS [empty]
    ) AS [empty]
    LEFT JOIN (
        SELECT TOP(1) [y].*
        FROM [dbo].[Appointments] AS [y]
        WHERE [y].[PatientId] = [t].[Id]
        ORDER BY [y].[ConfirmationDate] DESC
    ) AS [t0] ON 1 = 1
) AS [t1]
LEFT JOIN [dbo].[AppointmentTypes] AS [i.AppointmentType] ON [t1].[AppointmentTypeId] = [i.AppointmentType].[Id]
ORDER BY [t].[FullName]',N'@__p_1 int,@__p_0 int',@__p_1=100,@__p_0=100
But after migrating my project to EF Core 3.0, I noticed that this approach started to produce ROW_NUMBER + JOIN instead of APPLY. Therefore, I dived into the EF Core source code again and added a "hack" to my extension:
// Cached MethodInfo for Enumerable.Where<T>(IEnumerable<T>, Func<T, bool>).
// The original post did not show this field; this definition is an assumption.
private static readonly MethodInfo whereMethodInfo = typeof(Enumerable).GetMethods()
    .Single(m => m.Name == nameof(Enumerable.Where)
        && m.GetParameters()[1].ParameterType.GetGenericArguments().Length == 2);

public static IQueryable<JoinResult<TOuter, TInner>> ApplySubQuery<TOuter, TInner>(this IQueryable<TOuter> outerQuery, Expression<Func<TOuter, IEnumerable<TInner>>> collectionSelector)
{
    // build hack condition to force EF Core 3.0 to generate APPLY:
    // append a Where(i => outer != null) predicate so the subquery stays
    // correlated with the outer parameter
    var outerParameter = collectionSelector.Parameters[0];
    var notNullExpression = Expression.NotEqual(outerParameter, Expression.Constant(null, outerParameter.Type));
    var innerParameter = Expression.Parameter(typeof(TInner));
    Expression<Func<TInner, bool>> notNullLambda = Expression.Lambda<Func<TInner, bool>>(notNullExpression, innerParameter);

    var whereGenericMethod = whereMethodInfo.MakeGenericMethod(typeof(TInner));
    var collectionSelectorBody = Expression.Call(whereGenericMethod, collectionSelector.Body, notNullLambda);
    collectionSelector = Expression.Lambda<Func<TOuter, IEnumerable<TInner>>>(collectionSelectorBody, outerParameter);

    var joinedQuery = outerQuery.SelectMany(collectionSelector, (o, i) => new JoinResult<TOuter, TInner>
    {
        Outer = o,
        Inner = i
    });
    return joinedQuery;
}
In that case, EF Core 3.0 generates APPLY operators. Also, unlike 2.2, for DefaultIfEmpty it generates OUTER APPLY instead of CROSS APPLY plus two subqueries with a join:
exec sp_executesql N'SELECT TOP(@__p_0) [p].[FullName], [t0].[ConfirmationDate], [a0].[Name]
FROM [dbo].[Patients] AS [p]
CROSS APPLY (
    SELECT [t].*
    FROM (
        SELECT TOP(1) [a].*
        FROM [dbo].[Appointments] AS [a]
        WHERE [a].[PatientId] = [p].[Id]
        ORDER BY [a].[ConfirmationDate] DESC
    ) AS [t]
    WHERE CAST(1 AS bit) = CAST(1 AS bit)
) AS [t0]
INNER JOIN [dbo].[AppointmentTypes] AS [a0] ON [t0].[AppointmentTypeId] = [a0].[Id]
ORDER BY [p].[FullName]',N'@__p_0 int',@__p_0=100

exec sp_executesql N'SELECT TOP(@__p_0) [p].[FullName], [t0].[ConfirmationDate], [a0].[Name]
FROM [dbo].[Patients] AS [p]
OUTER APPLY (
    SELECT [t].*
    FROM (
        SELECT TOP(1) [a].*
        FROM [dbo].[Appointments] AS [a]
        WHERE [a].[PatientId] = [p].[Id]
        ORDER BY [a].[ConfirmationDate] DESC
    ) AS [t]
    WHERE CAST(1 AS bit) = CAST(1 AS bit)
) AS [t0]
LEFT JOIN [dbo].[AppointmentTypes] AS [a0] ON [t0].[AppointmentTypeId] = [a0].[Id]
ORDER BY [p].[FullName]',N'@__p_0 int',@__p_0=100
Also, as you can see, EF generates a redundant condition: WHERE CAST(1 AS bit) = CAST(1 AS bit). This happens only because the dynamic expression x => x != null is appended to the end of the collection expression. If you instead write it manually inside your Where clause (like ApplySubQuery(x => _dbContext.Appointments.Where(y => y.PatientId == x.Id && x != null).OrderByDescending(y => y.ConfirmationDate).Take(1))), then this SQL statement will be omitted.
The main disadvantage of this approach, for me, is the restriction of the SelectMany method signature: you can't return an aggregate result from the subquery expression; the selector must always return a sequence.
Hope that helps with your issues/research.
@dmitry-slabko also, using this approach I'm finally able to use columns from a subquery in a GroupBy clause, which works around one of the "fundamental" issues of EF Core.
EF Core duplicates the subquery "statement" into other clauses such as WHERE, which can also hurt performance or not work at all (as in the GroupBy case).
I tried to build a custom translator/generator to implement ranking functions such as ROW_NUMBER, RANK, etc., and succeeded, until I tried to use the rank column in a later step of the expression, in a WHERE clause for example. In that case, EF Core duplicated my "ROW_NUMBER() OVER..." statement into the WHERE clause, which is invalid from a SQL perspective, instead of pushing it into a subquery and referencing it. And this happens constantly with EF Core. That's why I said it is a fundamental issue.
Standard LINQ, non-working:
await _dbContext
    .Patients
    .Select(x => new
    {
        x.ClinicCreatedId,
        ConfirmationDate = _dbContext.Appointments
            .Where(y => y.PatientId == x.Id)
            .OrderByDescending(y => y.ConfirmationDate)
            .Select(y => y.ConfirmationDate)
            .FirstOrDefault()
    })
    .GroupBy(x => new { x.ClinicCreatedId, x.ConfirmationDate })
    .Select(x => new
    {
        x.Key.ClinicCreatedId,
        x.Key.ConfirmationDate,
        PatientCount = x.Count()
    })
    .ToListAsync();
It produces invalid SQL:
SELECT [p].[ClinicCreatedId], (
    SELECT TOP(1) [a].[ConfirmationDate]
    FROM [dbo].[Appointments] AS [a]
    WHERE [a].[PatientId] = [p].[Id]
    ORDER BY [a].[ConfirmationDate] DESC) AS [ConfirmationDate], COUNT(*) AS [PatientCount]
FROM [dbo].[Patients] AS [p]
GROUP BY [p].[ClinicCreatedId], (
    SELECT TOP(1) [a].[ConfirmationDate]
    FROM [dbo].[Appointments] AS [a]
    WHERE [a].[PatientId] = [p].[Id]
    ORDER BY [a].[ConfirmationDate] DESC)
Here is the working example:
await _dbContext
    .Patients
    .ApplySubQuery(x => _dbContext.Appointments.Where(y => y.PatientId == x.Id).OrderByDescending(y => y.ConfirmationDate).Take(1))
    .Select(x => new
    {
        x.Outer.ClinicCreatedId,
        x.Inner.ConfirmationDate
    })
    .GroupBy(x => new { x.ClinicCreatedId, x.ConfirmationDate })
    .Select(x => new
    {
        x.Key.ClinicCreatedId,
        x.Key.ConfirmationDate,
        PatientCount = x.Count()
    })
    .ToListAsync();
SELECT [p].[ClinicCreatedId], [t0].[ConfirmationDate], COUNT(*) AS [PatientCount]
FROM [dbo].[Patients] AS [p]
CROSS APPLY (
    SELECT [t].*
    FROM (
        SELECT TOP(1) [a].*
        FROM [dbo].[Appointments] AS [a]
        WHERE [a].[PatientId] = [p].[Id]
        ORDER BY [a].[ConfirmationDate] DESC
    ) AS [t]
    WHERE CAST(1 AS bit) = CAST(1 AS bit)
) AS [t0]
GROUP BY [p].[ClinicCreatedId], [t0].[ConfirmationDate]
@itan-mcp - Your last post is not related to this issue, as there is no RowNumber involved. Grouping by a scalar subquery generating invalid SQL is being tracked at https://github.com/aspnet/EntityFrameworkCore/issues/19027
@smitpatel Thank you. Moved to appropriate thread.
Here is a comparison between lateral join and window function, for getting the first row. Window functions win hands down (look at the second cost number on the outermost plan node). In theory the plan could vary due to table size (I only had very few rows), but I have no time to go deeper.
That test handles the case where we fetch all rows from the first table. When we fetch only one or a few rows from the left table, the results vary a lot depending on the time complexity of the algorithm chosen by the query planner.
Example:
Let there be a 1:N relationship between tbl1 and tbl2. Let tbl1 have tens of thousands of rows, with a handful of rows in tbl2 for every row in tbl1. We assume indexes exist for the foreign key and all identifiers (a sketch of such a schema follows).
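For concreteness, the assumed schema might look like this (names taken from the example; the exact DDL is hypothetical):
-- Two tables in a 1:N relationship; primary keys are indexed implicitly,
-- and the composite index covers the foreign key plus the sort column.
CREATE TABLE tbl1 (
    id   INT PRIMARY KEY,
    col1 INT
);
CREATE TABLE tbl2 (
    id     INT PRIMARY KEY,
    col1   INT,
    tbl1id INT,
    FOREIGN KEY (tbl1id) REFERENCES tbl1 (id)
);
CREATE INDEX ix_tbl2_tbl1id ON tbl2 (tbl1id, id);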
db.tbl1s.Include(t1 => t1.tbl2s.OrderBy(t2 => t2.id).Take(1)).Where(t1 => t1.id == 3).ToList();
db.tbl1s.Include(t1 => t1.tbl2s.OrderBy(t2 => t2.id).Take(1)).Take(10).ToList();
SQL:
SELECT `t`.`id`, `t`.`col1`, `t2`.`id`, `t2`.`col1`, `t2`.`tbl1id`
FROM `tbl1` AS `t`
LEFT JOIN (
    SELECT `t1`.`id`, `t1`.`col1`, `t1`.`tbl1id`
    FROM (
        SELECT `t0`.`id`, `t0`.`col1`, `t0`.`tbl1id`, ROW_NUMBER() OVER(PARTITION BY `t0`.`tbl1id` ORDER BY `t0`.`id`) AS `row`
        FROM `tbl2` AS `t0`
    ) AS `t1`
    WHERE `t1`.`row` <= 1
) AS `t2` ON `t`.`id` = `t2`.`tbl1id`
WHERE `t`.`id` = 3
ORDER BY `t`.`id`, `t2`.`tbl1id`, `t2`.`id`
and
SELECT `t0`.`id`, `t0`.`col1`, `t3`.`id`, `t3`.`col1`, `t3`.`tbl1id`
FROM (
    SELECT `t`.`id`, `t`.`col1`
    FROM `tbl1` AS `t`
    LIMIT 10
) AS `t0`
LEFT JOIN (
    SELECT `t2`.`id`, `t2`.`col1`, `t2`.`tbl1id`
    FROM (
        SELECT `t1`.`id`, `t1`.`col1`, `t1`.`tbl1id`, ROW_NUMBER() OVER(PARTITION BY `t1`.`tbl1id` ORDER BY `t1`.`id`) AS `row`
        FROM `tbl2` AS `t1`
    ) AS `t2`
    WHERE `t2`.`row` <= 1
) AS `t3` ON `t0`.`id` = `t3`.`tbl1id`
ORDER BY `t0`.`id`, `t3`.`tbl1id`, `t3`.`id`
PostgreSQL can handle the first one very efficiently, by pushing the id = 3 condition down into the subquery, so the full tbl2 does not need to be scanned, only the rows with tbl1id = 3. Also, even if we omit the ORDER BY, PostgreSQL cannot optimize the <= 1 to understand that it only needs to look at one single row; it will evaluate the row_number() function for every row in the partition and then pick those (the only one) with <= 1.
PostgreSQL does not seem to handle the second query efficiently: it will perform a full table scan on tbl2. For some reason, if we remove the ordering condition in the partition, it will use an index scan and be fast.
MySQL 8 performs a full table scan on tbl2 for both cases and will hence get a very bad performance if tbl2 is big.
Both PostgreSQL and MySQL 8, however, manage to use the indexes correctly if we use a LATERAL JOIN construction with WHERE and LIMIT expressions inside, avoiding a full table scan on tbl2.
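For reference, the LATERAL shape meant here looks roughly like this (a sketch in PostgreSQL/MySQL 8 syntax, reusing the tables above):
-- Sketch: lateral join that lets the planner probe the (tbl1id, id) index
-- once per selected tbl1 row instead of scanning tbl2 fully.
SELECT t0.id, t0.col1, o.id, o.col1, o.tbl1id
FROM (SELECT * FROM tbl1 LIMIT 10) AS t0
LEFT JOIN LATERAL (
    SELECT *
    FROM tbl2
    WHERE tbl2.tbl1id = t0.id
    ORDER BY tbl2.id
    LIMIT 1
) AS o ON TRUE;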
So I think this issue should really be reconsidered, with EF going back to lateral joins instead, at least when the left-hand side of the join is limited by some condition rather than covering all rows in the table. It should be a pretty common task to fetch the top 1 or top N rows from a child table for one or a few entities in an outer table.
Note that for MariaDB, older MySQL, and SQLite, which don't support lateral joins, the window functions must be used anyway. An exception: if we only want to fetch a single row, we can do it like this:
SELECT *,
(SELECT tbl2.id FROM tbl2 WHERE tbl2.tbl1id = tbl1.id ORDER BY tbl2.id LIMIT 1) AS tbl2id,
(SELECT tbl2.col1 FROM tbl2 WHERE tbl2.tbl1id = tbl1.id ORDER BY tbl2.id LIMIT 1) AS tbl2col1,
...
FROM tbl1
This is what the MySQL provider for EF6 does where other providers use lateral joins. Maybe not so efficient, but it works at least.