framework icon indicating copy to clipboard operation
framework copied to clipboard

[10.x] Add Lateral Join to Query Builder

Open Bakke opened this issue 1 year ago • 17 comments

This PR adds lateral join methods to the query builder. I have based the methods and tests upon joinSub as the behaviour is somewhat similar.

Lateral joins have been supported in PostgreSQL for a long time (since 9.3), and are also supported in MySQL since 8.0.14.

The lateral joins are a bit similar to a subquery join, as the right-hand table is expressed as a subquery. However the right-hand expression is executed for every row, which opens up for new use cases. It is particularly useful for joining in the top 'N' rows related to the main query. In example:

select * from "users" 
join lateral (
    select * from "contacts" 
    where "contracts"."user_id" = "users"."id"
    order by "contracts"."value" desc
    limit 5
) as "sub" 
    on true

Bakke avatar Feb 12 '24 09:02 Bakke

I think we would normally want to keep that translation to "lateral" SQL syntax in the query "grammar" layer.

taylorotwell avatar Feb 12 '24 16:02 taylorotwell

@taylorotwell Yes, that is probably best. Since "lateral" is not a join type, and should be placed after the "join" keyword, it might be necessary with an extra property in the JoinClause class. Either a string, or a $lateral boolean. The "on true" translation should probably also go into the "grammar" layer.

Bakke avatar Feb 12 '24 22:02 Bakke

SQL server has Cross Apply And Outer Apply.

CROSS APPLY = lateral join Outer Apply = left lateral join

Don't think it supports right lateral join

https://www.mssqltips.com/sqlservertip/1958/sql-server-cross-apply-and-outer-apply/

crissi avatar Feb 12 '24 22:02 crissi

@crissi Postgres and MySQL does not support right lateral join neither (it would make no sense). My mistake. Removing it from the PR now.

I will look into translating it into cross/outer apply for SQL server as well.

Bakke avatar Feb 12 '24 23:02 Bakke

I have moved "lateral" and "on true" translation into the grammar layer now, and added support for SQL server. I have tested joinLateral with PostgreSQL 16, MySQL 8 and SQL server 2022 in a samle app.

@taylorotwell I am a bit uncertain of two things in the PR:

  • I added a new boolean $lateral to the JoinClause class (with default false). This feels a bit dirty. Any better suggestions?
  • I added a joinLateral to the grammar layer, which accepts the current query builder and current join clause. This leads to code duplication for generating $tableAndNestedJoins, but provides more flexaility for each language translation. However it might be better to just pass the compiled $tableAndNestedJoins as a string, instead of the whole join clause?

Bakke avatar Feb 13 '24 14:02 Bakke

Just wanna chime in because MariaDB hasn't added support for lateral join yet. And Laravel considers MySQL and MariaDB as same RDBMS under mysql driver.

Rizky92 avatar Feb 14 '24 05:02 Rizky92

Just wanna chime in because MariaDB hasn't added support for lateral join yet. And Laravel considers MySQL and MariaDB as same RDBMS under mysql driver.

I know. MariaDB uses the same translations as MySQL, so it will fail with a syntax error as it is not supported yet. I do believe they are planning on implementing support though.

Bakke avatar Feb 14 '24 08:02 Bakke

👍 for this feature. @tpetry and I were just talking about it last week. We can also use lateral joins to improve the performance of some of Laravel's own queries.

Tobias has already implemented lateral joins for PostgreSQL in his laravel-postgresql-enhanced package and I would take inspiration from his code:

  • I think his method names (joinSubLateral() etc.) are more consistent since lateral joins are a special type of joinSub().
  • Tobias found a way to implement it without changing the JoinClause class: https://github.com/tpetry/laravel-postgresql-enhanced/blob/master/src/Query/BuilderLateralJoin.php#L20 Can we use the same approach? The new $lateral parameter isn't ideal.
  • Another feature of Tobias' implementation is that you can add custom join constraints if necessary (on a.foo = b.bar instead of only on true): https://github.com/tpetry/laravel-postgresql-enhanced/blob/master/src/Query/BuilderLateralJoin.php#L37

I also think it would be good to have integration tests (under tests/Integration/Database).

staudenmeir avatar Feb 15 '24 17:02 staudenmeir

@staudenmeir Good tip! I will checkout this package in detail and draw some inspiration from there.

  • I agree that joinSubLateral etc. are more consistent. I will change the method names.
  • Tobias' implementation would put the translation to "lateral" SQL syntax back into the query builder, and out of the grammar layer (where @taylorotwell requested it should be). This would make it more difficult to translate to cross apply / outer apply for SQL server in at meaningful way. I agree that the new $lateral parameter isn't ideal, but I have not found a better way of passing "lateral" to the grammar layer (as it is independent of the join $type). Any suggestions on how to solve this are welcome.
  • I thought of custom join constraints, instead of only on true. I didn't think they would be used, so I decided to drop the extra parameters for the methods. I can add support for this with a default to on true.

Bakke avatar Feb 15 '24 17:02 Bakke

@staudenmeir I am looking into custom join constraints (on a.foo = b.bar). SQL servers cross apply / outer apply don't support this, so these constraints will cause an error (or will have to be removed).

I tested lateral joins with custom constraints outside the subquery in Postgres. The results are the same as when I put the constraints inside the subquery. Performance also seems to be about the same.

Are there any use cases where putting the constraints outside the subquery provides an advantage? If not, I am afraid of just adding in unnecessary complexity that can cause confusion.

Bakke avatar Feb 16 '24 08:02 Bakke

There's not really any benefit of using a join condition compared to just always using ON true. But I am not sure anymore whether there was a reason for the join condition for the lateral left join. Or whether i just did it that way to fit to the other join methods.

Would be great if you could build the API to be the same as mine (when using the same name) to not break my package. You could e.g. only use the first two params (and always do ON true). The current implementation would break my implementation.

tpetry avatar Feb 16 '24 08:02 tpetry

@tpetry I am leaning against dropping the join conditions then, to keep it simple and consistent across different SQL implementations.

We have three choices to not break your current implementation in tpetry/laravel-postgresql-enhanced:

  • Keep the parameters, but ignore them and always do on true (I would like to avoid that)
  • Revert to my original method names, so they don't collide (joinLateral and leftJoinLateral)
  • Merge this against the 11.x branch (instead of 10.x) and do a breaking change

Bakke avatar Feb 16 '24 11:02 Bakke

I would suggest these three methods:

  • crossJoinSubLateral($query, string $as)
  • leftJoinSubLateral($query, string $as)

They fit to my implementation as I only add some parameters - which I believe are not even usefull. The joinSubLateral implementation of you with type = inner could even be dropped as it doesn't add anything. A lateral join is always used in a cross join behaviour, either by using a cross join (with parameters in the subquery) or by using a join condition.

So it would make the concept more easy by saying that the join target is always a subquery that needs to apply the filters and all results are used - as expected.

tpetry avatar Feb 16 '24 12:02 tpetry

@tpetry @staudenmeir I suggest reverting to joinLateral and leftJoinLateral, to keep the methods separate from the laravel-postgresql-enhanced package. That way, users can decide to switch from the package methods to builtin methods.

If I drop the joinSubLateral implementation with $type = 'inner' I will have to duplicate code in leftJoinSubLateral. I don't see a good way to keep joinSubLateral without breaking the package (I would have to add unused parameters), so I would rather rename it.

A lateral join in its simplest form is join lateral (sql subquery here) as "q" on true) (inner join). I think joinLateral reflects this in a simple way. joinSubLateral is a bit more descriptive, but collides with the package.

I would also like to avoid using "cross" in the method names, unless we explicitly include a cross join method. A lateral join is always used in a cross join behaviour, but you would probably write the SQL as join lateral or inner join lateral. I think a separate cross join method will be redundant.

In short: I think dropping "Sub" from the method names is a fair trade-off for not breaking the package, and not including any unused parameters.

Bakke avatar Feb 16 '24 23:02 Bakke

Renaming to not break my package would be nice 👍

But I still don't get the $type = 'inner' param of the query. Is that to supply a left value? A specific method for that would match more to the other query builder methods.

tpetry avatar Feb 19 '24 09:02 tpetry

Renaming to not break my package would be nice 👍

But I still don't get the $type = 'inner' param of the query. Is that to supply a left value? A specific method for that would match more to the other query builder methods.

$type = 'inner' is to supply the left value, yes. The only difference between joinLateral and leftJoinLateral is the left value. This way leftJoinLateral can just call joinLateral, without duplicating any code: https://github.com/laravel/framework/blob/b31ddd1b251a3b17b1e47faf72f448cd87d51793/src/Illuminate/Database/Query/Builder.php#L619

The same is done for leftJoinSub.

Bakke avatar Feb 19 '24 09:02 Bakke

Converting this to draft until the discussion is ironed out. Please mark as ready for review when you're ready for me to take a look.

taylorotwell avatar Feb 19 '24 15:02 taylorotwell

@Bakke I had another idea for the $lateral parameter: A LateralJoinClause subclass of JoinClause IMO, that's a better solution and "more OOP". The class would be empty, but compileJoins() can use it to detect lateral joins.

staudenmeir avatar Feb 20 '24 22:02 staudenmeir

@Bakke I had another idea for the $lateral parameter: A LateralJoinClause subclass of JoinClause IMO, that's a better solution and "more OOP". The class would be empty, but compileJoins() can use it to detect lateral joins.

I can change this and use if ($join instanceof LateralJoinClause), if the other participants and reviewer also thinks this is a better solution. Any other thoughts on this?

For the other discussion points:

  • I think we are in agreement to drop join constraints outside the subquery?
  • Is it ok to revert to joinLateral and leftJoinLateral, instead of joinLateralSub, so we don't break the tpetry/laravel-postgresql-enhanced package?

Bakke avatar Feb 20 '24 22:02 Bakke

  • Is it ok to revert to joinLateral and leftJoinLateral, instead of joinLateralSub, so we don't break the tpetry/laravel-postgresql-enhanced package?

Yes 😅

tpetry avatar Feb 21 '24 08:02 tpetry

@staudenmeir I rewrote the code to use a JoinLateralClause class instead now. @dkulyk I left this change as a separate commit. As a reviewer I thought you could decide if you like this solution better than the bool $lateral approach.

Bakke avatar Feb 21 '24 11:02 Bakke

Thanks. Can you please add integration tests? The queries should be tested against actual databases.

staudenmeir avatar Feb 21 '24 13:02 staudenmeir

Thanks. Can you please add integration tests? The queries should be tested against actual databases.

Should I create separate LateralJoinTest.php for MySQL, Postgres and SQL server under tests/Integration/Database?

Bakke avatar Feb 21 '24 13:02 Bakke

Yes, please.

staudenmeir avatar Feb 21 '24 13:02 staudenmeir

@staudenmeir I added some simple integration tests now. They are skipped for MySQL < 8.0.14 and MariaDB.

Bakke avatar Feb 21 '24 16:02 Bakke

Excellent, thanks. From my side, the PR is ready (I can't approve it).

staudenmeir avatar Feb 21 '24 16:02 staudenmeir

@staudenmeir Great, thanks! Marking it as ready for review so @taylorotwell can look over it as well.

Bakke avatar Feb 21 '24 17:02 Bakke

@staudenmeir I made a small change to the integration tests. limit is changed to 2 and assertCount is used to check the number of returned rows. I think it makes more sense to test that limit actually can control the number of returned rows (main query rows X subquery rows).

Bakke avatar Feb 21 '24 17:02 Bakke

Thanks!

taylorotwell avatar Feb 25 '24 15:02 taylorotwell

@Bakke can you send a PR to laravel/docs for this?

taylorotwell avatar Feb 27 '24 16:02 taylorotwell