squeel
squeel copied to clipboard
Controlling the Planner with Explicit JOIN Clauses (new feature ?) ?
Is it already possible ? If yes how ? Or Could it be added in features enhancement for futur releases ?
I'd like to be able to optimize queries like we do in SQL with parenthesis. It become significantely important when dealing with deeply nested table hierarchy and JOIN requests.
As explained here with Controlling the Planner with Explicit JOIN Clauses : http://www.postgresql.org/docs/9.2/static/explicit-joins.html
Extract :
In a simple join query, such as:
SELECT * FROM a, b, c WHERE a.id = b.id AND b.ref = c.id;
the planner is free to join the given tables in any order.
...
When a query only involves two or three tables, there aren't many join orders to worry about. But the number of possible join orders grows exponentially as the number of tables expands. Beyond ten or so input tables it's no longer practical to do an exhaustive search of all the possibilities, and even for six or seven tables planning might take an annoyingly long time. When there are too many input tables, the PostgreSQL planner will switch from exhaustive search to a genetic probabilistic search through a limited number of possibilities. (The switch-over threshold is set by the geqo_threshold run-time parameter.) The genetic search takes less time, but it won't necessarily find the best possible plan.
When the query involves outer joins, the planner has less freedom than it does for plain (inner) joins. For example, consider:
SELECT * FROM a LEFT JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
Although this query's restrictions are superficially similar to the previous example, the semantics are different because a row must be emitted for each row of A that has no matching row in the join of B and C. Therefore the planner has no choice of join order here: it must join B to C and then join A to that result. Accordingly, this query takes less time to plan than the previous query. In other cases, the planner might be able to determine that more than one join order is safe ...
Thanks for your answer
I think it's not much difference between your sql and select * from a left outer join (select b.id from b inner join c on b.ref = c.id) as b on a.id = b.id
, so you can just use a subquery in Squeel now, it doesn't need to add any new feature I think.
My request is all about taking performances into consideration. Subquerries are the best choice for fairly simple requests but with huge amount of data (like above 100'000 entries) when paging swap file and memory usage could be a limit issue for a "Join querries". "Joins querries" are the best choice for complexe requests involving many tables & operators, and could transform your performances from hours to minutes.