firebird
firebird copied to clipboard
Transform LEFT joins into INNER ones if the WHERE condition violates the outer join rules
Example:
SELECT *
FROM T1 LEFT JOIN T2 ON T1.ID = T2.ID
WHERE T2.FIELD1 = 0
In this case the condition T2.FIELD1 = 0 effectively removes all the "fake NULL" rows of T2, so the result is the same as for the INNER JOIN. However, the optimizer is forced to use the T1->T2 join order while T2->T1 could also be considered. Also, the optimizer cannot use hash/merge algorithms for joining. It makes sense to detect this case during join processing and internally replace LEFT with INNER before optimization starts.
This is primarily intended to improve "ad hoc" and machine-generated (e.g. ORM) queries. However, it may also somewhat (from the performance POV) break user applications, as many people used to "hint" the optimizer by writing LEFT intentionally, exactly to "pin" the manually chosen join order instead of relying on the optimizer. Usually, the join is performed by the PK/FK reference and the fake WHERE condition explicitly checks that PK (which is known to never be NULL by its own) to produce the INNER result.
The proposed improvement, however, may work around this compatibility issue. In particular, checks for NULL, e.g. WHERE T2.ID IS NOT NULL for the join example above, would not transform LEFT into INNER, because theoretically they may also be used for checking fake NULLs inside T2 after LEFT JOIN. However, regular comparisons that ignore NULLs by their nature, will cause the LEFT->INNER transformation. So only hints using very artificial checks like T2.ID > 0 will be affected.
This patch has been tested in production for a while and nobody stepped on problems yet. However, in order to minimize possible risks, perhaps it should be added to the major release (v5) only, or we could provide a per-database compatibility switch in firebird.conf.
Hi Dmitry!
I recommend that this could be a per-database compatibility switch in firebird.conf, and turn off as default setting. We intentionally use this method in places where Firebird uses a bad plan for very complex querys. It would be a serious job to explore all the places where we use this solution. And we don't know for sure whether every colleague always uses a primary key or unique key field in the where condition to check not null.
András
In my opinion, this should not be off by default, new features or improvements that are off by default don't get used.
Per-database options is a huge pain for support. In the major release it would be better to register cases of generated bad plans as bugs for separate optimizer improvements.
@mrotteveel : We have clients who do not have physical access to the firebird config files (or firebird is used by other applications with different needs). Databases are created dynamically using our program without modifying and fierbird config files... If this behaviour can be modified in an on connect trigger, that is a sufficient solution too.
@aafemt : we have not reported these cases so far because
- it takes too much time a test database
- the plan seems logical (analyzing the index statistics), but
- cause a significant record reading surplus, leading to a serious slowdown
Nevertheless, if such an anomaly occurs again, I promise we will try to bring it into a reportable format.
@omachtandras, it's not really a question whether PK/UK field is used or some other fields. If that field is not nullable and you check it via FIELD IS NOT NULL -- everything is OK, nothing will be changed for you. If you use some other conditions (like FIELD > 0 mentioned in this ticket), then plans may change. Can you check that?
@dyemanov : thanks for the clarification. For sure, we always use a not nullable field with field is not null in the where conditions.
I think it's worth adding that verification (that IS NOT NULL does not break plans) to our QA tests if/when this improvement is committed.
@omachtandras , @aafemt this is the problematic case for us : https://github.com/FirebirdSQL/firebird/issues/1476 In this case we use the LEFT OUTER workaround.
If I understand correctly, it can be disabled like this
SET QUERY TRANSFORMATION OFF
in the BEFORE CONNECT trigger or just before the query is executed
Another MSSQL-style option is the last thing Firebird needs IMHO.
@aafemt I agree, but the globally disabling of some optimizer algorithms would also not hurt, since it is not always possible to rewrite the query in such a way as to assign hints to it.
If you want hints - implement hints. Or use already existing PLAN clause.
I vote for hints and for global switches. The PLAN clause is too limited, and in my opinion, even harmful.