Changed plan (possible regression) in FB 5.x / 6.x comparing to FB 3.x / 4.x
Problem was reported by one of our customer privately. Consider script from attached .zip ddl-and-run_-_for-empty-tables.sql.zip
Query that caused problems in customer environment has following plans:
- for FB 3.x and 4.x:
Select Expression
-> Aggregate
-> Filter
-> Nested Loop Join (outer)
-> Nested Loop Join (outer)
-> Filter
-> Table "PAYMENTSDETAILS" as "PD" Access By ID
-> Bitmap
-> Index "AAA_PD_PERIODFROMDATE" Range Scan (lower bound: 1/1)
-> Filter
-> Table "PAYMENTSH" as "PH" Access By ID
-> Bitmap
-> Index "PK_PAYMENTSH_MOMENT_GENEMPID" Unique Scan
-> Filter
-> Table "INVOICEH" as "IH" Access By ID
-> Bitmap Or
-> Bitmap
-> Index "PK_INVOICEH_INVID" Unique Scan
-> Bitmap
-> Index "INVOICEHBYPARENTINVOICEIDDESC" Range Scan (full match)
- for FB 5.x and 6.x:
Select Expression
-> Aggregate
-> Nested Loop Join (inner)
-> Filter
-> Table "INVOICEH" as "IH" Full Scan
-> Filter
-> Table "PAYMENTSH" as "PH" Access By ID
-> Bitmap Or
-> Bitmap
-> Index "IDX_PAYMENTSH_MASTERDOC_DESC" Range Scan (full match)
-> Bitmap
-> Index "IDX_PAYMENTSH_MASTERDOC_DESC" Range Scan (full match)
-> Filter
-> Table "PAYMENTSDETAILS" as "PD" Access By ID
-> Bitmap
-> Index "PK_PD_MOMENT_GENEMPID" Unique Scan
Customer states that second plan leads to performance problem. Additional data from customer:
- number of records per tables:
table_name rec_count
==============================
InvoiceH 953779
PaymentsDetails 783127
PaymentsH 30408945
- index statistics (only for those indices that are involved in above shown query):
paymentsdetails
pk_pd_moment_genempid 0.00000127693203921808163
aaa_pd_periodfromdate 0.00104058277793228626
paymentsh:
idx_paymentsh_masterdoc_desc 0.00000011831024692651353
pk_paymentsh_moment_genempid 0.0000000329129825615837035
invoiceh:
pk_invoiceh_invid 0.00000104854007076937705
invoicehbyparentinvoiceiddesc 0.0000217893411900149658
Both LEFT JOINs are converted into INNER here, so the optimizer attempts to evaluate possible table permutations. Condition 0+ih.createempid = 999999 significantly reduces the expected cardinality, thus IH is placed to the first position. How many rows in IH actually have createempid = 999999? What's selectivity of the invoicehbycreateempid index?
How many rows in IH actually have createempid = 999999?
I've forwarded this question to the customer, waiting for reply.
What's selectivity of the invoicehbycreateempid index?
0.000399680255213752389
It would also be useful to see stats (execution time & fetches) for v4, v5 and v5 without +0 in the last condition.
Both LEFT JOINs are converted into INNER here, so the optimizer attempts to evaluate possible table permutations. Condition
0+ih.createempid = 999999significantly reduces the expected cardinality, thus IH is placed to the first position. How many rows in IH actually havecreateempid = 999999? What's selectivity of theinvoicehbycreateempidindex?
So the number of rows with CreateEmpID = 999999 at the moment is 771402 and the selectivity is 0.000399680......
with FromDateTime = '01.10.2023' FB5 doesn't return anything in hours (we killed the statement after waiting for more than 6 hours) regardless if there is +0 or not in the last statement. If I remove the whole statement about CreateEmpID in InvoiceH the query runs in milliseconds with the following stats
` Query Time
Prepare : 93.00 ms Execute : 375.00 ms Avg fetch time: 375.00 ms
Memory
Current: 121 341 284 576 Max : 123 660 694 464 Buffers: 7 200 000
Operations
Read : 22 Writes : 231 Fetches: 1 525 479 Marks : 2 917
Enchanced Info: +-------------------------------+-----------+-----------+-------------+---------+---------+---------+----------+----------+----------+ | Table Name | Records | Indexed | Non-Indexed | Updates | Deletes | Inserts | Backouts | Purges | Expunges | | | Total | reads | reads | | | | | | | +-------------------------------+-----------+-----------+-------------+---------+---------+---------+----------+----------+----------+ |INVOICEH | 0 | 47411 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |PAYMENTSDETAILS | 0 | 45681 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |PAYMENTSH | 0 | 45681 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | +-------------------------------+-----------+-----------+-------------+---------+---------+---------+----------+----------+----------+ `
And here are the stats for the Query ran in FB4 WITH the statement included ` Query Time
Prepare : 31.00 ms Execute : 515.00 ms Avg fetch time: 515.00 ms
Memory
Current: 4 258 615 968 Max : 4 464 718 112 Buffers: 250 000
Operations
Read : 7 023 Writes : 21 Fetches: 623 164 Marks : 1 864
Enchanced Info: +-------------------------------+-----------+-----------+-------------+---------+---------+---------+----------+----------+----------+ | Table Name | Records | Indexed | Non-Indexed | Updates | Deletes | Inserts | Backouts | Purges | Expunges | | | Total | reads | reads | | | | | | | +-------------------------------+-----------+-----------+-------------+---------+---------+---------+----------+----------+----------+ |INVOICEH | 0 | 47411 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |PAYMENTSDETAILS | 0 | 45681 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |PAYMENTSH | 0 | 45681 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | +-------------------------------+-----------+-----------+-------------+---------+---------+---------+----------+----------+----------+ `
Тhe question is why is FB5 trying to change the JOINs from LEFT OUTER to INNER ones and is there any way we can force it NOT to do it? We are using these precisely because we know how these queries should go ... or at least we believe so :)
It does it because it allows more optimization abilities (different join orders). As for forcing it to avoid that, see #7910 - it's exactly about that. If you're OK with such a solution (configuration option), it will be added to RC2.
Yes, this configuration option will be very useful for us and we are really looking forward for you releasing a version that supports it. In the mean time we hope that you'll be able to also optimize the planner to deal with cases like the above one even without it active.
So the number of rows with CreateEmpID = 999999 at the moment is 771402 and the selectivity is 0.000399680......
So you ask for 80% of rows while the optimizer expects less than a thousand to be returned. Such a skewed value distribution is impossible to optimize properly without histograms. Their usage is also limited (only for constant parameters), but at least they provide an option.
yes, that's absolutely right and that's exactly the reason why we did our best to "hint" the optimizer not to try to use this field/index (by using the 0+ih.CreateEmpID syntax) ... but unluckily it tried to be smarter than what we expected ... and failed badly ;)
In addition to what dimitr said. There are suspicions that in the above example, parentinvoiceid contains many NULLs. The index invoicehbyparentinvoiceiddesc does not contain NULL selectivity. This means that its real selectivity for equality is distorted. Thus, the OR bitmask predicts poor selectivity, which may affect the join order.
You can try to create a partial index to replace the existing one.
drop index invoicehbyparentinvoiceiddesc;
create descending index invoicehbyparentinvoiceiddesc
on invoiceh (parentinvoiceid) where (parentinvoiceid is not null);
You can try to create a partial index to replace the existing one.
drop index invoicehbyparentinvoiceiddesc; create descending index invoicehbyparentinvoiceiddesc on invoiceh (parentinvoiceid) where (parentinvoiceid is not null);
I did try it but it doesn't change anything ... and to be honest I don't see how it will since there is no meaningful filtering on this table by ParentInvoiceID and expectedly the plan remains "JOIN (IH NATURAL, PH INDEX ....).
But there is a good news - if we do set the option OuterJoinConversion to false everything returns back to a working state and the query runs in 2.3sec.
I just hope that this comment for the option in the .config file will not materialize
CAUTION!
There is no guarantee that this setting will be available in future Firebird
On the contrary, I hope that it will be. Because this parameter is not a solution to the problem, but a crutch. I would prefer real hints for hints to the optimizer. And it would be even better if he could do without hints at all, but this is from the realm of fiction, so even in DBMS with good optimizers there are hints
Actually, PostgreSQL doesn't allow optimizer hints. If you ask for hints to make the query planner do X, the developers tell you to send them the query instead and they'll make sure the query planner does X from now on! :-)
So far we were doing the same ;-) But it cannot solve all the issues, because the optimizer intelligence is limited, regardless of how good it is.