
UPDATE must not touch fields that are not explicitly mentioned

ppazera opened this issue 8 months ago · 3 comments

We are trying to migrate from Firebird v2.5.9 to v5.0.2 (Classic, Windows and Linux), but noticed one critical issue with bulk updates which is stopping us from upgrading.

A bulk update is executed to change one field; a trigger updates another field in other records of the same table; the bulk update then overwrites those new values with the old ones, for fields which should not be touched by the outer bulk update at all. Changes made by the trigger are silently lost in Firebird v5, so backwards compatibility is broken (see the simplified example testBulkUpdate_after_update.txt).
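A hedged sketch of the kind of scenario being described; all table, trigger and column names here are invented for illustration (the attached testBulkUpdate_after_update.txt is the authoritative reproduction), with the 'amt' column name borrowed from the discussion later in this thread:

```sql
-- Hypothetical reproduction; all names are invented.
CREATE TABLE acct (
    id     INTEGER PRIMARY KEY,
    status INTEGER,
    amt    NUMERIC(18,2)
);

SET TERM ^ ;
CREATE TRIGGER acct_au FOR acct
AFTER UPDATE
AS
BEGIN
    -- When a row's status changes, adjust 'amt' in ANOTHER row of the
    -- same table. The guard also prevents runaway trigger recursion.
    IF (NEW.status IS DISTINCT FROM OLD.status) THEN
        UPDATE acct SET amt = amt + 1 WHERE id = NEW.id + 1;
END^
SET TERM ; ^

INSERT INTO acct VALUES (1, 0, 100);
INSERT INTO acct VALUES (2, 0, 100);

-- The bulk update mentions only 'status':
UPDATE acct SET status = 1;

-- v2.5: row 2 ends up with amt = 101 (the trigger's change survives).
-- v5:   depending on the order in which the bulk update processes the
--       rows, the trigger's change to row 2 may be silently overwritten
--       by the bulk update's copy of the row (amt back to 100).
SELECT id, status, amt FROM acct;
```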

When updating records individually (with multiple UPDATE statements rather than a single bulk UPDATE statement), everything works as expected: all changes made by the trigger stay intact. Such inconsistency, where a bulk update behaves differently from the equivalent individual updates in Firebird v5, is also bad.

Depending on the order in which the bulk update processes individual records, and on which records the trigger updates, some changes made by the trigger may stay and some may be lost (bulk updates usually have no ORDER BY clause, so the order is not guaranteed by Firebird). Such randomness is also bad: our in-house tests may run fine, but in production the same code may start working incorrectly.

Both MS SQL Server and PostgreSQL work like Firebird v2.5 does. The behavior of Firebird v5 is unique; I was not able to find a leading RDBMS with such behavior (Oracle and MariaDB error out at runtime, which at least makes it possible to find the places that need rework, but I believe Firebird can do much better than that).

We had a discussion in the firebird-support forum; more info, if needed, can be found there. I was told that this behavior change was introduced while fixing the infinite insert-select issue (core92 or #417). Unfortunately, I was not given an example of which particular bulk-update issue it fixed. Assuming there were no issues with bulk updates in Firebird v2.5, and this insert-select fix accidentally broke bulk updates, would it be possible to remove that fix from bulk updates?

Otherwise, if there were real issues with bulk updates, would it be possible to review them and improve the fix so that it addresses those issues without breaking the bulk update behavior described here? If that fix cannot be improved, then maybe a new firebird.conf option could be implemented to allow users to restore the legacy behavior (similar to ReadConsistency, which we also needed in order to make read committed become read committed again)?

Thanks much, Paulius (GLDS)

ppazera avatar Apr 25 '25 12:04 ppazera

In short: the OLD context for explicitly updated fields should come from the stable cursor's record version, while the OLD (and NEW) context for non-updated fields should come from the latest record version.

PS: I still insist that a good database design must not rely on such implementation details, but we cannot expect every DB developer to be experienced (or even sane).

aafemt avatar Apr 25 '25 13:04 aafemt

I already responded in firebird-support, but here it is for posterity

The outer bulk update does NOT reset 'amt' to its prior values; it simply does not see the changes made by the trigger in other rows.

Cursor stability over a single statement is a basic requirement of the SQL standard, which Firebird violated from the beginning (because it predates the SQL standard). This violation allowed the behavior (loophole) that you have exploited, pitfalls like "insert into A ... select * from A", and of course anomalies in simple selects in READ COMMITTED transactions.

BTW, the inconsistent read problem that plagued Firebird (and was a source of frequent user complaints) was well known, and users were always warned not to exploit it (for example by updating different rows in the same table from triggers), as some day the Firebird developers would fix it. And it was eventually fixed in v4.

If you look at the documentation from v4, section Solving the Inconsistent Read Problem, you can read:

The obvious solution to not consistent read problem is to make read-committed transaction to use stable database snapshot while statement is executed. Each new top-level statement create own database snapshot to see data committed recently. With snapshots based on commit order it is very cheap operation. Let name this snapshot as statement-level snapshot further. Nested statements (triggers, nested stored procedures and functions, dynamic statements, etc) uses same statement-level snapshot created by top-level statement.

So, the outer (bulk) update creates its "snapshot" and works over the records that are visible to it, creating record versions that do not belong to this snapshot (they are as-yet-uncommitted changes!). The update trigger, however, changes some row(s) in the same table as well. These updates from the trigger use the same snapshot as the outer bulk update to identify the records to work with, but, like the outer update, they create versions that belong to a different snapshot. Hence the outer update can't see the changes created by the trigger in the same table.

The update operation does not touch the non-updated columns, but it operates on a record buffer (version) that is stable over the execution of a SINGLE statement, as it comes from the snapshot created for the bulk update. It fetches the record from the stable snapshot, updates the changed values and writes it back. If the record meanwhile got a newer version created by the trigger, that version is overwritten and the trigger's update is lost.
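To make this concrete, a hedged trace using the hypothetical acct sketch from the issue description above, assuming the bulk update happens to process row 1 before row 2:

  1. The bulk UPDATE takes its statement-level snapshot: both rows have amt = 100.

  2. It updates row 1 (status 0 -> 1); the AFTER UPDATE trigger fires and writes a newer version of row 2 with amt = 101.

  3. The bulk UPDATE reaches row 2, but reads it from its own stable snapshot: status = 0, amt = 100.

  4. It writes row 2 back with status = 1 and amt = 100, superseding the trigger-created version; the amt = 101 change is lost.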

You may think that this could be solved by fetching the record from a later snapshot (created and shared by the trigger and the bulk update) within the same transaction, but I don't think it's a solution, because the trigger CAN update (although your example didn't) the column used in the outer update's WHERE filter, which would cause weird behavior (the outer update may skip the record) and we would be back at square one with inconsistent, non-repeatable results.

Other servers that do not have record-level triggers but table-level ones (SQL Server) are not affected by this, as the trigger is executed once, after the table is updated, to process the updated rows.

Because your algorithm leverages the non-SQL-standard behavior, you can't expect the engine to open this loophole again for you. Hence you need to adjust your recalculation algorithm in accordance with the behavior specified by the SQL standard. You have several options:

  1. As suggested earlier, you can split the table into two, with the columns updated by the outer bulk update in one and the columns updated by the trigger in the other, and create a view to keep the selects that operate on the previously united table working (see the first sketch after this list).

  2. Replace your bulk update & trigger with a procedure that uses FOR SELECT to fetch the PKs (or RDB$DB_KEYs for faster processing) and executes individual updates for the rows as necessary (see the second sketch below).
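A minimal sketch of option 1, reusing the hypothetical acct layout from the sketch in the issue description (all names are invented):

```sql
-- Option 1: split the columns so the bulk update and the trigger
-- write to different tables; a view keeps existing SELECTs working.
CREATE TABLE acct_base (
    id     INTEGER PRIMARY KEY,
    status INTEGER            -- written by the outer bulk update
);

CREATE TABLE acct_calc (
    id  INTEGER PRIMARY KEY,
    amt NUMERIC(18,2),        -- written by the trigger
    CONSTRAINT fk_acct_calc FOREIGN KEY (id) REFERENCES acct_base (id)
);

-- The trigger now lives on acct_base and updates acct_calc, i.e. a
-- DIFFERENT table, so the single-statement cursor stability of the
-- bulk update no longer conflicts with the trigger's writes.
CREATE VIEW acct_v AS
    SELECT b.id, b.status, c.amt
    FROM acct_base b
    JOIN acct_calc c ON c.id = b.id;
```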

Personally, I'd select option 2 with dbkeys.
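A minimal sketch of option 2, again with invented names, replacing the single bulk UPDATE with a procedure that walks the rows by RDB$DB_KEY and issues individual updates, each of which sees the changes made by earlier updates (including the trigger's writes):

```sql
SET TERM ^ ;
CREATE PROCEDURE bulk_set_status (new_status INTEGER)
AS
    DECLARE VARIABLE dbkey CHAR(8) CHARACTER SET OCTETS;
BEGIN
    -- Fetch the dbkey of each target row, then update the rows one
    -- by one; unlike a single bulk UPDATE, every individual UPDATE
    -- operates on the latest record version of this transaction.
    FOR SELECT RDB$DB_KEY FROM acct
        INTO :dbkey
    DO
        UPDATE acct SET status = :new_status
        WHERE RDB$DB_KEY = :dbkey;
END^
SET TERM ; ^

-- Usage: instead of "UPDATE acct SET status = 1":
EXECUTE PROCEDURE bulk_set_status(1);
```

This relies on RDB$DB_KEY values being stable within the transaction, which is the default dbkey scope.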

pcisar avatar Apr 25 '25 17:04 pcisar

The issue requires additional investigation.

hvlad avatar May 08 '25 16:05 hvlad