firebird icon indicating copy to clipboard operation
firebird copied to clipboard

[FR] Database restore must make every effort on activating deferred indexes

Open basid-irk opened this issue 2 years ago • 14 comments

Database restore are aborted on first error of first problematic deferred index. Database restore must make every effort on activating deferred indexes - build them all and show full problem list for each (#7268)

basid-irk avatar Aug 18 '22 10:08 basid-irk

I've made a confirmation test. Backup rest_bad.fbk is attached. Backup format - Firebird 2.5, it is ok to restore it on 3.0. Here are 6 tables, 3 master and 3 detail. A1 have duplicate value in primary key A2 have "record loss" that B2 foreign key points to A3 have duplicate value in primary key (as A1, but different value)

This backup was made after turning rdb$index_active to "3" and adding/deleting rows.

Currently on restore (2.5, 3.0, 4.0 was not tested) we have stop after FIRST problem with index gbak:creating indexes gbak:committing metadata gbak: activating and creating deferred index PK_B3 gbak: activating and creating deferred index PK_B2 gbak: activating and creating deferred index PK_B1 gbak: activating and creating deferred index PK_A3 gbak:cannot commit index PK_A3 gbak: ERROR:violation of PRIMARY or UNIQUE KEY constraint "PK_A3" on table "A3" gbak: ERROR: Problematic key value is ("ID" = 9) gbak: ERROR:action cancelled by trigger (2) to preserve data integrity gbak: ERROR: Cannot deactivate index used by an integrity constraint gbak:Exiting before completion due to errors

None indices after that are not activated, they still stay in "inactive" state (value 3 in rdb$index_inactive).

If we could have continuation of activating indices, there would be only 3 indices inactive - A1 PK, A3 PK, B2 FK.

Note that this kind of a problem can be fixed only by repetitive actions (b/r, or restore -i and activating indices one by one), which can take very LONG time on big databases. Ability to have inactive only problematic indices may save lot of time in such cases.

rtest_bad.zip

ibaseru avatar Aug 18 '22 12:08 ibaseru

If implemented, it should not be as a default action, but some kind of switch.

livius2 avatar Aug 24 '22 09:08 livius2

If implemented, it should not be as a default action, but some kind of switch.

What kind of incompatibility (or any side effect) you expect, if this will be default action? The initial issue does not expect the error to disappear. Current behavior forces admin to do more work on restored DB - to manually activate indices that were not activated by restore. Who ever will expect that "lot of indices inactive" is better than "as much as possible indices are activated"?

ibaseru avatar Aug 24 '22 09:08 ibaseru

If implemented, it should not be as a default action, but some kind of switch.

well, you won, I changed my mind, let it be "some kind of switch".

ibaseru avatar Aug 24 '22 10:08 ibaseru

The restore code already tries to activate (build) as much indices as possible. It handle index activation errors and continues to the next index, setting RDB$INDEX_INACTIVE = TRUE for the failed index. And here we have a bug - if failed index is system index (such as PK or FK) then such update is prevented by system trigger and whole restore process is stopped.

I.e. there is no question if "this" should be default action or not - it is default action already.

hvlad avatar Aug 24 '22 10:08 hvlad

But the restore itself at the end will fail or will be succesfull? For me it should fail.

But if this is the default action already i will see needs for swith to turn it off. I do not like to spend time on bad restore, only it should fail for me immidietly.

livius2 avatar Aug 24 '22 10:08 livius2

In such case restore leave database in shutdown state, report:

Database is not online due to failure to activate one or more indices.
Run gfix -online to bring database online without active indices.

and exit with code 2

hvlad avatar Aug 24 '22 10:08 hvlad

Exit code is ok then. But i see that switch to stop restore immidietly will be also helpfull to do not spend time if someone do not need this.

livius2 avatar Aug 24 '22 10:08 livius2

and exit with code 2

Is it indicated in status vector for services API as well or one have to parse text output?

aafemt avatar Aug 24 '22 11:08 aafemt

AFAIU, exit code of service routine never reported back to the service user. I.e., from service manager POV, there is no error. And this is not unique\specific for gbak services.

hvlad avatar Aug 24 '22 11:08 hvlad

It's prety easy to make all non-zero exit codes be reported as errors in services. If really needed.

AlexPeshkoff avatar Aug 24 '22 11:08 AlexPeshkoff

Without it services seem to be pretty useless.

aafemt avatar Aug 24 '22 11:08 aafemt

On 8/24/22 14:31, Dimitry Sibiryakov wrote:

Without it services seem to be pretty useless.

I think this to be suggested in devel and if nobody sees problems - feel free to add a ticket, I will implement it. Notice - in status vector it will be a notice that service thread failed with exit code NNN, original message with a reason for is unknown at the final point.

AlexPeshkoff avatar Aug 24 '22 17:08 AlexPeshkoff

I've made a confirmation test. Backup rest_bad.fbk is attached.

Please tell me how to make such a test database? How to “break” it? You cannot directly deactivate a primary key.

GeF0rs avatar Jun 28 '24 09:06 GeF0rs