HBASE-28697 Don't clean bulk load system entries until backup is complete
https://issues.apache.org/jira/browse/HBASE-28697
I've been thinking through the incremental backup order of operations, and I think we delete rows from the bulk loads system table too early and, consequently, make it possible to produce a "successful" incremental backup that is missing bulk loads.
To summarize the steps here, starting in IncrementalTableBackupClient#execute:
1. We take an incremental backup of the WALs generated since the last backup
2. We ensure any bulk loads done since the last backup are represented in the new backup by going through the system table and copying the appropriate files to the backup directory
3. We delete all of the system table rows which told us about these bulk loads
4. We generate a backup manifest and mark the backup as complete

If we begin deleting any of the system table rows regarding bulk loads, but fail in step 3 or 4 before we are able to mark the backup as complete, then we'll be in a precarious spot. A retried incremental backup may succeed, but it would not know to persist the bulk loaded files whose system table references we have already deleted.
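
To make the failure mode concrete, here is a minimal, self-contained sketch of the two orderings. It is a toy model only: `SystemTable`, `Backup`, `runIncremental`, and `retryCapturesBulkLoad` are hypothetical names invented for illustration and are not HBase APIs, and the "crash" is simulated with a boolean flag.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the incremental backup ordering. None of these names are HBase APIs. */
public class BulkLoadOrderingSketch {

  /** Hypothetical stand-in for the bulk load rows in the backup system table. */
  static final class SystemTable {
    final Set<String> bulkLoadRows = new HashSet<>();
  }

  /** Hypothetical stand-in for a single backup image. */
  static final class Backup {
    final Set<String> files = new HashSet<>();
    boolean complete = false;
  }

  /**
   * Runs one incremental backup attempt.
   * deleteAfterComplete = false models the current order (delete rows, then mark complete);
   * deleteAfterComplete = true models the reordered cleanup (mark complete, then delete rows).
   * failBeforeComplete simulates a failure just before the backup is marked complete.
   */
  static Backup runIncremental(SystemTable table, boolean deleteAfterComplete,
      boolean failBeforeComplete) {
    Backup backup = new Backup();
    // Step 2: copy every bulk loaded file that the system table knows about.
    List<String> rows = new ArrayList<>(table.bulkLoadRows);
    backup.files.addAll(rows);
    if (!deleteAfterComplete) {
      // Current order: the rows are gone before the backup is known to be good.
      table.bulkLoadRows.removeAll(rows);
    }
    if (failBeforeComplete) {
      return backup; // left incomplete
    }
    // Step 4: mark the backup as complete.
    backup.complete = true;
    if (deleteAfterComplete) {
      // Reordered cleanup: only forget the rows once the backup is complete.
      table.bulkLoadRows.removeAll(rows);
    }
    return backup;
  }

  /** Fails the first attempt, retries, and reports whether the retry still holds the bulk load. */
  static boolean retryCapturesBulkLoad(boolean deleteAfterComplete) {
    SystemTable table = new SystemTable();
    table.bulkLoadRows.add("hfile-1");
    runIncremental(table, deleteAfterComplete, true);
    Backup retry = runIncremental(table, deleteAfterComplete, false);
    return retry.complete && retry.files.contains("hfile-1");
  }

  public static void main(String[] args) {
    System.out.println("current order keeps bulk load: " + retryCapturesBulkLoad(false));
    System.out.println("reordered cleanup keeps bulk load: " + retryCapturesBulkLoad(true));
  }
}
```

In this model the retried backup under the current order is marked complete but contains no bulk loaded files, while the reordered cleanup still sees the row on retry. It says nothing about other failure points in the real code path.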
We could consider this issue an extension or replacement of https://issues.apache.org/jira/browse/HBASE-28084 in some ways, depending on what solution we land on. I think that we could fix this specific issue by reordering the bulk load table cleanup, but there will always be gotchas like this. Maybe it is simpler to require that the next backup be a full backup after any incremental failure.
cc @hgromer @ndimiduk @DieterDP-ng
:confetti_ball: +1 overall
| Vote | Subsystem | Runtime | Logfile | Comment |
|---|---|---|---|---|
| +0 :ok: | reexec | 0m 30s | Docker mode activated. | |
| _ Prechecks _ | ||||
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. | |
| +0 :ok: | codespell | 0m 0s | codespell was not available. | |
| +0 :ok: | detsecrets | 0m 0s | detect-secrets was not available. | |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. | |
| +1 :green_heart: | hbaseanti | 0m 0s | Patch does not have any anti-patterns. | |
| _ master Compile Tests _ | ||||
| +1 :green_heart: | mvninstall | 3m 45s | master passed | |
| +1 :green_heart: | compile | 0m 34s | master passed | |
| +1 :green_heart: | checkstyle | 0m 12s | master passed | |
| +1 :green_heart: | spotbugs | 0m 37s | master passed | |
| +1 :green_heart: | spotless | 0m 51s | branch has no errors when running spotless:check. | |
| _ Patch Compile Tests _ | ||||
| +1 :green_heart: | mvninstall | 3m 29s | the patch passed | |
| +1 :green_heart: | compile | 0m 32s | the patch passed | |
| +1 :green_heart: | javac | 0m 32s | the patch passed | |
| +1 :green_heart: | blanks | 0m 0s | The patch has no blanks issues. | |
| +1 :green_heart: | checkstyle | 0m 10s | the patch passed | |
| +1 :green_heart: | spotbugs | 0m 43s | the patch passed | |
| +1 :green_heart: | hadoopcheck | 12m 18s | Patch does not cause any errors with Hadoop 3.3.6 3.4.0. | |
| +1 :green_heart: | spotless | 0m 50s | patch has no errors when running spotless:check. | |
| _ Other Tests _ | ||||
| +1 :green_heart: | asflicense | 0m 10s | The patch does not generate ASF License warnings. | |
| 32m 23s |
| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6089/1/artifact/yetus-general-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/6089 |
| Optional Tests | dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless |
| uname | Linux 5a5e33cf3512 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 5d872aad1f9a9731b9f5e0d900a8e877a02ec769 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Max. process+thread count | 81 (vs. ulimit of 30000) |
| modules | C: hbase-backup U: hbase-backup |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6089/1/console |
| versions | git=2.34.1 maven=3.9.8 spotbugs=4.7.3 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |
This message was automatically generated.
:confetti_ball: +1 overall
| Vote | Subsystem | Runtime | Logfile | Comment |
|---|---|---|---|---|
| +0 :ok: | reexec | 0m 17s | Docker mode activated. | |
| -0 :warning: | yetus | 0m 3s | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck | |
| _ Prechecks _ | ||||
| _ master Compile Tests _ | ||||
| +1 :green_heart: | mvninstall | 4m 39s | master passed | |
| +1 :green_heart: | compile | 0m 29s | master passed | |
| +1 :green_heart: | javadoc | 0m 20s | master passed | |
| +1 :green_heart: | shadedjars | 6m 14s | branch has no errors when building our shaded downstream artifacts. | |
| _ Patch Compile Tests _ | ||||
| +1 :green_heart: | mvninstall | 3m 6s | the patch passed | |
| +1 :green_heart: | compile | 0m 25s | the patch passed | |
| +1 :green_heart: | javac | 0m 25s | the patch passed | |
| +1 :green_heart: | javadoc | 0m 17s | the patch passed | |
| +1 :green_heart: | shadedjars | 5m 53s | patch has no errors when building our shaded downstream artifacts. | |
| _ Other Tests _ | ||||
| +1 :green_heart: | unit | 10m 26s | hbase-backup in the patch passed. | |
| 33m 16s |
| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.46 ServerAPI=1.46 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6089/1/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/6089 |
| Optional Tests | javac javadoc unit compile shadedjars |
| uname | Linux ec260b53bacd 5.4.0-182-generic #202-Ubuntu SMP Fri Apr 26 12:29:36 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 5d872aad1f9a9731b9f5e0d900a8e877a02ec769 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6089/1/testReport/ |
| Max. process+thread count | 3174 (vs. ulimit of 30000) |
| modules | C: hbase-backup U: hbase-backup |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6089/1/console |
| versions | git=2.34.1 maven=3.9.8 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |
This message was automatically generated.
@DieterDP-ng any thoughts on this PR?
Sorry for the late reply. These changes look OK to me.
This seems alright to me, but I'd appreciate hearing from another voice with familiarity in this code path. Specifically, are there semantic implications in other parts of the backup system for having `completeBackup` called before the deletes occur?
I'm not aware of any.