HBASE-28697 Don't clean bulk load system entries until backup is complete

Open rmdmattingly opened this issue 1 year ago • 3 comments

https://issues.apache.org/jira/browse/HBASE-28697

I've been thinking through the incremental backup order of operations, and I think we delete rows from the bulk loads system table too early and, consequently, make it possible to produce a "successful" incremental backup that is missing bulk loads.

To summarize the steps here, starting in IncrementalTableBackupClient#execute:

  1. We take an incremental backup of the WALs generated since the last backup
  2. We ensure any bulk loads done since the last backup are appropriately represented in the new backup by going through the system table and copying the appropriate files to the backup directory
  3. We delete all of the system table rows which told us about these bulk loads
  4. We generate a backup manifest and mark the backup as complete

If we begin deleting any of the system table rows regarding bulk loads, but fail in step 3 or 4 before we are able to mark the backup as complete, then we'll be in a precarious spot. If we retry the incremental backup then it may succeed, but it would not know to persist the bulk loaded files whose system table references we have already deleted.
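To make the ordering problem concrete, here is a minimal in-memory sketch of a failed attempt followed by a retry. It is a toy model under stated assumptions, not the actual backup code: `SystemTable`, `BackupStore`, `runIncremental`, and the file name `hfile-1` are all hypothetical stand-ins, and the failure is injected with a flag.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the incremental backup ordering; none of these names are HBase APIs. */
public class BulkLoadCleanupOrdering {

  /** Hypothetical stand-in for the bulk load rows in the backup system table. */
  static final class SystemTable {
    final Set<String> bulkLoadRows = new LinkedHashSet<>();
  }

  /** Hypothetical stand-in for files that belong to a completed backup. */
  static final class BackupStore {
    final Set<String> persistedFiles = new LinkedHashSet<>();
  }

  /**
   * Runs one incremental backup attempt and returns true if it was marked complete.
   * deleteBeforeComplete = true models the order described in the issue.
   * failBeforeComplete injects a failure just before the backup is marked complete.
   */
  static boolean runIncremental(SystemTable table, BackupStore store,
      boolean deleteBeforeComplete, boolean failBeforeComplete) {
    // Step 2: stage the files referenced by the system table rows.
    List<String> staged = new ArrayList<>(table.bulkLoadRows);

    if (deleteBeforeComplete) {
      // Step 3 in the current order: references are removed before the backup is complete.
      table.bulkLoadRows.removeAll(staged);
    }
    if (failBeforeComplete) {
      // The staged files belong to a failed backup, so they never count as persisted.
      return false;
    }
    // Step 4: manifest written, backup marked complete.
    store.persistedFiles.addAll(staged);
    if (!deleteBeforeComplete) {
      // Reordered variant: clean up only once the backup is complete.
      table.bulkLoadRows.removeAll(staged);
    }
    return true;
  }

  /** Fails once, retries, and reports whether the bulk loaded file ended up in a backup. */
  static boolean retryPreservesBulkLoad(boolean deleteBeforeComplete) {
    SystemTable table = new SystemTable();
    BackupStore store = new BackupStore();
    table.bulkLoadRows.add("hfile-1");
    runIncremental(table, store, deleteBeforeComplete, true); // first attempt fails
    boolean retried = runIncremental(table, store, deleteBeforeComplete, false); // retry succeeds
    return retried && store.persistedFiles.contains("hfile-1");
  }

  public static void main(String[] args) {
    System.out.println("delete before complete: " + retryPreservesBulkLoad(true));
    System.out.println("delete after complete:  " + retryPreservesBulkLoad(false));
  }
}
```

In this model the retry reports success in both cases, but only the delete-after-complete ordering still has `hfile-1` in a completed backup. That is the "successful" backup with missing bulk loads described above.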

We could consider this issue an extension or replacement of https://issues.apache.org/jira/browse/HBASE-28084 in some ways, depending on what solution we land on. I think that we could fix this specific issue by reordering the bulk load table cleanup, but there will always be gotchas like this. Maybe it is simpler to require that the next backup be a full backup after any incremental failure.
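The alternative mentioned here, forcing a full backup after a failed incremental, could look roughly like the guard below. This is only a sketch of the idea: `LastAttempt`, `nextAllowed`, and the enums are hypothetical names, and it simplifies by assuming that a completed full backup already exists whenever the last attempt was not a failed incremental.

```java
import java.util.Optional;

/** Toy guard for the "full backup after a failed incremental" idea; not an HBase API. */
public class FullAfterFailureGuard {
  enum BackupType { FULL, INCREMENTAL }
  enum BackupState { COMPLETE, FAILED }

  /** Hypothetical record of the most recent backup attempt. */
  static final class LastAttempt {
    final BackupType type;
    final BackupState state;

    LastAttempt(BackupType type, BackupState state) {
      this.type = type;
      this.state = state;
    }
  }

  /** Returns the backup type that may actually run, given the most recent attempt. */
  static BackupType nextAllowed(Optional<LastAttempt> last, BackupType requested) {
    if (requested == BackupType.FULL) {
      return BackupType.FULL; // a full backup is always acceptable
    }
    if (!last.isPresent()) {
      return BackupType.FULL; // an incremental needs an earlier backup to build on
    }
    LastAttempt attempt = last.get();
    boolean failedIncremental =
        attempt.type == BackupType.INCREMENTAL && attempt.state == BackupState.FAILED;
    // After a failed incremental, escalate to a full backup instead of retrying.
    return failedIncremental ? BackupType.FULL : BackupType.INCREMENTAL;
  }

  public static void main(String[] args) {
    LastAttempt failed = new LastAttempt(BackupType.INCREMENTAL, BackupState.FAILED);
    System.out.println(nextAllowed(Optional.of(failed), BackupType.INCREMENTAL));
  }
}
```

The trade-off is that the guard sidesteps ordering gotchas entirely, at the cost of a more expensive full backup after every incremental failure.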

cc @hgromer @ndimiduk @DieterDP-ng

rmdmattingly avatar Jul 17 '24 20:07 rmdmattingly

:confetti_ball: +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
| --- | --- | --- | --- | --- |
| +0 :ok: | reexec | 0m 30s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | hbaseanti | 0m 0s | | Patch does not have any anti-patterns. |
| | | | | _ master Compile Tests _ |
| +1 :green_heart: | mvninstall | 3m 45s | | master passed |
| +1 :green_heart: | compile | 0m 34s | | master passed |
| +1 :green_heart: | checkstyle | 0m 12s | | master passed |
| +1 :green_heart: | spotbugs | 0m 37s | | master passed |
| +1 :green_heart: | spotless | 0m 51s | | branch has no errors when running spotless:check. |
| | | | | _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 3m 29s | | the patch passed |
| +1 :green_heart: | compile | 0m 32s | | the patch passed |
| +1 :green_heart: | javac | 0m 32s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 10s | | the patch passed |
| +1 :green_heart: | spotbugs | 0m 43s | | the patch passed |
| +1 :green_heart: | hadoopcheck | 12m 18s | | Patch does not cause any errors with Hadoop 3.3.6 3.4.0. |
| +1 :green_heart: | spotless | 0m 50s | | patch has no errors when running spotless:check. |
| | | | | _ Other Tests _ |
| +1 :green_heart: | asflicense | 0m 10s | | The patch does not generate ASF License warnings. |
| | | 32m 23s | | |

| Subsystem | Report/Notes |
| --- | --- |
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6089/1/artifact/yetus-general-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/6089 |
| Optional Tests | dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless |
| uname | Linux 5a5e33cf3512 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 5d872aad1f9a9731b9f5e0d900a8e877a02ec769 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Max. process+thread count | 81 (vs. ulimit of 30000) |
| modules | C: hbase-backup U: hbase-backup |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6089/1/console |
| versions | git=2.34.1 maven=3.9.8 spotbugs=4.7.3 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |

This message was automatically generated.

Apache-HBase avatar Jul 17 '24 20:07 Apache-HBase

:confetti_ball: +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
| --- | --- | --- | --- | --- |
| +0 :ok: | reexec | 0m 17s | | Docker mode activated. |
| -0 :warning: | yetus | 0m 3s | | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck |
| | | | | _ Prechecks _ |
| | | | | _ master Compile Tests _ |
| +1 :green_heart: | mvninstall | 4m 39s | | master passed |
| +1 :green_heart: | compile | 0m 29s | | master passed |
| +1 :green_heart: | javadoc | 0m 20s | | master passed |
| +1 :green_heart: | shadedjars | 6m 14s | | branch has no errors when building our shaded downstream artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 3m 6s | | the patch passed |
| +1 :green_heart: | compile | 0m 25s | | the patch passed |
| +1 :green_heart: | javac | 0m 25s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 17s | | the patch passed |
| +1 :green_heart: | shadedjars | 5m 53s | | patch has no errors when building our shaded downstream artifacts. |
| | | | | _ Other Tests _ |
| +1 :green_heart: | unit | 10m 26s | | hbase-backup in the patch passed. |
| | | 33m 16s | | |

| Subsystem | Report/Notes |
| --- | --- |
| Docker | ClientAPI=1.46 ServerAPI=1.46 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6089/1/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/6089 |
| Optional Tests | javac javadoc unit compile shadedjars |
| uname | Linux ec260b53bacd 5.4.0-182-generic #202-Ubuntu SMP Fri Apr 26 12:29:36 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 5d872aad1f9a9731b9f5e0d900a8e877a02ec769 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6089/1/testReport/ |
| Max. process+thread count | 3174 (vs. ulimit of 30000) |
| modules | C: hbase-backup U: hbase-backup |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6089/1/console |
| versions | git=2.34.1 maven=3.9.8 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |

This message was automatically generated.

Apache-HBase avatar Jul 17 '24 20:07 Apache-HBase

@DieterDP-ng any thoughts on this PR?

rmdmattingly avatar Aug 27 '24 15:08 rmdmattingly

> @DieterDP-ng any thoughts on this PR?

Sorry for the late reply. These changes look OK to me.

> This seems alright to me, but I'd appreciate hearing from another voice familiar with this code path. Specifically, are there semantic implications in other parts of the backup system for having completeBackup called before the deletes occur?

I'm not aware of any.

DieterDP-ng avatar Sep 05 '24 10:09 DieterDP-ng