HBASE-28192 Master should recover if meta region state is inconsistent
Jira: HBASE-28192
:confetti_ball: +1 overall
| Vote | Subsystem | Runtime | Comment |
|---|---|---|---|
| +0 :ok: | reexec | 0m 33s | Docker mode activated. |
| _ Prechecks _ | |||
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| _ master Compile Tests _ | |||
| +1 :green_heart: | mvninstall | 4m 8s | master passed |
| +1 :green_heart: | compile | 3m 5s | master passed |
| +1 :green_heart: | checkstyle | 0m 56s | master passed |
| +1 :green_heart: | spotless | 1m 4s | branch has no errors when running spotless:check. |
| +1 :green_heart: | spotbugs | 2m 11s | master passed |
| _ Patch Compile Tests _ | |||
| +1 :green_heart: | mvninstall | 3m 28s | the patch passed |
| +1 :green_heart: | compile | 2m 57s | the patch passed |
| +1 :green_heart: | javac | 2m 57s | the patch passed |
| +1 :green_heart: | checkstyle | 0m 48s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | hadoopcheck | 13m 37s | Patch does not cause any errors with Hadoop 3.2.4 3.3.6. |
| +1 :green_heart: | spotless | 1m 3s | patch has no errors when running spotless:check. |
| +1 :green_heart: | spotbugs | 2m 17s | the patch passed |
| _ Other Tests _ | |||
| +1 :green_heart: | asflicense | 0m 13s | The patch does not generate ASF License warnings. |
| | | 44m 46s | |

| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5513/1/artifact/yetus-general-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/5513 |
| Optional Tests | dupname asflicense javac spotbugs hadoopcheck hbaseanti spotless checkstyle compile |
| uname | Linux 85bbcd0830eb 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 7f3921ae40 |
| Default Java | Eclipse Adoptium-11.0.17+8 |
| Max. process+thread count | 83 (vs. ulimit of 30000) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5513/1/console |
| versions | git=2.34.1 maven=3.8.6 spotbugs=4.7.3 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
:broken_heart: -1 overall
| Vote | Subsystem | Runtime | Comment |
|---|---|---|---|
| +0 :ok: | reexec | 0m 27s | Docker mode activated. |
| -0 :warning: | yetus | 0m 2s | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck |
| _ Prechecks _ | |||
| _ master Compile Tests _ | |||
| +1 :green_heart: | mvninstall | 2m 36s | master passed |
| +1 :green_heart: | compile | 0m 37s | master passed |
| +1 :green_heart: | shadedjars | 5m 23s | branch has no errors when building our shaded downstream artifacts. |
| +1 :green_heart: | javadoc | 0m 23s | master passed |
| _ Patch Compile Tests _ | |||
| +1 :green_heart: | mvninstall | 2m 19s | the patch passed |
| +1 :green_heart: | compile | 0m 38s | the patch passed |
| +1 :green_heart: | javac | 0m 38s | the patch passed |
| +1 :green_heart: | shadedjars | 5m 22s | patch has no errors when building our shaded downstream artifacts. |
| +1 :green_heart: | javadoc | 0m 21s | the patch passed |
| _ Other Tests _ | |||
| -1 :x: | unit | 256m 21s | hbase-server in the patch failed. |
| | | 278m 16s | |

| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5513/1/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/5513 |
| Optional Tests | javac javadoc unit shadedjars compile |
| uname | Linux 8e84fa2d2746 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 7f3921ae40 |
| Default Java | Temurin-1.8.0_352-b08 |
| unit | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5513/1/artifact/yetus-jdk8-hadoop3-check/output/patch-unit-hbase-server.txt |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5513/1/testReport/ |
| Max. process+thread count | 4518 (vs. ulimit of 30000) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5513/1/console |
| versions | git=2.34.1 maven=3.8.6 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
:broken_heart: -1 overall
| Vote | Subsystem | Runtime | Comment |
|---|---|---|---|
| +0 :ok: | reexec | 0m 43s | Docker mode activated. |
| -0 :warning: | yetus | 0m 4s | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck |
| _ Prechecks _ | |||
| _ master Compile Tests _ | |||
| +1 :green_heart: | mvninstall | 3m 32s | master passed |
| +1 :green_heart: | compile | 1m 2s | master passed |
| +1 :green_heart: | shadedjars | 5m 48s | branch has no errors when building our shaded downstream artifacts. |
| +1 :green_heart: | javadoc | 0m 31s | master passed |
| _ Patch Compile Tests _ | |||
| +1 :green_heart: | mvninstall | 3m 14s | the patch passed |
| +1 :green_heart: | compile | 0m 49s | the patch passed |
| +1 :green_heart: | javac | 0m 49s | the patch passed |
| +1 :green_heart: | shadedjars | 4m 57s | patch has no errors when building our shaded downstream artifacts. |
| +1 :green_heart: | javadoc | 0m 26s | the patch passed |
| _ Other Tests _ | |||
| -1 :x: | unit | 256m 42s | hbase-server in the patch failed. |
| | | 282m 3s | |

| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5513/1/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/5513 |
| Optional Tests | javac javadoc unit shadedjars compile |
| uname | Linux 0ba8245296d2 5.4.0-163-generic #180-Ubuntu SMP Tue Sep 5 13:21:23 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 7f3921ae40 |
| Default Java | Eclipse Adoptium-11.0.17+8 |
| unit | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5513/1/artifact/yetus-jdk11-hadoop3-check/output/patch-unit-hbase-server.txt |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5513/1/testReport/ |
| Max. process+thread count | 5086 (vs. ulimit of 30000) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5513/1/console |
| versions | git=2.34.1 maven=3.8.6 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
:confetti_ball: +1 overall
| Vote | Subsystem | Runtime | Comment |
|---|---|---|---|
| +0 :ok: | reexec | 0m 32s | Docker mode activated. |
| _ Prechecks _ | |||
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| _ master Compile Tests _ | |||
| +1 :green_heart: | mvninstall | 2m 53s | master passed |
| +1 :green_heart: | compile | 2m 26s | master passed |
| +1 :green_heart: | checkstyle | 0m 36s | master passed |
| +1 :green_heart: | spotless | 0m 43s | branch has no errors when running spotless:check. |
| +1 :green_heart: | spotbugs | 1m 33s | master passed |
| _ Patch Compile Tests _ | |||
| +1 :green_heart: | mvninstall | 2m 35s | the patch passed |
| +1 :green_heart: | compile | 2m 25s | the patch passed |
| +1 :green_heart: | javac | 2m 25s | the patch passed |
| +1 :green_heart: | checkstyle | 0m 35s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | hadoopcheck | 9m 27s | Patch does not cause any errors with Hadoop 3.2.4 3.3.6. |
| +1 :green_heart: | spotless | 0m 41s | patch has no errors when running spotless:check. |
| +1 :green_heart: | spotbugs | 1m 36s | the patch passed |
| _ Other Tests _ | |||
| +1 :green_heart: | asflicense | 0m 12s | The patch does not generate ASF License warnings. |
| | | 31m 53s | |

| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5513/2/artifact/yetus-general-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/5513 |
| Optional Tests | dupname asflicense javac spotbugs hadoopcheck hbaseanti spotless checkstyle compile |
| uname | Linux e069f5f58e6c 5.4.0-156-generic #173-Ubuntu SMP Tue Jul 11 07:25:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 7f3921ae40 |
| Default Java | Eclipse Adoptium-11.0.17+8 |
| Max. process+thread count | 79 (vs. ulimit of 30000) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5513/2/console |
| versions | git=2.34.1 maven=3.8.6 spotbugs=4.7.3 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
Unless we know what the root cause is, I'm always -1 on doing things like this in our normal code logic. HBCK is the correct way to fix an inconsistency that is caused by a code bug.
So why is there no SCP for the old server after it is already dead?
Added some comments on Jira. It is still only a suspicion, not a confirmed root cause, and maybe this can happen only during an upgrade from 2.4 to 2.5? Let me check what happened to the SCP of the old server.
:broken_heart: -1 overall
| Vote | Subsystem | Runtime | Comment |
|---|---|---|---|
| +0 :ok: | reexec | 0m 25s | Docker mode activated. |
| -0 :warning: | yetus | 0m 2s | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck |
| _ Prechecks _ | |||
| _ master Compile Tests _ | |||
| +1 :green_heart: | mvninstall | 3m 8s | master passed |
| +1 :green_heart: | compile | 0m 46s | master passed |
| +1 :green_heart: | shadedjars | 5m 29s | branch has no errors when building our shaded downstream artifacts. |
| +1 :green_heart: | javadoc | 0m 25s | master passed |
| _ Patch Compile Tests _ | |||
| +1 :green_heart: | mvninstall | 2m 52s | the patch passed |
| +1 :green_heart: | compile | 0m 46s | the patch passed |
| +1 :green_heart: | javac | 0m 46s | the patch passed |
| +1 :green_heart: | shadedjars | 5m 29s | patch has no errors when building our shaded downstream artifacts. |
| +1 :green_heart: | javadoc | 0m 22s | the patch passed |
| _ Other Tests _ | |||
| -1 :x: | unit | 236m 15s | hbase-server in the patch failed. |
| | | 260m 3s | |

| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5513/2/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/5513 |
| Optional Tests | javac javadoc unit shadedjars compile |
| uname | Linux 04f2cef806db 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 7f3921ae40 |
| Default Java | Eclipse Adoptium-11.0.17+8 |
| unit | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5513/2/artifact/yetus-jdk11-hadoop3-check/output/patch-unit-hbase-server.txt |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5513/2/testReport/ |
| Max. process+thread count | 4748 (vs. ulimit of 30000) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5513/2/console |
| versions | git=2.34.1 maven=3.8.6 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
:broken_heart: -1 overall
| Vote | Subsystem | Runtime | Comment |
|---|---|---|---|
| +0 :ok: | reexec | 0m 12s | Docker mode activated. |
| -0 :warning: | yetus | 0m 3s | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck |
| _ Prechecks _ | |||
| _ master Compile Tests _ | |||
| +1 :green_heart: | mvninstall | 2m 32s | master passed |
| +1 :green_heart: | compile | 0m 41s | master passed |
| +1 :green_heart: | shadedjars | 4m 51s | branch has no errors when building our shaded downstream artifacts. |
| +1 :green_heart: | javadoc | 0m 25s | master passed |
| _ Patch Compile Tests _ | |||
| +1 :green_heart: | mvninstall | 2m 20s | the patch passed |
| +1 :green_heart: | compile | 0m 41s | the patch passed |
| +1 :green_heart: | javac | 0m 41s | the patch passed |
| +1 :green_heart: | shadedjars | 4m 52s | patch has no errors when building our shaded downstream artifacts. |
| +1 :green_heart: | javadoc | 0m 24s | the patch passed |
| _ Other Tests _ | |||
| -1 :x: | unit | 244m 59s | hbase-server in the patch failed. |
| | | 266m 7s | |

| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5513/2/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/5513 |
| Optional Tests | javac javadoc unit shadedjars compile |
| uname | Linux c9935834d86f 5.4.0-153-generic #170-Ubuntu SMP Fri Jun 16 13:43:31 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 7f3921ae40 |
| Default Java | Temurin-1.8.0_352-b08 |
| unit | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5513/2/artifact/yetus-jdk8-hadoop3-check/output/patch-unit-hbase-server.txt |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5513/2/testReport/ |
| Max. process+thread count | 4661 (vs. ulimit of 30000) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5513/2/console |
| versions | git=2.34.1 maven=3.8.6 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
I would definitely prefer automated solutions rather than rely on HBCK. IMO anything requiring hbck is a bug.
I agree.
For this case, we now know that the root cause was the sequence (upgrade to 2.5 + downgrade to 2.4 + meta move + upgrade to 2.5), and hence the master did not have the correct server address in the master local region.
I wonder if anything else could ever cause this problem.
Could you explain more about the root cause? Why could this cause the problem? Is it because downgrading to 2.4 does not do all the necessary rollbacks?
Correct, downgrading to 2.4 does not remove meta's address from the master local region's info:sn. Hence, any downgrade from 2.5 to an older version has this risk: the older version neither removes the info CF from the master local region nor uses the master local region to update the meta location (since HBASE-26193 is only applicable to 2.5.0+ releases).
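To make the sequence concrete, here is a toy Java model of it. This is only a sketch under stated assumptions: the class, its methods, and the server names `rs-a`/`rs-b` are hypothetical and are not HBase APIs. The map merely stands in for the info:sn value in the master local region.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: class, method, and server names are hypothetical, not HBase APIs.
final class StaleMetaLocationSketch {
  // Stands in for the info:sn value kept in the master local region (2.5.0+ only).
  static final String INFO_SN = "info:sn";

  final Map<String, String> masterLocalRegion = new HashMap<>();
  String actualMetaServer;

  // A 2.5 master records each meta assignment in the master local region.
  void assignMetaOn25(String server) {
    actualMetaServer = server;
    masterLocalRegion.put(INFO_SN, server);
  }

  // A 2.4 master moves meta but never touches the master local region.
  void assignMetaOn24(String server) {
    actualMetaServer = server;
  }

  // True when the recorded location matches where meta really is.
  boolean isConsistent() {
    return actualMetaServer != null
      && actualMetaServer.equals(masterLocalRegion.get(INFO_SN));
  }

  // Replays: upgrade to 2.5, downgrade to 2.4, meta move, upgrade to 2.5.
  static StaleMetaLocationSketch replayIncident() {
    StaleMetaLocationSketch cluster = new StaleMetaLocationSketch();
    cluster.assignMetaOn25("rs-a"); // first run on 2.5
    cluster.assignMetaOn24("rs-b"); // after the downgrade, meta moves
    return cluster;                 // state the 2.5 master sees on re-upgrade
  }

  public static void main(String[] args) {
    StaleMetaLocationSketch cluster = replayIncident();
    System.out.println("recorded=" + cluster.masterLocalRegion.get(INFO_SN)
      + " actual=" + cluster.actualMetaServer
      + " consistent=" + cluster.isConsistent());
  }
}
```

In this model the recorded location still points at the server from the first 2.5 run, which is the stale address described above.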
So I think we should add a tool in HBCK2 for deleting data from the master local region? Or at least for removing meta locations from the master local region. Then, after downgrading to 2.4, we need a manual step to remove the location.
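A toy Java model of that manual step, as a sketch only. It is not an HBCK2 command, and every name in it is hypothetical. It also assumes that a 2.5 master that finds no recorded location records the actual one again, which is an assumption of this model rather than something confirmed in this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: names are hypothetical and this is not an HBCK2 command.
final class MetaLocationCleanupSketch {
  // Stands in for the info:sn value kept in the master local region.
  static final String INFO_SN = "info:sn";

  // Manual step after a downgrade: drop the recorded meta location.
  // Returns true if an entry was actually removed.
  static boolean clearMetaLocation(Map<String, String> masterLocalRegion) {
    return masterLocalRegion.remove(INFO_SN) != null;
  }

  // Assumed re-upgrade behavior: an existing entry is trusted as-is,
  // a missing entry is recorded again from the actual meta location.
  static String locationAfterReupgrade(Map<String, String> masterLocalRegion,
      String actualMetaServer) {
    return masterLocalRegion.computeIfAbsent(INFO_SN, key -> actualMetaServer);
  }

  public static void main(String[] args) {
    Map<String, String> region = new HashMap<>();
    region.put(INFO_SN, "rs-a");  // entry left behind by the first 2.5 run
    clearMetaLocation(region);    // manual step while on 2.4
    System.out.println(locationAfterReupgrade(region, "rs-b"));
  }
}
```

Without the cleanup call, the same model keeps returning the old entry, which is why the step would have to run before the next upgrade.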