HBASE-29734 Fix miscalculated maxWait timeout in MoveWithAck
This PR fixes the miscalculated maxWait in MoveWithAck using current time instead of sartTime
In MoveWithAck, the maxWait timeout was calculated inside the retry loop using a fixed startTime calculated at the out side of the retry loop, causing it to remain constant across all retry attempts. This means that each retry didn't really check isSameServer and marks all retries failed.
:confetti_ball: +1 overall
| Vote | Subsystem | Runtime | Logfile | Comment |
|---|---|---|---|---|
| +0 :ok: | reexec | 0m 39s | Docker mode activated. | |
| _ Prechecks _ | ||||
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. | |
| +0 :ok: | codespell | 0m 0s | codespell was not available. | |
| +0 :ok: | detsecrets | 0m 0s | detect-secrets was not available. | |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. | |
| +1 :green_heart: | hbaseanti | 0m 0s | Patch does not have any anti-patterns. | |
| _ master Compile Tests _ | ||||
| +1 :green_heart: | mvninstall | 4m 32s | master passed | |
| +1 :green_heart: | compile | 4m 0s | master passed | |
| +1 :green_heart: | checkstyle | 1m 11s | master passed | |
| +1 :green_heart: | spotbugs | 2m 40s | master passed | |
| +1 :green_heart: | spotless | 1m 31s | branch has no errors when running spotless:check. | |
| _ Patch Compile Tests _ | ||||
| +1 :green_heart: | mvninstall | 5m 25s | the patch passed | |
| +1 :green_heart: | compile | 5m 24s | the patch passed | |
| +1 :green_heart: | javac | 5m 24s | the patch passed | |
| +1 :green_heart: | blanks | 0m 0s | The patch has no blanks issues. | |
| +1 :green_heart: | checkstyle | 1m 33s | the patch passed | |
| +1 :green_heart: | spotbugs | 2m 50s | the patch passed | |
| +1 :green_heart: | hadoopcheck | 15m 56s | Patch does not cause any errors with Hadoop 3.3.6 3.4.1. | |
| +1 :green_heart: | spotless | 1m 0s | patch has no errors when running spotless:check. | |
| _ Other Tests _ | ||||
| +1 :green_heart: | asflicense | 0m 12s | The patch does not generate ASF License warnings. | |
| 57m 31s |
| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7488/1/artifact/yetus-general-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/7488 |
| Optional Tests | dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless |
| uname | Linux 916a4ef3fcd9 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / cacc56108d7e789a2309b355d753606e392d0608 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Max. process+thread count | 84 (vs. ulimit of 30000) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7488/1/console |
| versions | git=2.34.1 maven=3.9.8 spotbugs=4.7.3 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |
This message was automatically generated.
:confetti_ball: +1 overall
| Vote | Subsystem | Runtime | Logfile | Comment |
|---|---|---|---|---|
| +0 :ok: | reexec | 0m 36s | Docker mode activated. | |
| -0 :warning: | yetus | 0m 3s | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck | |
| _ Prechecks _ | ||||
| _ master Compile Tests _ | ||||
| +1 :green_heart: | mvninstall | 5m 8s | master passed | |
| +1 :green_heart: | compile | 1m 27s | master passed | |
| +1 :green_heart: | javadoc | 0m 40s | master passed | |
| +1 :green_heart: | shadedjars | 7m 23s | branch has no errors when building our shaded downstream artifacts. | |
| _ Patch Compile Tests _ | ||||
| +1 :green_heart: | mvninstall | 3m 38s | the patch passed | |
| +1 :green_heart: | compile | 1m 9s | the patch passed | |
| +1 :green_heart: | javac | 1m 9s | the patch passed | |
| +1 :green_heart: | javadoc | 0m 35s | the patch passed | |
| +1 :green_heart: | shadedjars | 7m 16s | patch has no errors when building our shaded downstream artifacts. | |
| _ Other Tests _ | ||||
| +1 :green_heart: | unit | 248m 38s | hbase-server in the patch passed. | |
| 282m 8s |
| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7488/1/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/7488 |
| Optional Tests | javac javadoc unit compile shadedjars |
| uname | Linux 5b15f35fb97b 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / cacc56108d7e789a2309b355d753606e392d0608 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7488/1/testReport/ |
| Max. process+thread count | 3506 (vs. ulimit of 30000) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7488/1/console |
| versions | git=2.34.1 maven=3.9.8 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |
This message was automatically generated.
It is bit confusing. There are two situations. Whether we want to wait for max move time fro entire region mover or per move. Last code was considering the entire move operations and your code change it to per move. Now i am not sure what is the real intention out of two. Have to check the code once.
@mnpoonia Thanks for the review. In the current code, the wait loop is inside the retry loop, so maxWait should be per-retry. If maxWait were a single absolute deadline for all retries, once the first retry waits up to that deadline, subsequent retries would skip the wait loop because the current time already exceeds the deadline. As a result, isSameServer(...) would not be invoked again, sameServer would never update after the first retry, and the move would fail regardless of later attempts.
I think the logic after the PR is more reasonable, although we lose the ability to limit the whole retry time with a single configuration, but you can still control it with maxRetries * maxWait.
So I'm +1 on the change.
Let me request review for some other committer to see if there are other opinion.
Thanks.
I am +1 , it looks like previous retry logic didn't work
you can still control it with maxRetries * maxWait
👍
Thanks all for reviewing the PR. I have raised a related PR for a change in RegionMover to fix timeoutInSeconds calculation to consider retry attempts in MoveWithAck
+1 LGTM
I agree that the current code does not make sense.
This does change the semantics of the timeout. Maybe merge this with #7519 ? I think that a release note should also be added.
:broken_heart: -1 overall
| Vote | Subsystem | Runtime | Logfile | Comment |
|---|---|---|---|---|
| +0 :ok: | reexec | 0m 0s | Docker mode activated. | |
| -1 :x: | docker | 0m 4s | Docker failed to build run-specific yetus/hbase:tp-6897}. |
| Subsystem | Report/Notes |
|---|---|
| GITHUB PR | https://github.com/apache/hbase/pull/7488 |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7488/2/console |
| versions | git=2.17.1 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |
This message was automatically generated.
:broken_heart: -1 overall
| Vote | Subsystem | Runtime | Logfile | Comment |
|---|---|---|---|---|
| +0 :ok: | reexec | 0m 44s | Docker mode activated. | |
| -0 :warning: | yetus | 0m 3s | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck | |
| _ Prechecks _ | ||||
| _ master Compile Tests _ | ||||
| +1 :green_heart: | mvninstall | 5m 18s | master passed | |
| +1 :green_heart: | compile | 1m 16s | master passed | |
| +1 :green_heart: | javadoc | 0m 40s | master passed | |
| +1 :green_heart: | shadedjars | 6m 58s | branch has no errors when building our shaded downstream artifacts. | |
| _ Patch Compile Tests _ | ||||
| +1 :green_heart: | mvninstall | 4m 12s | the patch passed | |
| +1 :green_heart: | compile | 1m 20s | the patch passed | |
| +1 :green_heart: | javac | 1m 20s | the patch passed | |
| +1 :green_heart: | javadoc | 0m 42s | the patch passed | |
| +1 :green_heart: | shadedjars | 6m 48s | patch has no errors when building our shaded downstream artifacts. | |
| _ Other Tests _ | ||||
| -1 :x: | unit | 17m 14s | /patch-unit-hbase-server.txt | hbase-server in the patch failed. |
| 47m 31s |
| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7488/2/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/7488 |
| Optional Tests | javac javadoc unit compile shadedjars |
| uname | Linux d0e59812cdbe 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 6531542457748f5055cdbd6eb8726410f058456d |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7488/2/testReport/ |
| Max. process+thread count | 1111 (vs. ulimit of 30000) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7488/2/console |
| versions | git=2.34.1 maven=3.9.8 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |
This message was automatically generated.