HBASE-28931: Reload SSL context regardless of changed file path
At my company we have an issue with our HBase servers not reloading TLS certificate files after they change on disk. We run our HMasters inside Kubernetes Pods, and define our certificate contents as Kubernetes Secrets. Then, the Secrets are projected into the HMaster containers as files. When the value of a Secret changes, the file changes automatically. However, Kubernetes does some complicated indirection, and does not change the files directly. It swaps a new directory in with new files in it.
HBase sets up a WatchService on the directory containing the TLS cert. For example, at my company, the cert is at /etc/hadoop/conf/ssl/cert/server-chain.pem. Then, events from that WatchService are delivered to a handler method which contains this check:
Path eventFilePath = dirPath.resolve((Path) event.context());
if (filePath.equals(eventFilePath)) {
shouldResetContext = true;
}
Debug logs show why this conditional is never true:
2024-10-21T17:48:13,659 [FileChangeWatcher-server-chain.pem] DEBUG org.apache.hadoop.hbase.io.FileChangeWatcher: Got file changed event: ENTRY_CREATE with context: ..2024_10_21_17_48_13.2471317370
2024-10-21T17:48:13,659 [FileChangeWatcher-server-chain.pem] DEBUG org.apache.hadoop.hbase.io.FileChangeWatcher: Got file changed event: ENTRY_CREATE with context: ..2024_10_21_17_48_13.2471317370
2024-10-21T17:48:13,660 [FileChangeWatcher-server-chain.pem] DEBUG org.apache.hadoop.hbase.io.crypto.tls.X509Util: Ignoring watch event and keeping previous default SSL context. Event kind: ENTRY_CREATE with context: ..2024_10_21_17_48_13.2471317370
....
The watcher events have a variety of files attached to them, but none of them are server-chain.pem, so HBase thinks they are not relevant.
I propose that we simply remove the condition inspecting the file name that was changed, and always reload the SSL context if a watcher event fires. This may lead to unnecessary reloads, but that will be harmless.
cc @bbeaudreault
:confetti_ball: +1 overall
| Vote | Subsystem | Runtime | Logfile | Comment |
|---|---|---|---|---|
| +0 :ok: | reexec | 0m 47s | Docker mode activated. | |
| -0 :warning: | yetus | 0m 3s | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck | |
| _ Prechecks _ | ||||
| _ master Compile Tests _ | ||||
| +1 :green_heart: | mvninstall | 4m 13s | master passed | |
| +1 :green_heart: | compile | 0m 24s | master passed | |
| +1 :green_heart: | javadoc | 0m 24s | master passed | |
| +1 :green_heart: | shadedjars | 6m 0s | branch has no errors when building our shaded downstream artifacts. | |
| _ Patch Compile Tests _ | ||||
| +1 :green_heart: | mvninstall | 3m 23s | the patch passed | |
| +1 :green_heart: | compile | 0m 23s | the patch passed | |
| +1 :green_heart: | javac | 0m 23s | the patch passed | |
| +1 :green_heart: | javadoc | 0m 19s | the patch passed | |
| +1 :green_heart: | shadedjars | 6m 38s | patch has no errors when building our shaded downstream artifacts. | |
| _ Other Tests _ | ||||
| +1 :green_heart: | unit | 2m 32s | hbase-common in the patch passed. | |
| 26m 19s |
| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6381/1/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/6381 |
| JIRA Issue | HBASE-28931 |
| Optional Tests | javac javadoc unit compile shadedjars |
| uname | Linux e1e024f342bc 5.4.0-195-generic #215-Ubuntu SMP Fri Aug 2 18:28:05 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 976a2bed4cff5bf0c8d9a8d9afd129c545f8c0c9 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6381/1/testReport/ |
| Max. process+thread count | 410 (vs. ulimit of 30000) |
| modules | C: hbase-common U: hbase-common |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6381/1/console |
| versions | git=2.34.1 maven=3.9.8 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |
This message was automatically generated.
:confetti_ball: +1 overall
| Vote | Subsystem | Runtime | Logfile | Comment |
|---|---|---|---|---|
| +0 :ok: | reexec | 0m 35s | Docker mode activated. | |
| _ Prechecks _ | ||||
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. | |
| +0 :ok: | codespell | 0m 0s | codespell was not available. | |
| +0 :ok: | detsecrets | 0m 0s | detect-secrets was not available. | |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. | |
| +1 :green_heart: | hbaseanti | 0m 0s | Patch does not have any anti-patterns. | |
| _ master Compile Tests _ | ||||
| +1 :green_heart: | mvninstall | 4m 57s | master passed | |
| +1 :green_heart: | compile | 1m 3s | master passed | |
| +1 :green_heart: | checkstyle | 0m 21s | master passed | |
| +1 :green_heart: | spotbugs | 0m 54s | master passed | |
| +1 :green_heart: | spotless | 1m 3s | branch has no errors when running spotless:check. | |
| _ Patch Compile Tests _ | ||||
| +1 :green_heart: | mvninstall | 4m 36s | the patch passed | |
| +1 :green_heart: | compile | 1m 11s | the patch passed | |
| +1 :green_heart: | javac | 1m 11s | the patch passed | |
| +1 :green_heart: | blanks | 0m 0s | The patch has no blanks issues. | |
| +1 :green_heart: | checkstyle | 0m 24s | hbase-common: The patch generated 0 new + 0 unchanged - 1 fixed = 0 total (was 1) | |
| +1 :green_heart: | spotbugs | 1m 8s | the patch passed | |
| +1 :green_heart: | hadoopcheck | 17m 3s | Patch does not cause any errors with Hadoop 3.3.6 3.4.0. | |
| +1 :green_heart: | spotless | 0m 57s | patch has no errors when running spotless:check. | |
| _ Other Tests _ | ||||
| +1 :green_heart: | asflicense | 0m 12s | The patch does not generate ASF License warnings. | |
| 44m 0s |
| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6381/1/artifact/yetus-general-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/6381 |
| JIRA Issue | HBASE-28931 |
| Optional Tests | dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless |
| uname | Linux 860f3e36b0dc 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 976a2bed4cff5bf0c8d9a8d9afd129c545f8c0c9 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Max. process+thread count | 83 (vs. ulimit of 30000) |
| modules | C: hbase-common U: hbase-common |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6381/1/console |
| versions | git=2.34.1 maven=3.9.8 spotbugs=4.7.3 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |
This message was automatically generated.
I'm not sure if the extra reloads actually would be harmless. I've watched these logs before and noticed many ignored events. We need to better understand the impact of, for example, every connection getting a different ssl context. Because I've seen volumes of ignored events which could effectively cause close to that.
Cc @anmolnar @Apache9
@charlesconnell
If I remember correctly the Watcher object watches the folder, not the files, so if the folder contains lots of other files too, the impact of unnecessary reload events will be signifact.
Are you sure there's no way to match the events with the files somehow?
Is this the same issue which described in this blog: https://blog.arkey.fr/2019/09/13/watchservice-and-bind-mount/
Either way I like the approach (1) at the end of the article:
- Either poll the file regularly
We could implement that as a fallback option if WatchService doesn't work for some reason.
I would say the issue with Kubernetes secrets mounted in containers is a little bit different than the one mentioned in that blog post. However, simply polling the file of interest for changes is definitely a solution in both cases. The current FileChangeWatcher watches a directory, because that's the only behavior that Java's WatchService supports. But FileChangeWatcher is only used in one place, and it's used to effectively watch a single file. So, I've changed FileChangeWatcher to just poll the one file we care about. This simplifies the relevant code and fixes the issue I reported.
:confetti_ball: +1 overall
| Vote | Subsystem | Runtime | Logfile | Comment |
|---|---|---|---|---|
| +0 :ok: | reexec | 4m 4s | Docker mode activated. | |
| -0 :warning: | yetus | 0m 2s | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck | |
| _ Prechecks _ | ||||
| _ master Compile Tests _ | ||||
| +1 :green_heart: | mvninstall | 3m 3s | master passed | |
| +1 :green_heart: | compile | 0m 22s | master passed | |
| +1 :green_heart: | javadoc | 0m 19s | master passed | |
| +1 :green_heart: | shadedjars | 5m 23s | branch has no errors when building our shaded downstream artifacts. | |
| _ Patch Compile Tests _ | ||||
| +1 :green_heart: | mvninstall | 3m 1s | the patch passed | |
| +1 :green_heart: | compile | 0m 20s | the patch passed | |
| +1 :green_heart: | javac | 0m 20s | the patch passed | |
| +1 :green_heart: | javadoc | 0m 16s | the patch passed | |
| +1 :green_heart: | shadedjars | 5m 19s | patch has no errors when building our shaded downstream artifacts. | |
| _ Other Tests _ | ||||
| +1 :green_heart: | unit | 2m 29s | hbase-common in the patch passed. | |
| 25m 50s |
| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6381/2/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/6381 |
| JIRA Issue | HBASE-28931 |
| Optional Tests | javac javadoc unit compile shadedjars |
| uname | Linux 4f3adeebb868 5.4.0-195-generic #215-Ubuntu SMP Fri Aug 2 18:28:05 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / ac5328c69e7207b80d61d0551ef963349266413a |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6381/2/testReport/ |
| Max. process+thread count | 417 (vs. ulimit of 30000) |
| modules | C: hbase-common U: hbase-common |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6381/2/console |
| versions | git=2.34.1 maven=3.9.8 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |
This message was automatically generated.
:confetti_ball: +1 overall
| Vote | Subsystem | Runtime | Logfile | Comment |
|---|---|---|---|---|
| +0 :ok: | reexec | 0m 59s | Docker mode activated. | |
| _ Prechecks _ | ||||
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. | |
| +0 :ok: | codespell | 0m 1s | codespell was not available. | |
| +0 :ok: | detsecrets | 0m 1s | detect-secrets was not available. | |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. | |
| +1 :green_heart: | hbaseanti | 0m 0s | Patch does not have any anti-patterns. | |
| _ master Compile Tests _ | ||||
| +1 :green_heart: | mvninstall | 4m 37s | master passed | |
| +1 :green_heart: | compile | 1m 1s | master passed | |
| +1 :green_heart: | checkstyle | 0m 30s | master passed | |
| +1 :green_heart: | spotbugs | 0m 58s | master passed | |
| +1 :green_heart: | spotless | 0m 46s | branch has no errors when running spotless:check. | |
| _ Patch Compile Tests _ | ||||
| +1 :green_heart: | mvninstall | 2m 56s | the patch passed | |
| +1 :green_heart: | compile | 0m 42s | the patch passed | |
| +1 :green_heart: | javac | 0m 42s | the patch passed | |
| +1 :green_heart: | blanks | 0m 0s | The patch has no blanks issues. | |
| +1 :green_heart: | checkstyle | 0m 19s | hbase-common: The patch generated 0 new + 0 unchanged - 1 fixed = 0 total (was 1) | |
| +1 :green_heart: | spotbugs | 0m 51s | the patch passed | |
| +1 :green_heart: | hadoopcheck | 10m 40s | Patch does not cause any errors with Hadoop 3.3.6 3.4.0. | |
| +1 :green_heart: | spotless | 0m 44s | patch has no errors when running spotless:check. | |
| _ Other Tests _ | ||||
| +1 :green_heart: | asflicense | 0m 12s | The patch does not generate ASF License warnings. | |
| 32m 2s |
| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6381/2/artifact/yetus-general-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/6381 |
| JIRA Issue | HBASE-28931 |
| Optional Tests | dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless |
| uname | Linux e14b00b43e80 5.4.0-195-generic #215-Ubuntu SMP Fri Aug 2 18:28:05 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / ac5328c69e7207b80d61d0551ef963349266413a |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Max. process+thread count | 84 (vs. ulimit of 30000) |
| modules | C: hbase-common U: hbase-common |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6381/2/console |
| versions | git=2.34.1 maven=3.9.8 spotbugs=4.7.3 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |
This message was automatically generated.
Sorry for coming back too late here. Quick question: why have you replaced the WatchService implementation instead of introducing the polling mechanism as a new and configurable option?
No worries. My preference is for less code that Just Works for all use cases. My hope is that this PR will accomplish that. I don't think there is value in keeping a second implementation if there's not a clear reason to keep it.
Sorry for coming back too late here. Quick question: why have you replaced the WatchService implementation instead of introducing the polling mechanism as a new and configurable option?
No worries. My preference is for less code that Just Works for all use cases. My hope is that this PR will accomplish that. I don't think there is value in keeping a second implementation if there's not a clear reason to keep it.
I agree with that, but isn't WatchService more efficient?
It's based on inotify in Linux and wakes up the thread when event comes, instead of polling the FS every minute.
It's true, WatchService uses a little less CPU time. That's a trade-off.