hbase icon indicating copy to clipboard operation
hbase copied to clipboard

HBASE-28931: Reload SSL context regardless of changed file path

Open charlesconnell opened this issue 1 year ago • 3 comments

At my company we have an issue with our HBase servers not reloading TLS certificate files after they change on disk. We run our HMasters inside Kubernetes Pods, and define our certificate contents as Kubernetes Secrets. Then, the Secrets are projected into the HMaster containers as files. When the value of a Secret changes, the file changes automatically. However, Kubernetes does some complicated indirection, and does not change the files directly. It swaps a new directory in with new files in it.

HBase sets up a WatchService on the directory containing the TLS cert. For example, at my company, the cert is at /etc/hadoop/conf/ssl/cert/server-chain.pem. Then, events from that WatchService are delivered to a handler method which contains this check:

Path eventFilePath = dirPath.resolve((Path) event.context());
if (filePath.equals(eventFilePath)) {
 shouldResetContext = true;
}

Debug logs show why this conditional is never true:

2024-10-21T17:48:13,659 [FileChangeWatcher-server-chain.pem] DEBUG org.apache.hadoop.hbase.io.FileChangeWatcher: Got file changed event: ENTRY_CREATE with context: ..2024_10_21_17_48_13.2471317370
2024-10-21T17:48:13,659 [FileChangeWatcher-server-chain.pem] DEBUG org.apache.hadoop.hbase.io.FileChangeWatcher: Got file changed event: ENTRY_CREATE with context: ..2024_10_21_17_48_13.2471317370
2024-10-21T17:48:13,660 [FileChangeWatcher-server-chain.pem] DEBUG org.apache.hadoop.hbase.io.crypto.tls.X509Util: Ignoring watch event and keeping previous default SSL context. Event kind: ENTRY_CREATE with context: ..2024_10_21_17_48_13.2471317370
.... 

The watcher events have a variety of files attached to them, but none of them are server-chain.pem, so HBase thinks they are not relevant.

I propose that we simply remove the condition inspecting the file name that was changed, and always reload the SSL context if a watcher event fires. This may lead to unnecessary reloads, but that will be harmless.

cc @bbeaudreault

charlesconnell avatar Oct 21 '24 20:10 charlesconnell

:confetti_ball: +1 overall

Vote Subsystem Runtime Logfile Comment
+0 :ok: reexec 0m 47s Docker mode activated.
-0 :warning: yetus 0m 3s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+1 :green_heart: mvninstall 4m 13s master passed
+1 :green_heart: compile 0m 24s master passed
+1 :green_heart: javadoc 0m 24s master passed
+1 :green_heart: shadedjars 6m 0s branch has no errors when building our shaded downstream artifacts.
_ Patch Compile Tests _
+1 :green_heart: mvninstall 3m 23s the patch passed
+1 :green_heart: compile 0m 23s the patch passed
+1 :green_heart: javac 0m 23s the patch passed
+1 :green_heart: javadoc 0m 19s the patch passed
+1 :green_heart: shadedjars 6m 38s patch has no errors when building our shaded downstream artifacts.
_ Other Tests _
+1 :green_heart: unit 2m 32s hbase-common in the patch passed.
26m 19s
Subsystem Report/Notes
Docker ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6381/1/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
GITHUB PR https://github.com/apache/hbase/pull/6381
JIRA Issue HBASE-28931
Optional Tests javac javadoc unit compile shadedjars
uname Linux e1e024f342bc 5.4.0-195-generic #215-Ubuntu SMP Fri Aug 2 18:28:05 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 976a2bed4cff5bf0c8d9a8d9afd129c545f8c0c9
Default Java Eclipse Adoptium-17.0.11+9
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6381/1/testReport/
Max. process+thread count 410 (vs. ulimit of 30000)
modules C: hbase-common U: hbase-common
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6381/1/console
versions git=2.34.1 maven=3.9.8
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

Apache-HBase avatar Oct 21 '24 21:10 Apache-HBase

:confetti_ball: +1 overall

Vote Subsystem Runtime Logfile Comment
+0 :ok: reexec 0m 35s Docker mode activated.
_ Prechecks _
+1 :green_heart: dupname 0m 0s No case conflicting files found.
+0 :ok: codespell 0m 0s codespell was not available.
+0 :ok: detsecrets 0m 0s detect-secrets was not available.
+1 :green_heart: @author 0m 0s The patch does not contain any @author tags.
+1 :green_heart: hbaseanti 0m 0s Patch does not have any anti-patterns.
_ master Compile Tests _
+1 :green_heart: mvninstall 4m 57s master passed
+1 :green_heart: compile 1m 3s master passed
+1 :green_heart: checkstyle 0m 21s master passed
+1 :green_heart: spotbugs 0m 54s master passed
+1 :green_heart: spotless 1m 3s branch has no errors when running spotless:check.
_ Patch Compile Tests _
+1 :green_heart: mvninstall 4m 36s the patch passed
+1 :green_heart: compile 1m 11s the patch passed
+1 :green_heart: javac 1m 11s the patch passed
+1 :green_heart: blanks 0m 0s The patch has no blanks issues.
+1 :green_heart: checkstyle 0m 24s hbase-common: The patch generated 0 new + 0 unchanged - 1 fixed = 0 total (was 1)
+1 :green_heart: spotbugs 1m 8s the patch passed
+1 :green_heart: hadoopcheck 17m 3s Patch does not cause any errors with Hadoop 3.3.6 3.4.0.
+1 :green_heart: spotless 0m 57s patch has no errors when running spotless:check.
_ Other Tests _
+1 :green_heart: asflicense 0m 12s The patch does not generate ASF License warnings.
44m 0s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6381/1/artifact/yetus-general-check/output/Dockerfile
GITHUB PR https://github.com/apache/hbase/pull/6381
JIRA Issue HBASE-28931
Optional Tests dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless
uname Linux 860f3e36b0dc 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 976a2bed4cff5bf0c8d9a8d9afd129c545f8c0c9
Default Java Eclipse Adoptium-17.0.11+9
Max. process+thread count 83 (vs. ulimit of 30000)
modules C: hbase-common U: hbase-common
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6381/1/console
versions git=2.34.1 maven=3.9.8 spotbugs=4.7.3
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

Apache-HBase avatar Oct 21 '24 21:10 Apache-HBase

I'm not sure if the extra reloads actually would be harmless. I've watched these logs before and noticed many ignored events. We need to better understand the impact of, for example, every connection getting a different ssl context. Because I've seen volumes of ignored events which could effectively cause close to that.

Cc @anmolnar @Apache9

bbeaudreault avatar Oct 22 '24 01:10 bbeaudreault

@charlesconnell

If I remember correctly the Watcher object watches the folder, not the files, so if the folder contains lots of other files too, the impact of unnecessary reload events will be signifact.

Are you sure there's no way to match the events with the files somehow?

anmolnar avatar Nov 05 '24 21:11 anmolnar

Is this the same issue which described in this blog: https://blog.arkey.fr/2019/09/13/watchservice-and-bind-mount/

Either way I like the approach (1) at the end of the article:

  1. Either poll the file regularly

We could implement that as a fallback option if WatchService doesn't work for some reason.

anmolnar avatar Nov 05 '24 21:11 anmolnar

I would say the issue with Kubernetes secrets mounted in containers is a little bit different than the one mentioned in that blog post. However, simply polling the file of interest for changes is definitely a solution in both cases. The current FileChangeWatcher watches a directory, because that's the only behavior that Java's WatchService supports. But FileChangeWatcher is only used in one place, and it's used to effectively watch a single file. So, I've changed FileChangeWatcher to just poll the one file we care about. This simplifies the relevant code and fixes the issue I reported.

charlesconnell avatar Nov 24 '24 14:11 charlesconnell

:confetti_ball: +1 overall

Vote Subsystem Runtime Logfile Comment
+0 :ok: reexec 4m 4s Docker mode activated.
-0 :warning: yetus 0m 2s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+1 :green_heart: mvninstall 3m 3s master passed
+1 :green_heart: compile 0m 22s master passed
+1 :green_heart: javadoc 0m 19s master passed
+1 :green_heart: shadedjars 5m 23s branch has no errors when building our shaded downstream artifacts.
_ Patch Compile Tests _
+1 :green_heart: mvninstall 3m 1s the patch passed
+1 :green_heart: compile 0m 20s the patch passed
+1 :green_heart: javac 0m 20s the patch passed
+1 :green_heart: javadoc 0m 16s the patch passed
+1 :green_heart: shadedjars 5m 19s patch has no errors when building our shaded downstream artifacts.
_ Other Tests _
+1 :green_heart: unit 2m 29s hbase-common in the patch passed.
25m 50s
Subsystem Report/Notes
Docker ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6381/2/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
GITHUB PR https://github.com/apache/hbase/pull/6381
JIRA Issue HBASE-28931
Optional Tests javac javadoc unit compile shadedjars
uname Linux 4f3adeebb868 5.4.0-195-generic #215-Ubuntu SMP Fri Aug 2 18:28:05 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / ac5328c69e7207b80d61d0551ef963349266413a
Default Java Eclipse Adoptium-17.0.11+9
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6381/2/testReport/
Max. process+thread count 417 (vs. ulimit of 30000)
modules C: hbase-common U: hbase-common
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6381/2/console
versions git=2.34.1 maven=3.9.8
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

Apache-HBase avatar Nov 24 '24 14:11 Apache-HBase

:confetti_ball: +1 overall

Vote Subsystem Runtime Logfile Comment
+0 :ok: reexec 0m 59s Docker mode activated.
_ Prechecks _
+1 :green_heart: dupname 0m 0s No case conflicting files found.
+0 :ok: codespell 0m 1s codespell was not available.
+0 :ok: detsecrets 0m 1s detect-secrets was not available.
+1 :green_heart: @author 0m 0s The patch does not contain any @author tags.
+1 :green_heart: hbaseanti 0m 0s Patch does not have any anti-patterns.
_ master Compile Tests _
+1 :green_heart: mvninstall 4m 37s master passed
+1 :green_heart: compile 1m 1s master passed
+1 :green_heart: checkstyle 0m 30s master passed
+1 :green_heart: spotbugs 0m 58s master passed
+1 :green_heart: spotless 0m 46s branch has no errors when running spotless:check.
_ Patch Compile Tests _
+1 :green_heart: mvninstall 2m 56s the patch passed
+1 :green_heart: compile 0m 42s the patch passed
+1 :green_heart: javac 0m 42s the patch passed
+1 :green_heart: blanks 0m 0s The patch has no blanks issues.
+1 :green_heart: checkstyle 0m 19s hbase-common: The patch generated 0 new + 0 unchanged - 1 fixed = 0 total (was 1)
+1 :green_heart: spotbugs 0m 51s the patch passed
+1 :green_heart: hadoopcheck 10m 40s Patch does not cause any errors with Hadoop 3.3.6 3.4.0.
+1 :green_heart: spotless 0m 44s patch has no errors when running spotless:check.
_ Other Tests _
+1 :green_heart: asflicense 0m 12s The patch does not generate ASF License warnings.
32m 2s
Subsystem Report/Notes
Docker ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6381/2/artifact/yetus-general-check/output/Dockerfile
GITHUB PR https://github.com/apache/hbase/pull/6381
JIRA Issue HBASE-28931
Optional Tests dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless
uname Linux e14b00b43e80 5.4.0-195-generic #215-Ubuntu SMP Fri Aug 2 18:28:05 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / ac5328c69e7207b80d61d0551ef963349266413a
Default Java Eclipse Adoptium-17.0.11+9
Max. process+thread count 84 (vs. ulimit of 30000)
modules C: hbase-common U: hbase-common
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6381/2/console
versions git=2.34.1 maven=3.9.8 spotbugs=4.7.3
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

Apache-HBase avatar Nov 24 '24 14:11 Apache-HBase

Sorry for coming back too late here. Quick question: why have you replaced the WatchService implementation instead of introducing the polling mechanism as a new and configurable option?

No worries. My preference is for less code that Just Works for all use cases. My hope is that this PR will accomplish that. I don't think there is value in keeping a second implementation if there's not a clear reason to keep it.

charlesconnell avatar Dec 20 '24 18:12 charlesconnell

Sorry for coming back too late here. Quick question: why have you replaced the WatchService implementation instead of introducing the polling mechanism as a new and configurable option?

No worries. My preference is for less code that Just Works for all use cases. My hope is that this PR will accomplish that. I don't think there is value in keeping a second implementation if there's not a clear reason to keep it.

I agree with that, but isn't WatchService more efficient? It's based on inotify in Linux and wakes up the thread when event comes, instead of polling the FS every minute.

anmolnar avatar Dec 20 '24 18:12 anmolnar

It's true, WatchService uses a little less CPU time. That's a trade-off.

charlesconnell avatar Dec 21 '24 01:12 charlesconnell