Add runAs to Subject interface and introduce IdentityAwarePlugin extension point
Description
Companion Security PR: https://github.com/opensearch-project/security/pull/4665
This PR adds new method to the Subject interface called runAs and introduces a new extension point called IdentityAwarePlugin. This new method can be utilized to run a callable block of code in the context of the subject.
This PR also introduces the notion of an IdentityAwarePlugin which contains a single method to initialize the plugin. The IdentityPlugin will get to define which subject is passed to IdentityAwarePlugins on initialization. Practically, the Security Plugin is the IdentityPlugin and the subject passed to IdentityAwarePlugins is a special pluginSystemSubject which can be utilized to perform transport actions in the plugin system context and outside of the authenticated user context.
Subject.runAs() is intended to be an abstraction and replacement on the current pattern of System index interaction in which plugins will utilize the ThreadContext class and stash the context before performing transport actions on system indices.
With the changes in this PR, the NoopIdentityPlugin will assign a NoopPluginSubject to IdentityAwarePlugins. When using NoopPluginSubject.runAs(() -> { ... }) it runs the Callable within a block where the ThreadContext is stashed which is the same behavior plugins use today so its a direct replacement to try (ThreadContext.StoredContext ctx = threadContext.stashContext()) { ... }.
By itself, this PR is not very interesting, but with the changes in this PR it enables the Security plugin (the IdentityPlugin) to define an implementation for PluginSubject. If you look at the companion PR, the Security implementation of PluginSubject injects a pluginUser into the ThreadContext so that we can start using the existing security authz mechanisms to gate way plugins can do within the runAs block.
Related Issues
Related to: https://github.com/opensearch-project/security/issues/4439
Check List
- [X] Functionality includes testing.
- [X] API changes companion pull request created, if applicable.
- [X] Public documentation issue/PR created, if applicable.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.
:x: Gradle check result for 444fde76bca22def9ded0b3be82383a7cf2e08bf: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for aaba0633b70fb67c4d5573fd8f8d3133a78251db: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for eec8b96a1b94be99ea9619eb8bc9125b5faed3b6: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for eec8b96a1b94be99ea9619eb8bc9125b5faed3b6: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
@cwperks I have tons of questions regarding this change, it seems the scope of the change is to propagate the caller identity, is that right? If yes, we should:
- make it a first class citizen of the
ThreadContext, not limited to REST actions (support extensions, general thread pool scheduling, ...) - design it as a chain of callers instead, fe Plugin A -> Plugin B -> ... since in many cases plugin could call another plugin or extension
The way it is implemented now makes the ThreadContext mutable since anyone could just call ThreadContext::setExecutionContext any time, anywhere. Also, we have to think about propagating caller chains between plugin and extensions, the ThreadLocal won't work there
:grey_exclamation: Gradle check result for eec8b96a1b94be99ea9619eb8bc9125b5faed3b6: UNSTABLE
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.
Codecov Report
Attention: Patch coverage is 48.97959% with 25 lines in your changes missing coverage. Please review.
Project coverage is 71.92%. Comparing base (
b30df02) to head (681ea8f). Report is 1 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #14630 +/- ##
============================================
- Coverage 71.92% 71.92% -0.01%
- Complexity 63328 63357 +29
============================================
Files 5227 5231 +4
Lines 296497 296526 +29
Branches 42830 42832 +2
============================================
+ Hits 213248 213267 +19
+ Misses 65764 65676 -88
- Partials 17485 17583 +98
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
@reta Thanks for the review! I raised this PR to solicit comments just like what you raised.
You are right, that the ultimate goal is to keep track of which plugin is currently on the execution path and to create a generic mechanism to do that. This PR shows the use of the proxy pattern to show how it can keep track for RestHandler, but it doesn't cover other extension points. On my fork I created a proxy for ActionPlugin which shows how other extension points can be wrapped similarly to what is done in this PR.
FYI with this change execution context is immutable and follows the same pattern as the rest of the ThreadContext. If the execution context is already set, trying to override it will throw an IllegalArgumentException similar to ThreadContext.putSingleHeader.
The only instance I am aware of for plugin chaining is Job Scheduler like this example where GeoSpatial uses the JS LockService when creating a datasource via Rest API. What are some other examples of plugin chaining?
Thanks @cwperks
This PR shows the use of the proxy pattern to show how it can keep track for RestHandler, but it doesn't cover other extension points.
I think this is not a point, right? The problem is much wider than that.
FYI with this change execution context is immutable and follows the same pattern as the rest of the ThreadContext. If the execution context is already set, trying to override it will throw an IllegalArgumentException similar to ThreadContext.putSingleHeader.
No, it is not immutable, it could be changed any time, anywhere:
ThreadContext::clearExecutionContext();
ThreadContext::setExecutionContext("whatever I want here");
I think this is not a point, right? The problem is much wider than that.
It is, but RestHandlers are a good area to start to demonstrate this change since that is where most plugin system index interaction takes place. If we can address all extension points at once, that would be my preference but the scope of the change is much larger.
No, it is not immutable, it could be changed any time, anywhere:
My bad, in a previous iteration I had those methods as package-private. Will update this to not expose these to plugins.
@cwperks could you may be start with the issue first to describe what we would like to capture, why and under which circumstances? We could collect possible implementation options there and get feedback. Thank you.
@reta Definitely, I will open an issue in the core repo. This PR is related to this RFC in the security plugin: https://github.com/opensearch-project/security/issues/4439
Basically, I want to have a Stack keep track of which plugins have been delegated to from the core in a particular thread.
:x: Gradle check result for 8a986185cf9c46e1349fd0ad2d7058e27898be56: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
@reta here is the issue I promised: https://github.com/opensearch-project/OpenSearch/issues/14733
:x: Gradle check result for 73835edac35ffad741746d3420b6347291c7c10d: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for dc5c5a8ef85e06b76cfc61dab5535272321f287f: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for 70c0c800883701a78ab88bddc29b6c81abd72725: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for 4d048b74267da1ab9702c254c04c3d4c80ef806e: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for e26ac6ce7f4f35df483ec2c788325d677041248c: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:white_check_mark: Gradle check result for c2e0a645ebe2eeebc2e8b753f9aa884a8fa28040: SUCCESS
:x: Gradle check result for 55fe24c33b3cbd3aa26eee0de903eac39bb5df08: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
@reta FYI in case you are curious, I opened a draft PR in the security repo that demonstrates how the changes in this PR are intended to be used: https://github.com/opensearch-project/security/pull/4570
Let me know your thoughts.
@reta FYI in case you are curious, I opened a draft PR in the security repo that demonstrates how the changes in this PR are intended to be used: opensearch-project/security#4570
Thanks @cwperks , I will try to get back to the issue shortly and share some ideas, have difficulties with time right now, sorry about that
:grey_exclamation: Gradle check result for 1d7df430a657a37b937f6281f73be12deb54e130: UNSTABLE
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.
I would love a solution that is a viable alternative for plugins to have a clean access model even if they don't fix the fundamental access exposed to the thread context and root node client as those are fundamental breaking changes to the plugin APIs.
I think this would need to be addressed to solve the issue completely. How would a plugin be prevented from doing anything if it has access to the threadPool and root node client?
How would a plugin be prevented from doing anything if it has access to the threadPool and root node client?
Why they should have access to threadPool and node client by default? We do have clear means to solve that
:x: Gradle check result for ed7124739f1517c68bb57292c639a8fd2003f4ce: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for ed7124739f1517c68bb57292c639a8fd2003f4ce: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for 4b27f3a5a081d37d4f299f04bd90e5cbc62fa150: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for b031db9306089872bbf984b66e378790a3852061: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?