OpenSearch
OpenSearch copied to clipboard
add request labeling service/framework
Description
This PR provides the foundational support to add multitenant labels to ThreadContext. This foundation provides a way to implement the LabelingPlugin to define how to compute these labels using implicit/explicit attributes.
Related Issues
Resolves #13341
Check List
- [X] Functionality includes testing.
- [ ] API changes companion pull request created, if applicable.
- [ ] Public documentation issue/PR created, if applicable.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.
:x: Gradle check result for 4c3bdd7c69fed70cfa807b41ecda405301cf813a: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
@msfroh @sohami @jainankitk @reta Can you look into this PR ?
@msfroh @sohami @jainankitk @reta
@kaushalmahi12 @ansjcy I honestly don't understand the design decisions of this change:
- we talked about a possibility to associate tags (key=value pairs) with any REST request,
ActionPlugin::getRestHandlerWrapperdoes that - we talked about a possibility to submit tags (key=value pairs) with any REST request,
ActionPlugin::getRestHeadersdoes that - we talked about a possibility to propagate the the tags (key=value pairs) down the call chain and node boundaries,
ThreadContextStatePropagatordoes that
We have all the pieces already (I believe), they just have to be linked together.
@reta yes, we can utilize ActionPlugin to associate tags with a request body/header, which can solve the first approach (Let the client do it) in @msfroh 's RFC (https://github.com/opensearch-project/OpenSearch/issues/13341). For this PR and also this one (https://github.com/opensearch-project/OpenSearch/pull/14282) we are trying to explore potential solutions to add tags based on rules, which solve the second approach (Rule-based labeling).
So, there can be two different approaches to do rule-based labeling, one is let the plugins expose their rules and the core service apply the rules for each request if possible (https://github.com/opensearch-project/OpenSearch/pull/14282). Another approach is using plugin interface to make it more generic, especially for Query Grouping features use cases (@kaushalmahi12 can explain more about the advantages here).
So, there can be two different approaches to do rule-based labeling, one is let the plugins expose their rules and the core service apply the rules for each request if possible (#14282). Another approach is using plugin interface to make it more generic, especially for Query Grouping features use cases (@kaushalmahi12 can explain more about the advantages here).
@ansjcy my point is - there are all the tools in the disposable, there is not need to build framework or alike in core (in my opinion), it is sufficient to have separate plugin (labeling-plugin or tagging-plugin, you name it) that would do labels based on rules or other criteria.
@reta Thanks for providing detailed response. I agree that existing constructs might be enough for majority of the use cases but it will not suffice for something which want to classify/tag requests at co-ordinator and data node level.
For QueryGrouping (Query Sandboxing previously) we need to label the request at both levels for resource tracking and enforcement purposes.
I see that the ActionPlugin::getRestHandlerWrapper might do the job for co-ordinator requests but Based on the javadoc of this method only one installed plugin can implement this.
/**
* Returns a function used to wrap each rest request before handling the request.
* The returned {@link UnaryOperator} is called for every incoming rest request and receives
* the original rest handler as it's input. This allows adding arbitrary functionality around
* rest request handlers to do for instance logging or authentication.
* A simple example of how to only allow GET request is here:
* <pre>
* {@code
* UnaryOperator<RestHandler> getRestHandlerWrapper(ThreadContext threadContext) {
* return originalHandler -> (RestHandler) (request, channel, client) -> {
* if (request.method() != Method.GET) {
* throw new IllegalStateException("only GET requests are allowed");
* }
* originalHandler.handleRequest(request, channel, client);
* };
* }
* }
* </pre>
*
* Note: Only one installed plugin may implement a rest wrapper.
*/
default UnaryOperator<RestHandler> getRestHandlerWrapper(ThreadContext threadContext) {
return null;
}
The implementation here should ideally perform authN/authZ I think, ref: https://github.com/opensearch-project/security/blob/main/src/main/java/org/opensearch/security/OpenSearchSecurityPlugin.java#L666.
@reta Thanks for providing detailed response. I agree that existing constructs might be enough for majority of the use cases but it will not suffice for something which want to classify/tag requests at co-ordinator and data node level.
@kaushalmahi12 if I am not mistaken, the headers are propagated from REST (coordinator) to transport layer and we support that, what is missing here to classify/tag requests at co-ordinator and data node level?
I see that the ActionPlugin::getRestHandlerWrapper might do the job for co-ordinator requests but Based on the javadoc of this method only one installed plugin can implement this.
I completely missed that part, thank you for correcting me, the ActionFilter should be the right extension point I believe (same ActionPlugin), filters have no such constraints.
default List<ActionFilter> getActionFilters() {
@kaushalmahi12 - It seems we are doing too much in single PR which is probably adding to the confusion. Ideally there should be two separate PRs, one adding the support for labeling extensions within core, second one demonstrating the usage of LabelingService from QueryGroupPlugin
@reta Yes! I was also thinking of using ActionFilter first but since ActionFilter is used to allow/disallow transport actions.
Do you think this would be the right construct to leverage this use case ?
In some use cases we might also need the proper ordering for the execution of these ActionFilters which could be imposed using the int order() method.
IMO This approach provides more flexibility on how and when a plugin can label/tag the requests.
@reta Yes! I was also thinking of using
ActionFilterfirst but sinceActionFilteris used to allow/disallow transport actions. Do you think this would be the right construct to leverage this use case ?
I think so, it looks right for a job
In some use cases we might also need the proper ordering for the execution of these ActionFilters which could be imposed using the int order() method.
We do not provide any ordering guarantees and we could not fulfill this requirement fairly. Why do you need ordering?
@reta We will need the ordering because we want the authN/authZ information present in the context. That is why we want this logic to be executed after the authN/authZ filter.
Now that I remember the ActionFilter might not do the job since we wouldn't have access to the ThreadPool/ThreadContext in this apply method to assign labels for QueryGrouping use case.
@reta We will need the ordering because we want the authN/authZ information present in the context
@kaushalmahi12 the use case does not feel right, we have security plugin in place to handle such cases and it uses the proper abstraction
@reta We will need the ordering because we want the authN/authZ information present in the context.
@kaushalmahi12 - I don't think we should enforce the ordering assumption between plugins.
Now that I remember the ActionFilter might not do the job since we wouldn't have access to the ThreadPool/ThreadContext in this apply method to assign labels for QueryGrouping use case.
This is a limitation of ActionFilter which makes it difficult to leverage for adding label to search request.
@kaushalmahi12 the use case does not feel right, we have security plugin in place to handle such cases and it uses the proper abstraction
@reta - Labeling logic should not be part of security plugin IMO, and others plugins might not have access to the security context. Do you have any suggestions other than building this in core?
Now that I remember the
ActionFiltermight not do the job since we wouldn't have access to theThreadPool/ThreadContextin this apply method to assign labels forQueryGroupinguse case.
@kaushalmahi12 as per my understanding, we enrich request with the headers (tags) which would be propagated to thread context (using ThreadContextStatePropagator).
@reta - Labeling logic should not be part of security plugin IMO, and others plugins might not have access to the security context. Do you have any suggestions other than building this in core?
@jainankitk I don't understand the dependency on authZ/authN and need for ordering here: we run action filters one by one so the end result should be the (thread) context populated with all necessary data before the request is processed on coordinator.
@jainankitk I don't understand the dependency on authZ/authN and need for ordering here:
@reta - The labeling logic will be based on the security context. For example - if the principal is Joe, the tenant is A. If the principal is Dave, tenant is B.
we run action filters one by one so the end result should be the (thread) context populated with all necessary data before the request is processed on coordinator.
As mentioned above, the labeling action filter will need the principal information provided by the security plugin for populating the tenant header.
@reta - The labeling logic will be based on the security context. For example - if the principal is Joe, the tenant is A. If the principal is Dave, tenant is B.
@jainankitk here is my understanding of the flow:
- the
securityplugin is using thegetRestHandlerWrapperand will be run before filters (if I am not mistaken), the principal / tenant will be present in the context if available - the
securityplugin could be removed from the cluster so the principal / tenant will not present in the context
If I am not mistaken, the security plugin uses thread context to share principal information, so the labeling / tagging plugin could use it.
This PR is stalled because it has been open for 30 days with no activity.