OpenSearch
[RFC] Cloud Native SQL Plugin
Is your feature request related to a problem? Please describe
I want to run the SQL plugin in a cloud native environment, but I am unable to do so without forking the code.
Today, plugin execution is tightly coupled to OpenSearch core. Access to metadata for indices, aliases, templates, node roles, and routing is tightly coupled with cluster state. Furthermore, plugins are given full access to OpenSearch core, including direct access to cluster state internals. This makes it difficult to evolve plugin APIs to be friendlier to cloud native providers, and eventually to run plugins in a stateless mode.
Describe the solution you'd like
I would like to propose an interface for plugins to access OpenSearch core. The interface should expose only the minimal, fine-grained state management operations and the other operations needed to support plugins. This will ensure that both cluster and cloud native implementations can support the plugin uniformly.
Then the SQL plugin will migrate to this interface and be enabled to run in cloud native environments.
Related component
Plugins
Describe alternatives you've considered
No response
Additional context
FAQ
How would the community benefit from this?
- The work here is foundational and applicable to other plugins. The plugin interface can be expanded to enable other plugins to become cloud native.
- This proposal aids in decoupling plugins from OpenSearch Core by restricting their core access to an abstract interface. Having this interface enables:
  - The possibility of alternative implementations of core APIs.
  - Improved stability guarantees for the plugin APIs.
  - Complete auditability of a plugin’s access to core APIs via this interface.
  - Alternative runtime environments that run plugins unmodified.
What other RFCs does this align with?
- https://github.com/opensearch-project/OpenSearch/issues/13197
  - This work builds upon the RFC proposed for cluster state management. Just as cluster state management is being refactored to enable cloud native environments, so will plugin-to-core access.
- https://github.com/opensearch-project/OpenSearch/issues/8110
  - In addition to code separation, a standard for plugin-to-core interaction is now defined to make core module implementations swappable.
- https://github.com/opensearch-project/security/issues/2860
  - In a future where all plugin-to-core access is done via the interface, security negotiation can be mediated by the interface, eventually leading to the deprecation of the mutable thread context for security management.
I like this proposal because it's effectively identical to https://github.com/opensearch-project/opensearch-sdk-java except that you are offering to refactor core vs. exposing a new interface on top of core. Generally, refactoring core is a costly undertaking, everything takes 10x more work. We're seeing that even relatively small changes to enable things like #10684 require weeks of iterating, just to get gradle checks to pass. You also will need to deal with things like settings, therefore the proposed plugin interface should include that (and possibly other things).
Wouldn't it be easier to make https://github.com/opensearch-project/opensearch-sdk-java the plugin interface you're proposing and implement the SQL plugin on top of that? If you shortcut/remove/bypass all the remote aspects of that SDK, you effectively have a clean plugin interface, and you can iterate on that completely independently from core.
@dblock's comments about SDK prompted me to think of the fact that we really wanted to implement those using OpenSearch Java Client. That continues on to the whole conversation around generated code and reference specifications.
Which brings me to the question: do we need to write these interfaces or can we generate them from the OpenAPI spec?
Which brings me to the question: do we need to write these interfaces or can we generate them from the OpenAPI spec?
I think we can. It's a ton of boilerplate. That project has moved ahead quite well!
https://github.com/opensearch-project/opensearch-api-specification
@Xtansia is working on a java generator from API, it could generate anything
Hey folks! Thanks for the comments!
To summarize my understanding:
@dblock's feedback is that we should use the opensearch-sdk-java as our target for a cloud native sql plugin. This helps in two ways:
- It avoids needing to refactor core allowing for faster iteration time
- We would end up defining an interface which is already similar to the SDK
@dbwiddis' feedback is that we should generate the core api interface using the API spec. This helps to avoid manually maintaining the java code for interfacing with core.
I've taken a look at the opensearch-sdk-java and I do believe this would offer us a path forward if we invest a bit more into it. There are two main challenges which I see:
- Today there's no in-process version of Extensions. I'd like to run the code in the same process to avoid the performance overhead of networking. We would need to prioritize this issue https://github.com/opensearch-project/opensearch-sdk-java/issues/688
- The SDKClient which the opensearch-sdk-java exposes to Extensions is a concrete class which leaves little flexibility for myself to inject custom cloud native behavior. An interface would allow me that flexibility. Similar to the client generation, we could also generate this interface and completely rely on the spec for its definition.
In a similar vein, a solution which I have considered is migrating the SQL plugin to the Client interface rather than the concrete NodeClient. This would enable me to inject custom logic to my Client subclass to support the plugin.
I see this as a similar approach to the opensearch-sdk-java as both solutions are API based and would allow for functionality being swapped at an API action level.
Given that this Client is already in core and exposed to plugins, I would not expect additional core refactoring to enable this solution (except for below).
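To make the idea of swapping behavior behind an interface concrete, here is a minimal sketch. It does not use the real OpenSearch `Client` or `NodeClient` types; the `Client` interface, `ClusterClient`, `CloudNativeClient`, and `runQuery` below are hypothetical stand-ins chosen only for illustration.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

final class ClientSwapSketch {
    // Hypothetical stand-in for the interface that plugin code would depend on.
    interface Client {
        CompletionStage<List<String>> search(String index, String query);
    }

    // Stand-in for the cluster-backed implementation (the role a concrete node client plays).
    static final class ClusterClient implements Client {
        @Override
        public CompletionStage<List<String>> search(String index, String query) {
            return CompletableFuture.completedFuture(List.of("cluster:" + index + ":" + query));
        }
    }

    // Stand-in for a cloud native implementation with custom behavior injected.
    static final class CloudNativeClient implements Client {
        @Override
        public CompletionStage<List<String>> search(String index, String query) {
            return CompletableFuture.completedFuture(List.of("remote:" + index + ":" + query));
        }
    }

    // Plugin-side code: it only sees the interface, so either implementation can be supplied.
    static List<String> runQuery(Client client, String index, String query) {
        return client.search(index, query).toCompletableFuture().join();
    }
}
```

Because `runQuery` is written against the interface only, the choice between the two implementations becomes a wiring decision made outside the plugin.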
@dblock @saratvemulapalli I want to understand another point: I see in the opensearch-sdk-java that the Client interface is intended for deprecation. Could you help me understand why it's being considered for deprecation? What are its downsides? I assume the replacement would be a generated interface/client?
To add to the API discussion: I've been having discussions with @shwetathareja on decomposing cluster state access in core, and we feel that the existing public APIs are not sufficiently fine-grained for accessing cluster state components.
To be compatible with her RFC here: https://github.com/opensearch-project/OpenSearch/issues/13197 we would need to introduce or extend the existing cluster state API so that we may access the following (not an exhaustive list, this is just for SQL):
- Individual indices metadata
- Individual index settings
- Individual cluster setting values
Fine-grained access is particularly important for cluster state because:
- Cluster state can be large. Thus, accessing these APIs, even locally, can have a performance overhead
- The full cluster state may not be supported in cloud native environments (e.g., the routing table may not be supported)
These API changes would be applicable to both the SDK and Client based approaches.
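The kind of fine-grained access listed above could look roughly like the following sketch. `MetadataReader` and `InMemoryReader` are hypothetical names, not existing OpenSearch APIs; the point is only that a caller asks for a single value and never receives the whole cluster state.

```java
import java.util.Map;
import java.util.Optional;

final class FineGrainedStateSketch {
    // Hypothetical narrow accessor: one value per call, no full cluster state object.
    interface MetadataReader {
        Optional<String> indexSetting(String index, String key);
        Optional<String> clusterSetting(String key);
    }

    // Toy in-memory implementation. A cloud native provider could instead back
    // the same interface with a remote metadata store.
    static final class InMemoryReader implements MetadataReader {
        private final Map<String, Map<String, String>> indexSettings;
        private final Map<String, String> clusterSettings;

        InMemoryReader(Map<String, Map<String, String>> indexSettings, Map<String, String> clusterSettings) {
            this.indexSettings = indexSettings;
            this.clusterSettings = clusterSettings;
        }

        @Override
        public Optional<String> indexSetting(String index, String key) {
            // Empty when either the index or the setting is unknown.
            return Optional.ofNullable(indexSettings.get(index)).map(settings -> settings.get(key));
        }

        @Override
        public Optional<String> clusterSetting(String key) {
            return Optional.ofNullable(clusterSettings.get(key));
        }
    }
}
```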
Looking forward to hearing your thoughts!
Thanks for keeping an open mind @fddattal!
In picking similar-ish solutions in terms of complexity, risk and cost I would always take the one that's viable most long term.
Today there's no in-process version of Extensions.
Correct. There's probably a very minimal solution that assumes that both sides are the same version of the message being sent and passes it in-memory rather than through serialization/deserialization.
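As a rough illustration of that minimal solution, the sketch below puts two transports behind one interface: one round-trips every message through serialization, the other hands the object over in memory. All names (`Transport`, `Message`, `SerializingTransport`, `InMemoryTransport`) are hypothetical and unrelated to the actual SDK classes; Java serialization is used only as a convenient stand-in for a wire format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

final class InProcessTransportSketch {
    // Hypothetical message type exchanged between core and an extension.
    static final class Message implements Serializable {
        private static final long serialVersionUID = 1L;
        final String body;
        Message(String body) { this.body = body; }
    }

    interface Transport {
        Message send(Message request, Function<Message, Message> handler);
    }

    // Out-of-process style: request and response are both copied through bytes.
    static final class SerializingTransport implements Transport {
        @Override
        public Message send(Message request, Function<Message, Message> handler) {
            return copy(handler.apply(copy(request)));
        }

        private static Message copy(Message message) {
            try {
                ByteArrayOutputStream bytes = new ByteArrayOutputStream();
                try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                    out.writeObject(message);
                }
                try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                    return (Message) in.readObject();
                }
            } catch (IOException | ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    // In-process style: both sides share one message version, so the object is passed as-is.
    static final class InMemoryTransport implements Transport {
        @Override
        public Message send(Message request, Function<Message, Message> handler) {
            return handler.apply(request);
        }
    }
}
```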
@dblock @saratvemulapalli I want to understand another point: I see in the opensearch-sdk-java that the Client interface is intended for deprecation. Could you help me understand why it's being considered for deprecation? What are its downsides? I assume the replacement would be a generated interface/client?
I think you are looking for https://github.com/opensearch-project/opensearch-sdk-java/issues/732 that's all @dbwiddis.
Hey folks! I've taken a look at https://github.com/opensearch-project/OpenSearch/issues/13336 and I think that there's a shared solution that would position us well in the long term for a full extension migration.
Proposal
What I am proposing is the following:
- We introduce a new client interface which plugins and extensions will use to interact with core.
- The interface itself can be maintained as a separate library or module.
- Data exchange over the interface is done using simple Java Pojo's.
- The interface is asynchronous.
- The library would maintain all the Pojo's for its Request/Response objects.
- OpenSearch and cloud native environments may provide alternative implementations of this interface to suit their needs.
Tenets
- The API definition should not depend on core
- The API should function using message passing "as-if" executing remotely
- Internal communication details are abstracted by the interface
Interface Definition
package org.opensearch.sdk;
import org.opensearch.sdk.model.ClearScrollRequest;
import org.opensearch.sdk.model.ClearScrollResponse;
import org.opensearch.sdk.model.DeleteCustomRequest;
import org.opensearch.sdk.model.DeleteCustomResponse;
import org.opensearch.sdk.model.GetCustomRequest;
import org.opensearch.sdk.model.GetCustomResponse;
import org.opensearch.sdk.model.GetIndexMappingsRequest;
import org.opensearch.sdk.model.GetIndexMappingsResponse;
import org.opensearch.sdk.model.GetIndexSettingRequest;
import org.opensearch.sdk.model.GetIndexSettingResponse;
import org.opensearch.sdk.model.GetSystemSettingRequest;
import org.opensearch.sdk.model.GetSystemSettingResponse;
import org.opensearch.sdk.model.MultiSearchRequest;
import org.opensearch.sdk.model.MultiSearchResponse;
import org.opensearch.sdk.model.PutCustomRequest;
import org.opensearch.sdk.model.PutCustomResponse;
import org.opensearch.sdk.model.ResolveIndicesAndAliasesRequest;
import org.opensearch.sdk.model.ResolveIndicesAndAliasesResponse;
import org.opensearch.sdk.model.ScrollRequest;
import org.opensearch.sdk.model.SearchRequest;
import org.opensearch.sdk.model.SearchResponse;
import java.util.concurrent.CompletionStage;
public interface Client {
    // for sql plugin
    CompletionStage<SearchResponse> search(SearchRequest request);
    CompletionStage<SearchResponse> scroll(ScrollRequest request);
    CompletionStage<ClearScrollResponse> clearScroll(ClearScrollRequest request);
    CompletionStage<GetIndexMappingsResponse> getIndexMappings(GetIndexMappingsRequest request);
    CompletionStage<GetIndexSettingResponse> getIndexSetting(GetIndexSettingRequest request);
    CompletionStage<GetSystemSettingResponse> getSystemSetting(GetSystemSettingRequest request);
    CompletionStage<ResolveIndicesAndAliasesResponse> resolveIndicesAndAliases(ResolveIndicesAndAliasesRequest request);
    CompletionStage<MultiSearchResponse> multiSearch(MultiSearchRequest request);

    // for data store interface
    CompletionStage<PutCustomResponse> putCustom(PutCustomRequest request);
    CompletionStage<GetCustomResponse> getCustom(GetCustomRequest request);
    CompletionStage<DeleteCustomResponse> deleteCustom(DeleteCustomRequest request);
}
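For readers unfamiliar with `CompletionStage`, the sketch below shows how a caller might chain `search` and `scroll` calls against an interface of this shape. The request/response classes here are heavily simplified stand-ins (their fields are assumptions, not the proposed `org.opensearch.sdk.model` types), and `PagedClient` is a toy implementation that serves fixed pages.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

final class AsyncClientUsageSketch {
    // Simplified stand-ins for the model classes; the fields are assumptions.
    static final class SearchRequest { final String index; SearchRequest(String index) { this.index = index; } }
    static final class ScrollRequest { final String scrollId; ScrollRequest(String scrollId) { this.scrollId = scrollId; } }
    static final class SearchResponse {
        final List<String> hits;
        final String scrollId;
        SearchResponse(List<String> hits, String scrollId) { this.hits = hits; this.scrollId = scrollId; }
    }

    // Reduced version of the interface: only the two calls used below.
    interface Client {
        CompletionStage<SearchResponse> search(SearchRequest request);
        CompletionStage<SearchResponse> scroll(ScrollRequest request);
    }

    // Toy implementation: the scroll id is simply the index of the next page.
    static final class PagedClient implements Client {
        private final List<List<String>> pages;
        PagedClient(List<List<String>> pages) { this.pages = pages; }

        @Override public CompletionStage<SearchResponse> search(SearchRequest request) { return page(0); }
        @Override public CompletionStage<SearchResponse> scroll(ScrollRequest request) { return page(Integer.parseInt(request.scrollId)); }

        private CompletionStage<SearchResponse> page(int i) {
            List<String> hits = i < pages.size() ? pages.get(i) : List.of();
            return CompletableFuture.completedFuture(new SearchResponse(hits, String.valueOf(i + 1)));
        }
    }

    // Collects every hit by chaining scroll calls until an empty page arrives.
    static CompletionStage<List<String>> fetchAll(Client client, String index) {
        return client.search(new SearchRequest(index))
            .thenCompose(first -> drain(client, first, new ArrayList<>()));
    }

    // Recursion is fine for a sketch; a real caller would bound depth or iterate.
    private static CompletionStage<List<String>> drain(Client client, SearchResponse page, List<String> collected) {
        if (page.hits.isEmpty()) {
            return CompletableFuture.completedFuture(collected);
        }
        collected.addAll(page.hits);
        return client.scroll(new ScrollRequest(page.scrollId))
            .thenCompose(next -> drain(client, next, collected));
    }
}
```

Nothing in `fetchAll` blocks a thread; the caller decides whether to block (for example with `toCompletableFuture().join()`) or to keep composing.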
Pros / Cons
Pros:
- This proposal gives us the flexibility to build a solution that would work for OpenSearch and cloud native environments
- It is aligned with efforts to decompose cluster state - https://github.com/opensearch-project/OpenSearch/issues/13197
- The interface can be maintained outside of the core allowing us to develop it independently
Cons:
- This will introduce a competing standard and require plugin migration
- Additional effort would be needed to enhance the interface to expose new core capabilities.
- Albeit this can be seen as a good thing because it would be a forcing function for API evolution stewardship
- The implementers of the interface (core, cloud native) would need to perform object transformation to convert from the API Pojo to their internal core representations
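The object transformation mentioned in the last point might look like this in its simplest form. Both classes are hypothetical: `ApiSearchRequest` stands in for an SDK-side POJO and `CoreSearchRequest` for an internal core representation; neither mirrors the real types.

```java
import java.util.Arrays;
import java.util.List;

final class TranslationSketch {
    // Hypothetical API-side POJO: immutable, no dependency on core types.
    static final class ApiSearchRequest {
        final List<String> indices;
        final int size;
        ApiSearchRequest(List<String> indices, int size) {
            this.indices = List.copyOf(indices);
            this.size = size;
        }
    }

    // Hypothetical core-internal representation with a different shape.
    static final class CoreSearchRequest {
        final String[] indices;
        final int size;
        CoreSearchRequest(String[] indices, int size) {
            this.indices = indices.clone();
            this.size = size;
        }
    }

    // Each implementer of the interface would own a pair of converters like these.
    static CoreSearchRequest toCore(ApiSearchRequest api) {
        return new CoreSearchRequest(api.indices.toArray(new String[0]), api.size);
    }

    static ApiSearchRequest fromCore(CoreSearchRequest core) {
        return new ApiSearchRequest(Arrays.asList(core.indices), core.size);
    }
}
```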
What's Not Addressed Here
- Versioning - The API itself is not currently versioned, but the API could be extended to add versioning at either the client or api action level.
- Cross-plugin/module calls - Currently there's no way for a plugin or module to expose api actions for other plugins or modules to consume. We could expose a mechanism to register api action handlers with the client and execute those in a way similar to org.opensearch.Client#execute(ActionType, Request, ActionListener).
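One possible shape for such a handler registration mechanism is sketched below. `ActionKey`, `register`, and `execute` are hypothetical names loosely inspired by the action-type idea; this is not an existing OpenSearch API.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class ActionRegistrySketch {
    // Hypothetical typed key that ties an action name to its request/response types.
    static final class ActionKey<Req, Resp> {
        final String name;
        ActionKey(String name) { this.name = name; }
    }

    private final Map<String, Function<Object, CompletionStage<Object>>> handlers = new ConcurrentHashMap<>();

    // A plugin or module registers a handler for an action it wants to expose.
    public <Req, Resp> void register(ActionKey<Req, Resp> key, Function<Req, CompletionStage<Resp>> handler) {
        @SuppressWarnings("unchecked")
        Function<Object, CompletionStage<Object>> erased =
            request -> handler.apply((Req) request).thenApply(response -> (Object) response);
        if (handlers.putIfAbsent(key.name, erased) != null) {
            throw new IllegalStateException("duplicate action: " + key.name);
        }
    }

    // Another plugin executes the action through the registry without knowing who implements it.
    @SuppressWarnings("unchecked")
    public <Req, Resp> CompletionStage<Resp> execute(ActionKey<Req, Resp> key, Req request) {
        Function<Object, CompletionStage<Object>> handler = handlers.get(key.name);
        if (handler == null) {
            return CompletableFuture.failedFuture(new IllegalArgumentException("no handler for " + key.name));
        }
        return handler.apply(request).thenApply(response -> (Resp) response);
    }
}
```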
@fddattal I like this. Do you think it makes sense to reuse https://github.com/opensearch-project/opensearch-sdk-java for that and possibly produce multiple artifacts (that eventually merge)?
@dblock I like that approach and it's certainly feasible to do so. We would modify the SDK package to be a multi-module gradle project, and the existing extension code would become one module, while the API would be another. Each releasing their own artifacts.
I've put together a POC for this approach here:
- https://github.com/fddattal/opensearch-sdk-java
- https://github.com/fddattal/OpenSearch/tree/sdk_in_core_2_12_v2
- https://github.com/fddattal/sql/tree/sql_on_sdk_2_12_0
I was able to:
- Refactor the opensearch-sdk-java and consume it in sql and implement the new api interface in core.
- Refactor enough of the SQL core and implement the SDK Client API to get a simple SQL select working
I found I spent a lot of time creating the model objects and writing the code to convert between the model and core data structures. I think having some standard tooling in the API to aid with this would help drive adoption.
Data exchange over the interface is done using simple Java Pojo's.
@fddattal : one clarification, these Pojo are re-using existing OpenSearch cluster state sub objects/ data structures or altogether new Pojo's
one clarification, these Pojo are re-using existing OpenSearch cluster state sub objects/ data structures or altogether new Pojo's
@shwetathareja These Pojos are new altogether and would live outside of core in the SDK package.
Next Steps
Hey folks, I don't see any blocking concerns so I am moving ahead with a full implementation.
Below is what to expect regarding high level changes. Please let me know if you have any concerns!
Investigation Work
- Investigate the level of effort required to fully refactor SQL to only use the SDK Pojo's throughout versus translating to SDK Pojo's at the API layer. Today the plugin is using core's objects which may contain features not supported by the SDK APIs. Depending on how this goes it would inform us on whether we need to add a generic request/response transformation layer in the SDK API now versus simply have the sql plugin migrate. A generic request transformation layer would be needed later to enable plugin-to-plugin interactions via the org.opensearch.sdk.Client.
OpenSearch Java SDK
- Package will be refactored as a gradle multi-module project
- New module opensearch-java-sdk-api will be introduced to contain the o.o.sdk.Client and associated Request/Response objects
- [Contingent on Investigation 1 finding reader/writer interface is needed] Add a token based reader/writer interface similar to the xcontent API in core. All Request/Response objects would support interfacing with these token streams.
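To give a feel for what a token based reader/writer interface could mean, here is a small sketch. It is not the xcontent API; `Token`, `TokenWriter`, and the fields on `GetIndexSettingRequest` are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class TokenStreamSketch {
    enum Kind { START_OBJECT, FIELD_NAME, VALUE, END_OBJECT }

    static final class Token {
        final Kind kind;
        final String text; // null for structural tokens
        Token(Kind kind, String text) { this.kind = kind; this.text = text; }
    }

    // Hypothetical writer that records tokens; a real one could emit JSON, CBOR, etc.
    static final class TokenWriter {
        private final List<Token> tokens = new ArrayList<>();
        TokenWriter startObject() { tokens.add(new Token(Kind.START_OBJECT, null)); return this; }
        TokenWriter field(String name, String value) {
            tokens.add(new Token(Kind.FIELD_NAME, name));
            tokens.add(new Token(Kind.VALUE, value));
            return this;
        }
        TokenWriter endObject() { tokens.add(new Token(Kind.END_OBJECT, null)); return this; }
        List<Token> tokens() { return List.copyOf(tokens); }
    }

    // A request object that can write itself to, and be rebuilt from, a token stream.
    static final class GetIndexSettingRequest {
        final String index;
        final String setting;
        GetIndexSettingRequest(String index, String setting) { this.index = index; this.setting = setting; }

        void writeTo(TokenWriter writer) {
            writer.startObject().field("index", index).field("setting", setting).endObject();
        }

        static GetIndexSettingRequest readFrom(Iterator<Token> in) {
            expect(in.next(), Kind.START_OBJECT);
            String index = null;
            String setting = null;
            Token token = in.next();
            while (token.kind == Kind.FIELD_NAME) {
                String value = expect(in.next(), Kind.VALUE).text;
                if ("index".equals(token.text)) index = value;
                else if ("setting".equals(token.text)) setting = value;
                token = in.next();
            }
            expect(token, Kind.END_OBJECT);
            return new GetIndexSettingRequest(index, setting);
        }
    }

    private static Token expect(Token token, Kind kind) {
        if (token.kind != kind) throw new IllegalStateException("expected " + kind + " but got " + token.kind);
        return token;
    }
}
```

With every request and response readable from and writable to such a stream, a generic transformation layer could convert between any two representations without pairwise converters.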
SQL Plugin
- SQL plugin will access the o.o.sdk.Client dependency exposed transitively through core.
- [Contingent on Investigation 1 finding reader/writer interface is not needed] Refactor all read-only paths to use the sdk.Client and Pojo's entirely
- Cloud native compatible SQL integ tests will be renamed from *IT.java to *CloudNativeIT.java
- A new remote cluster test gradle task will be added to run all *CloudNativeIT.java tests
Core
- Core :server will consume the org.opensearch.sdk:opensearch-java-sdk-api as an api dependency
- A new parameter and overloaded method will be added to Plugin.java to receive a concrete org.opensearch.sdk.Client implementation
- Core :server will maintain a private implementation of org.opensearch.sdk.Client
- Updates to the test framework will be made so that cloud native providers can inject infrastructure provisioning steps during integ tests
- Updates to the test framework to allow core to inject custom RestClient implementations
Core :server will consume the org.opensearch.sdk:opensearch-java-sdk-api as an api dependency
@fddattal can you elaborate why :server module would need this dependency?
@shwetathareja These Pojos are new altogether and would live outside of core in the SDK package.
How are you going to generate these Pojo? Using OpenAPI specification? What would be the performance overhead of translating across objects?
Thanks for your comments @shwetathareja !
can you elaborate why :server module would need this dependency?
Core :server needs to provide an implementation of the o.o.sdk.Client interface to plugins. An instance of this class would need to be injected into plugins during their initialization in Node.java.
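The injection described here can be pictured with the following sketch. It does not reproduce `Plugin.java` or `Node.java`; the `Plugin` base class, the `onClientAvailable` hook, `startNode`, and the setting key are all hypothetical and only show the direction of the dependency.

```java
import java.util.List;
import java.util.concurrent.CompletionStage;

final class ClientInjectionSketch {
    // Reduced stand-in for the SDK client interface.
    interface Client {
        CompletionStage<String> getSystemSetting(String key);
    }

    // Stand-in for the plugin base class. The hook has a no-op default so that
    // plugins which do not use the new client keep compiling unchanged.
    abstract static class Plugin {
        void onClientAvailable(Client client) {}
    }

    // A plugin that opts in keeps a reference to whatever implementation it is given.
    static final class SqlPlugin extends Plugin {
        private Client client;

        @Override
        void onClientAvailable(Client client) { this.client = client; }

        String readSetting(String key) {
            return client.getSystemSetting(key).toCompletableFuture().join();
        }
    }

    // Stand-in for node startup: core owns the implementation and hands it to each plugin.
    static void startNode(List<? extends Plugin> plugins, Client coreClient) {
        plugins.forEach(plugin -> plugin.onClientAvailable(coreClient));
    }
}
```

The plugin never names the implementing class, which is what would let core keep its implementation private.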
How are you going to generate these Pojo? Using OpenAPI specification?
For now, I plan to write the Java classes by hand and use Lombok to generate the boilerplate. We certainly could use a spec to generate the API in the future as more target runtimes are supported.
What would be the performance overhead of translating across objects?
I've benchmarked (on my laptop) translating a large search request with 10k boolean clauses and a large search response with 10k hits and found the request took roughly 0.04 ms and the response took roughly 0.11 ms. If performance overhead due to object translation becomes a problem we could introduce interfaces in the API which map 1:1 with their corresponding core objects and have the core objects implement those interfaces. For such implementations, the translation would be a no-op.
Benchmark                                                 Mode  Cnt       Score       Error  Units
SDKTranslatorBenchmark.testTranslateLargeSearchRequest    avgt    5   38929.944 ±   456.177  ns/op
SDKTranslatorBenchmark.testTranslateLargeSearchResponse   avgt    5  114267.749 ± 10530.752  ns/op
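The no-op translation idea mentioned above can be sketched as follows. `SearchHitView` and `CoreSearchHit` are hypothetical names; the point is that when the core object itself implements the API-side interface, "translating" is just returning the same reference.

```java
final class NoOpTranslationSketch {
    // Hypothetical API-side view that maps 1:1 onto a core object.
    interface SearchHitView {
        String id();
        float score();
    }

    // Stand-in for a core object. It implements the API view directly and may
    // carry extra internal details that the view does not expose.
    static final class CoreSearchHit implements SearchHitView {
        private final String id;
        private final float score;
        private final int internalShardId;

        CoreSearchHit(String id, float score, int internalShardId) {
            this.id = id;
            this.score = score;
            this.internalShardId = internalShardId;
        }

        @Override public String id() { return id; }
        @Override public float score() { return score; }
        int internalShardId() { return internalShardId; }
    }

    // No copying and no allocation: the plugin only sees the narrower type.
    static SearchHitView toApi(CoreSearchHit hit) {
        return hit;
    }
}
```

The trade-off is that the API interfaces then have to track the shape of core objects closely, which is part of what the separate POJOs were meant to avoid.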
Sorry to chime in late. @fddattal, since the RFC is about running SQL independently of core, and one of the options proposed was opensearch-sdk-java: two years ago, when we started working on the SDK, SQL was the first plugin I integrated with the SDK to run independently. The commit history of my fork may be of help.
@owaiskazi19 whoa I totally forgot about that one! full circle
SQL plugin will access the o.o.sdk.Client dependency exposed transitively through core.
@fddattal I'm assuming "core" here means the :server module. Correct me if I'm wrong. But isn't the whole point here to break the dependency that the plugin has on the :server module?
Thanks for your comment @andrross !
Correct me if I'm wrong. But isn't the whole point here to break the dependency that the plugin has on the :server module?
Yes it's true that "core" means the :server in this context.
I'd state our goal slightly differently - Refactor the read-only paths in SQL plugin which access core to one unifying cloud native interface. In doing so we would remove deep core access channels such as direct cluster state access and unnecessary core details being exposed to the plugin.
The steps taken here will help us to fully remove the core dependency in the future. However, the work will not be sufficient to do so because both non-read-only paths and accesses from core to plugin will not be refactored.
I'd state our goal slightly differently - Refactor the read-only paths in SQL plugin which access core to one unifying cloud native interface. In doing so we would remove deep core access channels such as direct cluster state access and unnecessary core details being exposed to the plugin.
@fddattal We currently have the org.opensearch.plugins java namespace in the :server module (with the main entry point being org.opensearch.plugins.Plugin), which on the surface looks like a nice clean interface for extending the functionality of OpenSearch. However, the problem, as has been stated many times, is that many very low level details of the server are exposed through this interface resulting in very tight coupling. With opensearch-java-sdk-api and o.o.sdk.Client will you essentially be implementing a "version 2" of this Plugin interface that doesn't expose any :server dependencies? Are there additional complexities here that I'm missing?
Thanks @fddattal for the RFC.
I've worked on extensions[1], which essentially decoupled the compute from OpenSearch but still relied on internal implementations of OpenSearch :server.
As I see it, this proposal decouples plugins from the internal implementations of core while still letting them access the information they need.
I love the proposal, essentially see it as PluginsV2 (I agree with @andrross).
Here are my thoughts and proposal:
Tenets (in addition to the ones listed by @fddattal)
- Interface abstracts inner implementations (including Pojo's) of core (:server)
- The interface exposed to clients does the heavy lifting for sync and async implementations.
- Core (:server) would have a default implementation in min distribution.
I'd like this interface to de-couple plugins in the following verticals, and deliver in phases:
- Decouple In Memory Store (a.k.a cluster state): As we move the cluster management to Cloud Native[2], plugins should be agnostic of local vs remote implementations of cluster state, which this RFC already covers and your usecase for SQL.
- Decouple plugin storage: Plugins rely on OpenSearch to store metadata in indices for persistence. This configuration doesn’t have to be in opensearch index. @dbwiddis's RFC[3] proposes an abstraction with alternative cloud native storage.
- Decouple plugin configuration: Plugins rely on OpenSearch settings[4] to expose knobs for users to configure features. Plugins should be agnostic of local vs remote implementations of Cluster settings.
- Decouple plugin security and tenancy: Plugins use OpenSearch security plugin and implement custom authorization mechanisms[5]. I'd like an interface which offloads AuthZ to a central authority. Also tangentially I am seeing use cases of plugins trying to support multi-tenancy[6], which could be solved.
I would want to deliver these verticals in phases and incrementally. Eventually, once we have adoption, we can explore deprecating the existing access paths to the core :server.
I would like to have the interface defined in OpenSearch repo as a library and default implementation in :server.
Additionally plugins will take a dependency on the library. I do not mind calling it org.opensearch.plugins.Pluginv2 or org.opensearch.sdk.
A few things I'd like to push down the road:
@dbwiddis' feedback is that we should generate the core api interface using the API spec. This helps to avoid manually maintaining the java code for interfacing with core.
- I would not prioritize API generation until the interface is widely adopted and we are seeing pains of manually maintaining it. With the experience of Plugin interface[6], it really hasn't changed a lot for a long time.
- Extensions are another way of extending features of OpenSearch but for phased and incremental approach I'd like the interface to be flexible enough to support extensions in future.
I would also like to get feedback from @reta @AmiStrn @samuel-oci @rursprung @abseth-amzn who worked in the area of plugins.
[1] https://github.com/opensearch-project/OpenSearch/issues/2447
[2] https://github.com/opensearch-project/OpenSearch/issues/13197
[3] https://github.com/opensearch-project/OpenSearch/issues/13336
[4] https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/common/settings/Settings.java
[5] https://github.com/opensearch-project/security/issues/1895
[6] https://github.com/opensearch-project/ml-commons/issues/2358
With opensearch-java-sdk-api and o.o.sdk.Client will you essentially be implementing a "version 2" of this Plugin interface that doesn't expose any :server dependencies? Are there additional complexities here that I'm missing?
I think that's it. We put it in a different repo (a facade) to reduce blast radius when API/implementation/internals change in core. Instead of having to figure out how to fix N plugins we fix the SDK, which itself then depends on a stable version of core rather than on the next moving version of core.
@saratvemulapalli I think we're aligned and I really like how you've broken much of this work down. A couple points to follow up on though:
I would like to have the interface defined in OpenSearch repo as a library and default implementation in :server
I agree with this. Probably something like a new library in libs/pluginv2 or libs/plugin-api (or whatever we want to name it). I think this creates the groundwork for integration with JPMS where we can explicitly declare what gets publicly exported by these libraries.
Interface abstracts inner implementations (including Pojo's) of core (:server)
Just to clarify, as we build on the vision and work started in #5910 we should be moving many parts of the code in :server into a respective library (like libs/common and libs/core) and that would be fair game for the new plugin API to take a dependency (provided the library is exporting the class as a public API). This would avoid duplicating and translating POJOs for things like the Java object representation of the query DSL. Is this in line with your thinking as well?
@andrross @saratvemulapalli @dblock I think the plugin model was pushed to its limits, and dealing with those limits was the primary reason behind introducing extensions as a replacement (and the path forward). I think revamping / extending the plugin APIs (org.opensearch.plugins.Pluginv2) is a step in the reverse direction that would make the extensibility even more difficult to evolve and maintain.
Regarding in-proc / out-proc extensions, I think we could stick to the out-proc model and explore UNIX domain sockets [1], which are a very efficient way to communicate between processes on the same host. That should largely eliminate the additional latency concerns.
[1] https://openjdk.org/jeps/380
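For reference, a minimal round trip over a UNIX domain socket with the JDK API added by JEP 380 looks roughly like this. It is a sketch, not extension transport code: it assumes Java 16 or later and an operating system with UNIX domain socket support, uses an arbitrary temporary socket path, and runs both ends in one process purely for demonstration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.StandardProtocolFamily;
import java.net.UnixDomainSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class UnixSocketSketch {
    // Sends one small message (under 1 KiB) to an echo peer and returns the reply.
    static String roundTrip(String message) {
        try {
            Path dir = Files.createTempDirectory("uds");
            Path socketPath = dir.resolve("ext.sock");
            UnixDomainSocketAddress address = UnixDomainSocketAddress.of(socketPath);
            try (ServerSocketChannel server = ServerSocketChannel.open(StandardProtocolFamily.UNIX)) {
                server.bind(address);
                // Echo peer: read until the client half-closes, then write everything back.
                Thread echo = new Thread(() -> {
                    try (SocketChannel peer = server.accept()) {
                        ByteBuffer buffer = ByteBuffer.allocate(1024);
                        while (buffer.hasRemaining() && peer.read(buffer) != -1) { }
                        buffer.flip();
                        while (buffer.hasRemaining()) peer.write(buffer);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                });
                echo.start();
                try (SocketChannel client = SocketChannel.open(address)) {
                    ByteBuffer out = ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8));
                    while (out.hasRemaining()) client.write(out);
                    client.shutdownOutput(); // signals end of request to the peer
                    ByteBuffer reply = ByteBuffer.allocate(1024);
                    while (reply.hasRemaining() && client.read(reply) != -1) { }
                    reply.flip();
                    echo.join();
                    return StandardCharsets.UTF_8.decode(reply).toString();
                }
            } finally {
                Files.deleteIfExists(socketPath);
                Files.deleteIfExists(dir);
            }
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A real protocol would need message framing and a long-lived connection; the sketch only shows that the channel API is the same one used for TCP, with a different address type.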
I agree with this. Probably something like a new library in libs/pluginv2 or libs/plugin-api (or whatever we want to name it). I think this creates the groundwork for integration with JPMS where we can explicitly declare what gets publicly exported by these libraries.
I personally dislike the option of putting this into this repo a lot because of how difficult it is to iterate in core, and because the dependency would not be against a stable, previously released version of OpenSearch, but against all the same internals.
I think revamping / extending the plugin APIs (org.opensearch.plugins.Pluginv2) is a step in the reverse direction that would make the extensibility even more difficult to evolve and maintain.
Extensions = out-of-proc plugins with a well defined SDK
Plugins v2 = in-proc plugins with a well defined SDK
We're basically saying let's do the well defined SDK part and worry about in-proc vs. out-of-proc later as just a runtime feature. Seems like a good shortcut to change the dependency from N plugins -> core to N plugins -> 1 sdk -> core without rewriting everything, no?
Seems like a good shortcut to change the dependency from N plugins -> core to N plugins -> 1 sdk -> core without rewriting everything, no?
-1 to be fair, with the means we have at our disposal, I think we should:
- not do Plugins v2 (I understand the API boundary / cleanup but this is why we have extensions now)
- not focus on in-proc extensions (we need very compelling reasons to dismiss IPC mechanisms JVM/Java supports to justify work on that)
These are just my opinions, thanks @dblock !
@reta First, as always, thank you. I really appreciate, value, and respect your strong opinions.
One of the things @fddattal says above is "I want to run the SQL plugin in a cloud native environment, but I am unable to do so without forking the code.". AFAIK he's trying to remote/swap some parts of OpenSearch, such as access to metadata. Practically speaking the SQL plugin is tightly coupled with core, so if you wanted to cleanly do that you would have to fork. The alternative you propose is to rewrite SQL on top of extensions, but that also means figuring out how to host and run extensions, implementing security, and so much more. It's a pretty tall order, so adding plugin interfaces to the SDK that look similar to extension interfaces seems like at least a reasonable step forward, no?
It's a pretty tall order, so adding plugin interfaces to the SDK that look similar to extension interfaces seems like at least a reasonable step forward, no?
Thanks @dblock , enhancing existing plugin APIs is fine (it is not the same as Plugins v2), we do this and it makes sense.
The alternative you propose is to rewrite SQL on top of extensions, but that also means figuring out how to host and run extensions, implementing security, and so much more
Changing the extensibility model (for getting the benefits) requires efforts - it has to be done or extensions will never take off (I believe).
SQL "plugin" ideally should be an extension of the query processing in the core. It happens to be implemented as a plugin, and moving forward we should ideally not enforce an additional out-of-proc layer for SQL, as an example. Each plugin needs to be evaluated and categorized as a core, application (auxiliary) or UX (dashboards) plugin (feature). I don't think we should map each plugin to one extensible interface in a one-size-fits-all approach. Ideally we should have an extensibility model for each of these categories. Can we approach this more pragmatically and understand the needs of SQL / core plugins?