OpenSearch
OpenSearch copied to clipboard
Added new baseclasses in opensearch for ccr classloader issue
Description
(Apologies in advance, for a long description ahead ;) )
Background:
-
This change is related to a new feature being added in ism - unfollow-action #726 - integrating ccr and ism plugins.
-
It requires ism plugin to invoke stop-replication action from ccr. As both ccr and ism need to invoke some common code, the common code could be moved to common-utils, as this has been done previously for other plugins too. For ex. notifications plugin.
-
However, sharing of libraries also leads to a type-cast and class-loading issue - previously seen with notification plugin. Some highlights from that PR's description -> OpenSearch loads each plugin by different class loader separately. When two plugins need to communicate by transport action, itβs common to put request and response class into a common module shared by both. However, as per the forum post, this may cause problem because OpenSearch does "optimization to avoid serialization" for local request between plugins on same JVM.
-
As suggested by @bowenlan-amzn, this means as followed for notification and other plugins, the request object needs to be of a higher-level class from opensearch-core like ActionRequest and later be recreated into required StopIndexReplicationRequest type. To achieve this, TransportStopIndexReplicationAction's doExecute method needs to be overridden, to accept a request of different type (a class from opensearch-core) instead of StopIndexReplicationRequest.
Issue:
- The method doExecute in TransportStopIndexReplicationAction is not overridable with change in type of request parameter.
- Reason: TransportStopIndexReplicationAction extends class TransportClusterManagerNodeAction from opensearch-core - which mandates the request to be of a class that satisfies this condition
extends ClusterManagerNodeRequest<Request>
No class in opensearch-core meets this condition. So we cannot override the method with any existing class from the opensearch-core.
Change description: Because the generics constraints are very strict for ClusterManagerNodeRequest and TransportClusterManagerNodeAction, I propose to introduce two new classes in Opensearch, similar to ClusterManagerNodeRequest and TransportClusterManagerNodeAction, but primarily change only in the generics. This can be used in future, by other plugins too, that use TransportClusterManagerNodeAction and need to use a different request type.
Proposed solution:
- Opensearch: Add new baseclasses in Opensearch-core - with a more flexible generic type.
- Common-utils: Move common code of stop-replication from ccr to common-utils
- CCR: Modify ccr plugin to consume stop-replication libs from common-utils. Also, modify TransportStopIndexReplicationAction to inherit new baseclasses built in opensearch-core in point 1.
- ISM: Add new action in ism
This is an initial PR catering to point 1. I've not yet worked on the test coverage part as I would request reviewers to give feedback on this approach, so we can proceed accordingly.
Related Issues
It is a pre-req required for https://github.com/opensearch-project/index-management/issues/726.
Check List
- [ ] New functionality includes testing.
- [ ] All tests pass
- [ ] New functionality has been documented.
- [ ] New functionality has javadoc added
- [ ] Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
- [x] Commits are signed per the DCO using --signoff
- [x] Commit changes are listed out in CHANGELOG.md file (See: Changelog)
- ~[ ] Public documentation issue/PR created~
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.
:x: Gradle check result for 4aeb5fab7978f8bf3652d842ee758d01ba15206b: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Hi @dblock and @bowenlan-amzn, This is a follow-up to our discussion on chat. Request you to please have a look at this and share feedback.
:x: Gradle check result for 71852ddacd3e359bc6d9541986ad96f2e36117df: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
First time saw such error
* What went wrong:
Execution failed for task ':server:filepermissions'.
> Found invalid file permissions:
Source file is executable: /home/runner/work/OpenSearch/OpenSearch/server/src/main/java/org/opensearch/action/support/clustermanager/ClusterManagerNodeAckRequest.java
Source file is executable: /home/runner/work/OpenSearch/OpenSearch/server/src/main/java/org/opensearch/action/support/clustermanager/TransportClusterManagerNodeExtendedAction.java
You want to change the file permission of these 2 files and push up
β clustermanager git:(ccr-classloader) ls -alh
total 168
drwxr-xr-x@ 13 bowenlan staff 416B May 9 15:51 .
drwxr-xr-x@ 34 bowenlan staff 1.1K May 7 15:30 ..
-rwxr-xr-x@ 1 bowenlan staff 2.8K May 9 15:51 ClusterManagerNodeAckRequest.java <--
-rw-r--r--@ 1 bowenlan staff 3.4K Jan 3 16:05 ClusterManagerNodeOperationRequestBuilder.java
-rw-r--r--@ 1 bowenlan staff 2.2K Jan 3 16:05 ClusterManagerNodeReadOperationRequestBuilder.java
-rw-r--r--@ 1 bowenlan staff 2.4K Jan 3 16:05 ClusterManagerNodeReadRequest.java
-rw-r--r-- 1 bowenlan staff 4.4K May 7 15:30 ClusterManagerNodeRequest.java
-rw-r--r-- 1 bowenlan staff 22K May 7 15:30 TransportClusterManagerNodeAction.java
-rwxr-xr-x@ 1 bowenlan staff 21K May 9 15:51 TransportClusterManagerNodeExtendedAction.java <--
-rw-r--r-- 1 bowenlan staff 4.1K May 7 15:30 TransportClusterManagerNodeReadAction.java
drwxr-xr-x@ 6 bowenlan staff 192B Jan 3 16:05 info
-rw-r--r--@ 1 bowenlan staff 307B May 19 2023 package-info.java
drwxr-xr-x 7 bowenlan staff 224B May 7 15:30 term
:x: Gradle check result for 48738a6127d71ed476e08477cab048b9da20449f: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:grey_exclamation: Gradle check result for e35b20b2faa6e3cba9ada6ebd846546749f30104: UNSTABLE
- TEST FAILURES:
1 org.opensearch.test.rest.ClientYamlTestSuiteIT.test {p0=search.aggregation/20_terms/string profiler via global ordinals}
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.
Codecov Report
Attention: Patch coverage is 0%
with 162 lines
in your changes missing coverage. Please review.
Project coverage is 71.55%. Comparing base (
b15cb0c
) to head (e35b20b
). Report is 411 commits behind head on main.
:exclamation: Current head e35b20b differs from pull request most recent head 1bf2e04
Please upload reports for the commit 1bf2e04 to get more accurate results.
Files | Patch % | Lines |
---|---|---|
...ger/TransportClusterManagerNodeExtendedAction.java | 0.00% | 142 Missing :warning: |
...t/clustermanager/ClusterManagerNodeAckRequest.java | 0.00% | 20 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## main #13615 +/- ##
============================================
+ Coverage 71.42% 71.55% +0.13%
- Complexity 59978 61145 +1167
============================================
Files 4985 5054 +69
Lines 282275 287301 +5026
Branches 40946 41619 +673
============================================
+ Hits 201603 205582 +3979
- Misses 63999 64748 +749
- Partials 16673 16971 +298
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
:x: Gradle check result for 1bf2e04eb71a81b252c65b92faddad8633f62570: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
This PR is stalled because it has been open for 30 days with no activity.
This PR is stalled because it has been open for 30 days with no activity.