cross-cluster-replication icon indicating copy to clipboard operation
cross-cluster-replication copied to clipboard

Adding provision to invoke stop replication from other plugins

Open aggarwalShivani opened this issue 1 year ago • 13 comments

Description

This is related to the feature unfollow-action #726 and is the 2nd PR in continuation of the proposed solution mentioned in common-utils-PR. Detailed information is explained there.

Change description:

  1. Removing the classes StopIndexReplicationAction and StopIndexReplicationRequest from the ccr project and import StopIndexReplicationRequest and ReplicationActions from common-utils instead.
  2. Creating a new transport action, TransportUnfollowIndexReplicationAction - of type HandledTransportAction. This reads the request of type ActionRequest, transforms it to type StopIndexReplicationRequest, and invokes the TransportStopIndexReplicationAction.execute() on it.
  3. Adding a new ActionHandler to invoke TransportUnfollowIndexReplicationAction (for inter-plugin communication) , in addition to the one for TransportStopIndexReplicationAction (for REST API).

Issues Resolved

Related Issues unfollow-action #726

Check List

  • [ ] New functionality includes testing.
    • [x] All tests pass
  • [ ] New functionality has been documented.
    • [ ] New functionality has javadoc added
  • [x] Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.

aggarwalShivani avatar Jun 05 '24 05:06 aggarwalShivani

Also tagging @saikaranam-amazon for review

ankitkala avatar Jul 08 '24 15:07 ankitkala

Here's the background of making these changes to enable calling action from another plugin https://github.com/opensearch-project/notifications/pull/223

bowenlan-amzn avatar Jul 08 '24 16:07 bowenlan-amzn

I've some concerns around moving the classes out of cross cluster replication package. Ideally either of these 2 options should solve the problem: Option 1: Move the ISM action into ccr plugin. From what i understand, ISM actions can be implemented as an plugin. CCR can take a dependency on ISM and implement the stop action in this repo. Option 2: ISM plugin can take a dependency on CCR and reuse the classes. (We do same for security plugin.)

ankitkala avatar Jul 15 '24 14:07 ankitkala

@Ankit Kala commented on Jul 15, 2024, 7:06 AM PDT:

I've some concerns around moving the classes out of cross cluster replication package.
Ideally either of these 2 options should solve the problem:
Option 1: Move the ISM action into ccr plugin. From what i understand, ISM actions can be implemented as an plugin. CCR can take a dependency on ISM and implement the stop action in this repo.
Option 2: ISM plugin can take a dependency on CCR and reuse the classes. (We do same for security plugin.)

We didn't try Option 1 and it looks promising, and we may not need a proxy transport class then since ISM base doesn't have to know anything about CCR.

@aggarwalShivani some quick notes for trying it out ISM has this interface, https://github.com/opensearch-project/index-management/blob/dbd2bc25f5e338819237af5a6b35c6c1e35d7171/spi/src/main/kotlin/org.opensearch.indexmanagement.spi/IndexManagementExtension.kt#L16. You can implement that in ReplicationPlugin, and provide the getISMActionParsers In build.gradle, take a compileOnly dependency on opensearch-index-management-spi. Probably need extendedPlugins = ['opensearch-index-management'] in opensearchplugin {} block And a resources/META-INF/services/org.opensearch.indexmanagement.spi.IndexManagementExtension file that have org.opensearch.replication.ReplicationPlugin in it for the SPI mechanism

bowenlan-amzn avatar Jul 15 '24 15:07 bowenlan-amzn

@aggarwalShivani

Have a PR to publish ISM spi so it can be consumed from central repo https://github.com/opensearch-project/index-management/pull/1207

And here's the commit of extend ISM in CCR https://github.com/opensearch-project/cross-cluster-replication/commit/3853901d77da71e772a71ea623ed1e721ded5d37

~~I tested using your script and can see everything works as expected~~ it actually cannot work out of box, seeing lots of jar hell issue.


@ankitkala I tried the method of using ISM spi, but seems still have the same problem

If we extend ReplicationPlugin with ISM spi, that means CCR can only be installed after ISM installed (a hard dependency introduced), and there are a lot of jar hells, probably not what we want here.

If we create a new plugin in CCR, like CCR-ISM-extension plugin, it probably will also suffer the same problem that one plugin cannot use the TransportAction from another plugin directly. As plugin has their own classloaders, same class become different after being loaded by different classloaders.

So I think it's better to stick with the current approach now. For your concern

I've some concerns around moving the classes out of cross cluster replication package.

It seems to be a 2 way door, moving classes around probably won't introduce any issues between versions. But let me know any concerns.

bowenlan-amzn avatar Jul 18 '24 16:07 bowenlan-amzn

Hi @ankitkala , Just in case you didn't get a notification for latest reponse by @bowenlan-amzn (as I didn't too)....

We see issues with Option 1 ->

If we extend ReplicationPlugin with ISM spi, that means CCR can only be installed after ISM installed (a hard dependency introduced), and there are a lot of jar hells, probably not what we want here. ... If we create a new plugin in CCR, like CCR-ISM-extension plugin, it probably will also suffer the same problem that one plugin cannot use the TransportAction from another plugin directly. As plugin has their own classloaders, same class become different after being loaded by different classloaders.

Even with option 2, it would reverse the problem on ism. And also, if we reuse classes from one plugin in another, we would see class-loading issue i.e. plugins have their own classloaders, same class become different after being loaded by different classloaders. (explained in https://github.com/opensearch-project/notifications/pull/223 as well)

For your concern

I've some concerns around moving the classes out of cross cluster replication package.

To add in addition to Bowen's comment, in our proposed approach, we've ensured that the current REST API users of CCR would not get impacted functionality-wise by any change. ' And other plugins like notifications, alerting also have some of their classes in common-utils.. Pls let us know your thoughts on this.

aggarwalShivani avatar Jul 19 '24 14:07 aggarwalShivani

@ankitkala I tried the method of using ISM spi, but seems still have the same problem

If we extend ReplicationPlugin with ISM spi, that means CCR can only be installed after ISM installed (a hard dependency introduced), and there are a lot of jar hells, probably not what we want here.

I didn't get this part. All the SPI classes are getting loaded in Opensearch process already, we should just be having it as compile time dependency right?

If we create a new plugin in CCR, like CCR-ISM-extension plugin, it probably will also suffer the same problem that one plugin cannot use the TransportAction from another plugin directly. As plugin has their own classloaders, same class become different after being loaded by different classloaders.

I went through the issue. I still have questions around why we'd want to load same class into JVM twice through different classloaders.

So I think it's better to stick with the current approach now. For your concern

I've some concerns around moving the classes out of cross cluster replication package.

It seems to be a 2 way door, moving classes around probably won't introduce any issues between versions. But let me know any concerns.

I definitely understand the issues with the integration and if we really need to move classes out to common utils, we'll do that. But before we do that, i just want to ensure that we've exhausted all options here :-)

ankitkala avatar Jul 24 '24 15:07 ankitkala

Hi @ankitkala , Thanks for your feedback..

I went through the issue. I still have questions around why we'd want to load same class into JVM twice through different classloaders.

I'll try to answer this from my understanding and @bowenlan-amzn could add-on/correct me as needed :)

In the notifications example as well as in the case of this PR, the flow is - ism plugin needs to invoke actions from other plugins, for ex. SendNotificationAction or TransportStopIndexReplicationAction. These are transport actions in their respective plugins, which accept request of fixed type like StopIndexReplicationRequest.

To invoke these transport action, ism would need to first construct request object of the required type (ex. StopIndexReplicationRequest), invoke the action from the other plugin and finally expect the response of type returned by that action i.e. the request class object is created in ism, but it has to be used in ccr code.

OpenSearch loads each plugin by different class loader separately, which means the same class StopIndexReplicationRequest would have to be recreated/reloaded into ccr's classloader - which causes issues. Same case with response classes too.

aggarwalShivani avatar Jul 29 '24 17:07 aggarwalShivani

have to be recreated/reloaded into ccr's classloader

Hey @aggarwalShivani, what i don't understand is why we need to load the ccr classes at runtime. Compile time dependency makes sense, but at runtime, these classes are already loaded into the opensearch jvm(even if its through a different classloader).

ankitkala avatar Jul 30 '24 03:07 ankitkala

@Ankit Kala commented on Jul 29, 2024, 8:23 PM PDT:

have to be recreated/reloaded into ccr's classloader

Hey @aggarwalShivani, what i don't understand is why we need to load the ccr classes at runtime. Compile time dependency makes sense, but at runtime, these classes are already loaded into the opensearch jvm(even if its through a different classloader).

Valid question. What you are suggesting is to take the stop replication request from a compile dependency of ISM. And we should have that request class at runtime if CCR plugin is also installed in OpenSearch.

@ankitkala can your team help our contributor here to enable this workflow you are thinking? I'm not sure if CCR is publishing any jar for now.

Note, based on how class loaded in OpenSearch, we will need the CCR jar load from opensearch core I guess, otherwise it probably not recognizable from other plugin as it's loaded by CCR plugin classloader?

bowenlan-amzn avatar Aug 05 '24 00:08 bowenlan-amzn

@Ankit Kala commented on Jul 24, 2024, 8:45 AM PDT:

@ankitkala I tried the method of using ISM spi, but seems still have the same problem

If we extend ReplicationPlugin with ISM spi, that means CCR can only be installed after ISM installed (a hard dependency introduced), and there are a lot of jar hells, probably not what we want here.

I didn't get this part. All the SPI classes are getting loaded in Opensearch process already, we should just be having it as compile time dependency right?

If we define CCR as extended plugin of ISM (refer here), when we install the plugin, it hit this check of jarhell https://github.com/opensearch-project/OpenSearch/blob/05fb780a3aec3bfb46c0cb76c65e340b11d42def/server/src/main/java/org/opensearch/plugins/PluginsService.java#L679

It requires that the extension plugin CCR to not have any duplicate dependencies as its extended plugin ISM.

bowenlan-amzn avatar Aug 05 '24 04:08 bowenlan-amzn

@ankitkala can your team help our contributor here to enable this workflow you are thinking? I'm not sure if CCR is publishing any jar for now.

Note, based on how class loaded in OpenSearch, we will need the CCR jar load from opensearch core I guess, otherwise it probably not recognizable from other plugin as it's loaded by CCR plugin classloader?

Hi @ankitkala, Pls let us know if you have any feedback/views on this 🙂

aggarwalShivani avatar Aug 14 '24 12:08 aggarwalShivani

@ankitkala can your team help our contributor here to enable this workflow you are thinking? I'm not sure if CCR is publishing any jar for now. Note, based on how class loaded in OpenSearch, we will need the CCR jar load from opensearch core I guess, otherwise it probably not recognizable from other plugin as it's loaded by CCR plugin classloader?

Hi @ankitkala, Pls let us know if you have any feedback/views on this 🙂

Hi, unfortunately we don't have bandwidth to pick this up. In spirit of moving forward, let's go ahead with the proposed changes.

ankitkala avatar Aug 19 '24 04:08 ankitkala

Hi @ankitkala , @bowenlan-amzn Apologies for the delayed update here. I was assigned different work responsibilities for a few months and couldn't proceed with this feature 😟

I'm back now and trying to revive this change request 🤞 There have been multiple changes in the three projects in this while, and I will have to validate my changes on the latest.

One issue I'm facing is -> Common-utils and ISM projects have now updated to the base-line jdk to 21 from 11. But CCR project is still on jdk11 (open issue). After importing common-utils (built on jdk21) in ccr project (jdk 11), I'm getting this compile-time errors. Cannot inline bytecode built with JVM bytecode version 65 into bytecode that is being built with JVM target 11. Please specify proper '-jvm-target' option

This basically happens when there is a mismatch of jdk in project and its dependency. Pls suggest on how we can solve this. Also, can you help with when jdk21 update is planned on ccr?

aggarwalShivani avatar Dec 22 '24 10:12 aggarwalShivani

Hi @ankitkala Any suggestions here?

One issue I'm facing is -> Common-utils and ISM projects have now updated to the base-line jdk to 21 from 11. But CCR project is still on jdk11 (open issue). After importing common-utils (built on jdk21) in ccr project (jdk 11), I'm getting this compile-time errors. Cannot inline bytecode built with JVM bytecode version 65 into bytecode that is being built with JVM target 11. Please specify proper '-jvm-target' option

This basically happens when there is a mismatch of jdk in project and its dependency (ccr and common-utils here). Pls suggest on how we can solve this. Also, can you help with when jdk21 update is planned on ccr?

We cannot downgrade jdk in common-utils and ccr is not yet on jdk21.

aggarwalShivani avatar Jan 10 '25 13:01 aggarwalShivani

Hi @ankitkala Any suggestions here?

One issue I'm facing is -> Common-utils and ISM projects have now updated to the base-line jdk to 21 from 11. But CCR project is still on jdk11 (open issue). After importing common-utils (built on jdk21) in ccr project (jdk 11), I'm getting this compile-time errors. Cannot inline bytecode built with JVM bytecode version 65 into bytecode that is being built with JVM target 11. Please specify proper '-jvm-target' option This basically happens when there is a mismatch of jdk in project and its dependency (ccr and common-utils here). Pls suggest on how we can solve this. Also, can you help with when jdk21 update is planned on ccr?

We cannot downgrade jdk in common-utils and ccr is not yet on jdk21.

Feel free to raise the PR for bumping the JDK version on CCR.

ankitkala avatar Jan 14 '25 08:01 ankitkala

Feel free to raise the PR for bumping the JDK version on CCR.

Thanks @ankitkala. I've raised a separate PR for that. Request your feedback there.

aggarwalShivani avatar Jan 17 '25 05:01 aggarwalShivani

Hi all Status update: Changes for this are ready to be pushed. However, we have a blocking issue recently. Details below:

  • The common-utils project has moved to using opensearch-3.0.0-alpha1-SNAPSHOT. One breaking change here is that classes like AcknowledgedRequest, AcknowledgedResponse etc are moved from org.opensearch.action.support.master to org.opensearch.action.support.clustermanager.
  • ccr and ism projects are still using opensearch-3.0.0-SNAPSHOT and classes from old org.opensearch.action.support.master package. This inconsistency results in incompatibility issues while using these classes.
  • As discussed over slack with @bowenlan-amzn and @ankitkala, it is suggested to raise PRs on 2.x branches first, until the development issue with the main branches is resolved.
  • For this change, I've raised a backport PR on 2.x branch (#1502).

aggarwalShivani avatar Feb 11 '25 07:02 aggarwalShivani

Hi @ankitkala, @bowenlan-amzn

  • ccr and ism projects are still using opensearch-3.0.0-SNAPSHOT and classes from old org.opensearch.action.support.master package. This inconsistency results in incompatibility issues while using these classes.

As the change for opensearch-3.0.0-alpha1-SNAPSHOT is now merged, ccr now consumes latest common-utils jar. While ism continues to remain blocked, the dev issue for ccr is resolved.

The changes for the stop-replication feature for both common-utlis and ccr are ready and can be reviewed. Backport PR is also ready for review #1502

aggarwalShivani avatar Feb 20 '25 10:02 aggarwalShivani

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.x 2.x
# Navigate to the new working tree
cd .worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-1391-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 0e5b6eb06a7da87d0091294979a807e914a745fc
# Push it to GitHub
git push --set-upstream origin backport/backport-1391-to-2.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-1391-to-2.x.

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

This issue can be ignored, manual backport to 2.x is also merged at https://github.com/opensearch-project/cross-cluster-replication/pull/1502

aggarwalShivani avatar Mar 05 '25 09:03 aggarwalShivani