[Experimental] Add a feature flag to start without joining a cluster
Description
This reworks my proof-of-concept for a "clusterless" OpenSearch to limit the extent of the changes to core. Everything else is implemented in a plugin.
Essentially, if the flag is set, we avoid creating DiscoveryModule or anything that requires it, including GatewayService. We still create ClusterService, but do not initialize a ClusterManagerService. There are a few actions that rely on an injected Discovery instance, so those also need to be removed when the flag is set.
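To make the gating concrete, here is a minimal sketch of the idea using stand-in names rather than the real OpenSearch types. The flag name `clusterless.enabled`, the `NodeWiring` class, and the use of plain strings for components are all assumptions made for illustration. The actual wiring in this PR lives in the node bootstrap code and may look quite different.

```java
import java.util.ArrayList;
import java.util.List;

public class NodeWiring {
    // Hypothetical flag name, used only for this illustration.
    static final String CLUSTERLESS_FLAG = "clusterless.enabled";

    /** Returns the names of the components a node would create for the given flag value. */
    static List<String> componentsFor(boolean clusterless) {
        List<String> components = new ArrayList<>();
        // ClusterService is created whether or not the flag is set.
        components.add("ClusterService");
        if (!clusterless) {
            // Everything below depends on discovery, so it is skipped when the flag is set.
            components.add("ClusterManagerService");
            components.add("DiscoveryModule");
            components.add("GatewayService");
            // Placeholder label for the actions that need an injected Discovery instance.
            components.add("DiscoveryDependentActions");
        }
        return components;
    }

    public static void main(String[] args) {
        boolean clusterless = Boolean.parseBoolean(System.getProperty(CLUSTERLESS_FLAG, "false"));
        System.out.println(componentsFor(clusterless));
    }
}
```

Running it with `-Dclusterless.enabled=true` lists only the components that survive when the flag is set.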
Related Issues
Related to https://github.com/opensearch-project/OpenSearch/issues/17957
Check List
- [ ] Functionality includes testing.
- [ ] API changes companion pull request created, if applicable.
- [ ] Public documentation issue/PR created, if applicable.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.
:x: Gradle check result for 83ea4830495dc5234ecdfc61e686963248a6444c: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for df6e71e75f8e0286295f5554e207ac2625a0954d: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:grey_exclamation: Gradle check result for 5b3218f14683c77fa1cd24de4e2e2486e6865df5: UNSTABLE
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.
Codecov Report
Attention: Patch coverage is 18.51852% with 66 lines in your changes missing coverage. Please review.
Project coverage is 72.69%. Comparing base (8f69dcf) to head (1cbb72b). Report is 3 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #18479 +/- ##
============================================
- Coverage 72.79% 72.69% -0.10%
+ Complexity 68525 68460 -65
============================================
Files 5574 5566 -8
Lines 314807 314505 -302
Branches 45675 45633 -42
============================================
- Hits 229178 228644 -534
- Misses 67046 67335 +289
+ Partials 18583 18526 -57
:x: Gradle check result for bd91b9b9ced713bebd58013aa820084e048960b0: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for 805ce7471ff5537fb175200a61b1530852b2d3ae: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for 831630e78b9c2d2a9f02292fc32148a804e6911a: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:white_check_mark: Gradle check result for b92597672dc4f52532d2ba8f3fa94498b300afab: SUCCESS
Now that Gradle check is passing, tagging a few people for feedback on the approach: @andrross, @shwetathareja, @mch2
Thanks!
:x: Gradle check result for 0e2952de695a2b0f9c5664279c15e3d0c998edec: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:grey_exclamation: Gradle check result for dfed82c41944f50a28c6b1dcde1d2b62505a0a8c: UNSTABLE
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.
:x: Gradle check result for 1561f2df0fe78d84e233e50cd1212a598325c66d: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Ouch... my latest clean-up removes the casts from ClusterModule, but adds ShardStateAction to the public API.
I can go either way in terms of resolving that. Either add the PublicApi annotation to ShardStateAction or bring back the explicit cast in ClusterModule.
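For readers weighing the two options, here is a rough sketch of the trade-off with hypothetical stand-in types. Nothing here is the real `ClusterModule` or `ShardStateAction` code, and the class and method names are invented for illustration. Option A names the concrete type in a signature, which pulls that type into the public surface. Option B keeps the signature on a broader type and casts internally.

```java
public class ApiSurfaceSketch {
    /** Stand-in for a broad type that is already part of the public API. */
    interface Action {
        String name();
    }

    /** Stand-in for a concrete type that is not yet marked as public API. */
    static final class ShardAction implements Action {
        public String name() {
            return "shard-state";
        }

        String internalDetail() {
            return "internal";
        }
    }

    /** Option A: the signature names the concrete type, so no cast is needed. */
    static final class ModuleExposingConcreteType {
        private final ShardAction action;

        ModuleExposingConcreteType(ShardAction action) {
            this.action = action;
        }

        String describe() {
            return action.internalDetail();
        }
    }

    /** Option B: the signature names only the broad type and casts internally. */
    static final class ModuleWithCast {
        private final Action action;

        ModuleWithCast(Action action) {
            this.action = action;
        }

        String describe() {
            if (action instanceof ShardAction) {
                return ((ShardAction) action).internalDetail();
            }
            throw new IllegalStateException("unexpected action type: " + action.name());
        }
    }

    public static void main(String[] args) {
        ShardAction action = new ShardAction();
        System.out.println(new ModuleExposingConcreteType(action).describe());
        System.out.println(new ModuleWithCast(action).describe());
    }
}
```

Both options behave the same at runtime. The difference is only in which types callers can see in the signatures, and therefore which types need to carry a public API commitment.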
:white_check_mark: Gradle check result for b6bd47b3348ceb539eec36ad291d45fa06038bcc: SUCCESS
Thanks, @rajiv-kv! I've made a couple of changes in response to your comments. If you get a chance, please take a look.
I disagree on exposing the cluster manager operations through LocalClusterService. At least for the time being, I specifically want to set up data nodes and coordinators that are incapable of cluster manager operations. Limiting things like this is not a one-way door, though. If we later decide that we want to allow cluster manager operations, we can always add them. I definitely don't want to add them now, though, because it means I won't be able to take them away later.
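As a rough illustration of that restriction, here is a sketch with invented names. The `StateService` interface and `LocalOnlyService` class are hypothetical stand-ins, not the real `LocalClusterService`. The service accepts local state updates and rejects anything that would require a cluster manager.

```java
import java.util.concurrent.atomic.AtomicReference;

public class LocalOnlyClusterServiceSketch {
    /** Hypothetical, simplified service surface; not the real ClusterService API. */
    interface StateService {
        String state();

        void applyLocalState(String newState);

        void submitClusterManagerTask(String task);
    }

    /** Serves and applies state locally, and refuses cluster manager operations. */
    static final class LocalOnlyService implements StateService {
        private final AtomicReference<String> state = new AtomicReference<>("initial");

        public String state() {
            return state.get();
        }

        public void applyLocalState(String newState) {
            state.set(newState);
        }

        public void submitClusterManagerTask(String task) {
            // Rejecting here keeps the door open: support can be added later
            // without breaking callers, whereas removing it later would not be safe.
            throw new UnsupportedOperationException("cluster manager operations are not supported: " + task);
        }
    }

    public static void main(String[] args) {
        StateService service = new LocalOnlyService();
        service.applyLocalState("updated");
        System.out.println(service.state());
        try {
            service.submitClusterManagerTask("create-index");
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```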
:x: Gradle check result for e44af703ba6e999a6f0140b8f0faf01efd9fb5c5: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:white_check_mark: Gradle check result for 4a83a765e2021b8b49aca5b3959388109a1c7eb4: SUCCESS
:x: Gradle check result for fb125df94d400a8457f718b65ccd9d8fea01502f: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
This looks good to me, once you fix up the compiler errors in the latest commit :)
Pinging @shwetathareja and @rajiv-kv again for follow up reviews as well.
:white_check_mark: Gradle check result for a94fe5f95ed2e61a5c6c1497070dba4bb49a57c9: SUCCESS
> This looks good to me, once you fix up the compiler errors in the latest commit :)
> Pinging @shwetathareja and @rajiv-kv again for follow up reviews as well.
Done! Checks are passing again :+1:
:x: Gradle check result for 6533ca7e26a9e5d1b8432987dbb2421599bef24d: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for 6533ca7e26a9e5d1b8432987dbb2421599bef24d: null
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?