Add DecommissionService and helper to execute awareness attribute decommissioning

Open imRishN opened this issue 3 years ago • 37 comments

Signed-off-by: Rishab Nahata [email protected]

Description

As part of #3917 (decommissioning a zone), this PR adds a service that executes zone decommission. A separate PR will be created to hook the service up to the API.

Decommission flow:

[Figure: Zone Decommission Flow]

Issues Resolved

#4083

Check List

  • [x] New functionality includes testing.
    • [x] All tests pass
  • [x] New functionality has been documented.
    • [x] New functionality has javadoc added
  • [x] Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.

imRishN avatar Aug 02 '22 06:08 imRishN

Gradle Check (Jenkins) Run Completed with:

  • RESULT: ABORTED :x:
  • URL: https://build.ci.opensearch.org/job/gradle-check/1244/
  • CommitID: b592ff08831d1bf45f46aba62f7ffdba0cece9e1

github-actions[bot] avatar Aug 02 '22 08:08 github-actions[bot]

Gave the executor more thought and realised we can reuse the existing NodeRemovalClusterStateTaskExecutor: we send it a batch of removal tasks, one per node in the zone being decommissioned, and let it execute them. Have made those changes and implemented a service that takes care of the end-to-end lifecycle of decommissioning a zone.
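For readers following along, the batching idea can be sketched roughly as below. This is a minimal, self-contained illustration and not the code in this PR: `Node`, `RemovalTask`, and `tasksForAttribute` are hypothetical stand-ins, and the real executor and its task type in OpenSearch have their own signatures that are not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ZoneRemovalBatch {
    // Hypothetical stand-in for a cluster node; not OpenSearch's node class.
    static final class Node {
        final String id;
        final Map<String, String> attributes;

        Node(String id, Map<String, String> attributes) {
            this.id = id;
            this.attributes = attributes;
        }
    }

    // Hypothetical stand-in for one node-removal task handed to the executor.
    static final class RemovalTask {
        final Node node;
        final String reason;

        RemovalTask(Node node, String reason) {
            this.node = node;
            this.reason = reason;
        }
    }

    // Collects one removal task per node whose awareness attribute matches,
    // so the whole zone can be submitted to the executor as a single batch.
    static List<RemovalTask> tasksForAttribute(List<Node> nodes, String name, String value) {
        List<RemovalTask> tasks = new ArrayList<>();
        for (Node node : nodes) {
            if (value.equals(node.attributes.get(name))) {
                tasks.add(new RemovalTask(node, "decommissioned [" + name + "=" + value + "]"));
            }
        }
        return tasks;
    }

    public static void main(String[] args) {
        List<Node> nodes = List.of(
            new Node("n1", Map.of("zone", "zone-a")),
            new Node("n2", Map.of("zone", "zone-b")),
            new Node("n3", Map.of("zone", "zone-a"))
        );
        for (RemovalTask task : tasksForAttribute(nodes, "zone", "zone-a")) {
            System.out.println(task.node.id + " -> " + task.reason);
        }
    }
}
```

Submitting the tasks as one batch means the node removals for the zone can be computed against a single cluster state rather than one update per node, which is the benefit of reusing the existing executor.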

imRishN avatar Aug 18 '22 05:08 imRishN

Gradle Check (Jenkins) Run Completed with:

  • RESULT: FAILURE :x:
  • URL: https://build.ci.opensearch.org/job/gradle-check/1871/
  • CommitID: e55e2acf592e3c710211e2c1e7258fbf5e44882c

github-actions[bot] avatar Aug 18 '22 06:08 github-actions[bot]

Gradle Check (Jenkins) Run Completed with:

  • RESULT: FAILURE :x:
  • URL: https://build.ci.opensearch.org/job/gradle-check/2057/
  • CommitID: be1ed65224cd8240e01c20211f8b764d0aa5ff5b

github-actions[bot] avatar Aug 24 '22 10:08 github-actions[bot]

Gradle Check (Jenkins) Run Completed with:

  • RESULT: FAILURE :x:
  • URL: https://build.ci.opensearch.org/job/gradle-check/2058/
  • CommitID: e51a90d522d07f0a45390632195c6308c5bf6860

github-actions[bot] avatar Aug 24 '22 10:08 github-actions[bot]

Gradle Check (Jenkins) Run Completed with:

  • RESULT: FAILURE :x:
  • URL: https://build.ci.opensearch.org/job/gradle-check/2059/
  • CommitID: 3d717e06798005a0e7247d6485941a677b695cb1

github-actions[bot] avatar Aug 24 '22 11:08 github-actions[bot]

Gradle Check (Jenkins) Run Completed with:

  • RESULT: FAILURE :x:
  • URL: https://build.ci.opensearch.org/job/gradle-check/2067/
  • CommitID: d0b00cb5383a116b2af7b3a8a67c98a04b9fd88c

github-actions[bot] avatar Aug 24 '22 15:08 github-actions[bot]

Gradle Check (Jenkins) Run Completed with:

  • RESULT: FAILURE :x:
  • URL: https://build.ci.opensearch.org/job/gradle-check/2082/
  • CommitID: 1e608692035190fec93c0a5ca711a789bb4620df

github-actions[bot] avatar Aug 24 '22 19:08 github-actions[bot]

Gradle Check (Jenkins) Run Completed with:

  • RESULT: null :x:
  • URL: https://build.ci.opensearch.org/job/gradle-check/2108/
  • CommitID: cb2d3a586220411096a2d04fe60c01c2654d7865

github-actions[bot] avatar Aug 25 '22 13:08 github-actions[bot]

Gradle Check (Jenkins) Run Completed with:

  • RESULT: FAILURE :x:
  • URL: https://build.ci.opensearch.org/job/gradle-check/2194/
  • CommitID: 9f1b35543152b68dd91e5bd87bb0eeeadef225db

github-actions[bot] avatar Aug 29 '22 16:08 github-actions[bot]

Gradle Check (Jenkins) Run Completed with:

  • RESULT: FAILURE :x:
  • URL: https://build.ci.opensearch.org/job/gradle-check/2233/
  • CommitID: a5c24a3ea1f105730ce30d00d2c6c3dbd0891a6a

github-actions[bot] avatar Aug 30 '22 07:08 github-actions[bot]

Gradle Check (Jenkins) Run Completed with:

  • RESULT: FAILURE :x:
  • URL: https://build.ci.opensearch.org/job/gradle-check/2235/
  • CommitID: 34800c57528d7d0f9a768838b112b780ca9db4bf

github-actions[bot] avatar Aug 30 '22 09:08 github-actions[bot]

Gradle Check (Jenkins) Run Completed with:

  • RESULT: FAILURE :x:
  • URL: https://build.ci.opensearch.org/job/gradle-check/2238/
  • CommitID: 3f038e29b28112df53d0e9d668b3d0ab5e7f2b75

github-actions[bot] avatar Aug 30 '22 11:08 github-actions[bot]

Gradle Check (Jenkins) Run Completed with:

  • RESULT: FAILURE :x:
  • URL: https://build.ci.opensearch.org/job/gradle-check/2237/
  • CommitID: 508911cca8ad8a3d4e10d3cd9f33aea492101cef

github-actions[bot] avatar Aug 30 '22 11:08 github-actions[bot]

Can you paste the flow chart for the decommissioning action (with all success and failure steps), including what happens if the master switches midway?

Added the flow chart to the PR

imRishN avatar Aug 30 '22 14:08 imRishN

Gradle Check (Jenkins) Run Completed with:

  • RESULT: SUCCESS :white_check_mark:
  • URL: https://build.ci.opensearch.org/job/gradle-check/2253/
  • CommitID: 6f0d8a6f36818d0ff85ebb3e8bc1a80c512efa41

github-actions[bot] avatar Aug 30 '22 15:08 github-actions[bot]

Gradle Check (Jenkins) Run Completed with:

  • RESULT: SUCCESS :white_check_mark:
  • URL: https://build.ci.opensearch.org/job/gradle-check/2421/
  • CommitID: 6714021d19fad02350559e250fe334305219ae3a

github-actions[bot] avatar Sep 01 '22 12:09 github-actions[bot]

Codecov Report

Merging #4084 (c2b6e38) into main (7ebb2af) will increase coverage by 0.01%. The diff coverage is 51.05%.

@@             Coverage Diff              @@
##               main    #4084      +/-   ##
============================================
+ Coverage     70.68%   70.70%   +0.01%     
- Complexity    57379    57503     +124     
============================================
  Files          4628     4635       +7     
  Lines        276073   276473     +400     
  Branches      40421    40465      +44     
============================================
+ Hits         195146   195469     +323     
- Misses        64562    64625      +63     
- Partials      16365    16379      +14     

| Impacted Files | Coverage Δ |
| --- | --- |
| ...org/opensearch/common/logging/LogConfigurator.java | 24.63% <0.00%> (+6.35%) :arrow_up: |
| ...arch/cluster/decommission/DecommissionService.java | 22.45% <22.45%> (ø) |
| ...r/decommission/DecommissioningFailedException.java | 33.33% <33.33%> (ø) |
| ...ster/decommission/NodeDecommissionedException.java | 50.00% <50.00%> (ø) |
| ...er/decommission/DecommissionAttributeMetadata.java | 64.55% <64.55%> (ø) |
| ...java/org/opensearch/cluster/metadata/Metadata.java | 87.29% <75.00%> (+0.57%) :arrow_up: |
| ...earch/cluster/decommission/DecommissionStatus.java | 80.00% <80.00%> (ø) |
| ...h/cluster/decommission/DecommissionController.java | 80.24% <80.24%> (ø) |
| ...ch/cluster/decommission/DecommissionAttribute.java | 90.00% <90.00%> (ø) |
| ...nsearch/cluster/coordination/JoinTaskExecutor.java | 77.24% <92.30%> (+4.52%) :arrow_up: |
| ... and 501 more | |

codecov-commenter avatar Sep 01 '22 12:09 codecov-commenter

Gradle Check (Jenkins) Run Completed with:

  • RESULT: SUCCESS :white_check_mark:
  • URL: https://build.ci.opensearch.org/job/gradle-check/2422/
  • CommitID: cbc7d5254edc1c729bc87aad515876d318c021a1

github-actions[bot] avatar Sep 01 '22 12:09 github-actions[bot]

Thanks @imRishN

In the diagram, the onFailure path of registering metadata with status DECOMMISSION_INIT does not handle the votingConfigExclusion reset. As I understand it, if we return a failure response to the customer and do not retry after a failure, we should restore the cluster metadata to the state it was in before decommissioning was initiated.
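To make the suggestion concrete, here is a rough sketch of the rollback I have in mind. It is only an illustration under stated assumptions: `ClusterMeta`, `Step`, and `startDecommission` are hypothetical names, the metadata is modelled as a plain in-memory object, and the ordering (exclusions added before the metadata is registered) is my reading of the diagram rather than something confirmed in the code.

```java
import java.util.HashSet;
import java.util.Set;

public class DecommissionRollback {
    // Hypothetical, simplified view of the metadata touched by decommissioning.
    static final class ClusterMeta {
        final Set<String> votingConfigExclusions = new HashSet<>();
        String decommissionStatus = null; // null: no decommission registered
    }

    // Hypothetical step that may fail, e.g. registering the decommission metadata.
    interface Step {
        void run(ClusterMeta meta) throws Exception;
    }

    // Adds the exclusions, runs the registration step, and restores the
    // pre-decommission state if that step fails. Returns true on success.
    static boolean startDecommission(ClusterMeta meta, Set<String> nodesToExclude, Step registerMetadata) {
        // Snapshot the state so that a failure can be undone completely.
        Set<String> previousExclusions = new HashSet<>(meta.votingConfigExclusions);
        String previousStatus = meta.decommissionStatus;

        meta.votingConfigExclusions.addAll(nodesToExclude);
        try {
            registerMetadata.run(meta);
            return true;
        } catch (Exception e) {
            // Failure path: reset exclusions and status to what they were before.
            meta.votingConfigExclusions.clear();
            meta.votingConfigExclusions.addAll(previousExclusions);
            meta.decommissionStatus = previousStatus;
            return false;
        }
    }

    public static void main(String[] args) {
        ClusterMeta meta = new ClusterMeta();
        boolean ok = startDecommission(meta, Set.of("n1", "n2"), m -> {
            throw new IllegalStateException("simulated registration failure");
        });
        System.out.println("succeeded=" + ok + ", exclusions=" + meta.votingConfigExclusions);
    }
}
```

The point is only that the failure handler undoes every change made since the request started, so a caller who receives a failure response sees the cluster exactly as it was.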

psychbot avatar Sep 05 '22 13:09 psychbot

Gradle Check (Jenkins) Run Completed with:

  • RESULT: FAILURE :x:
  • URL: https://build.ci.opensearch.org/job/gradle-check/2692/
  • CommitID: 4afc2a8f04fe48b6ed04dc88478d429b38fcb5df

github-actions[bot] avatar Sep 06 '22 11:09 github-actions[bot]

Gradle Check (Jenkins) Run Completed with:

  • RESULT: FAILURE :x:
  • URL: https://build.ci.opensearch.org/job/gradle-check/2691/
  • CommitID: ee5c1731a0c3f64ca8b3527344ddae19ebec19a7

github-actions[bot] avatar Sep 06 '22 12:09 github-actions[bot]

Gradle Check (Jenkins) Run Completed with:

  • RESULT: SUCCESS :white_check_mark:
  • URL: https://build.ci.opensearch.org/job/gradle-check/2694/
  • CommitID: f5c94836229d3d86795726c0e471dc4ea87936f2

github-actions[bot] avatar Sep 06 '22 12:09 github-actions[bot]

Gradle Check (Jenkins) Run Completed with:

  • RESULT: FAILURE :x:
  • URL: https://build.ci.opensearch.org/job/gradle-check/2698/
  • CommitID: 762f4dfec8b7ac2ca5311d9cfa9f343c3d048e3c

github-actions[bot] avatar Sep 06 '22 13:09 github-actions[bot]

Gradle Check (Jenkins) Run Completed with:

  • RESULT: FAILURE :x:
  • URL: https://build.ci.opensearch.org/job/gradle-check/2858/
  • CommitID: 2cc7308d5e544def55b25ea141b35e70a58a209e

github-actions[bot] avatar Sep 07 '22 20:09 github-actions[bot]

Gradle Check (Jenkins) Run Completed with:

  • RESULT: FAILURE :x:
  • URL: https://build.ci.opensearch.org/job/gradle-check/2885/
  • CommitID: 96f04a4966185e6e57abcd2df53dc068a3395d01

github-actions[bot] avatar Sep 08 '22 15:09 github-actions[bot]

Gradle Check (Jenkins) Run Completed with:

  • RESULT: FAILURE :x:
  • URL: https://build.ci.opensearch.org/job/gradle-check/2909/
  • CommitID: 39867e30db21f67bf6359977438365b0f1562204

github-actions[bot] avatar Sep 09 '22 10:09 github-actions[bot]

Gradle Check (Jenkins) Run Completed with:

  • RESULT: FAILURE :x:
  • URL: https://build.ci.opensearch.org/job/gradle-check/2910/
  • CommitID: 054cb5a0757043d37eb3ffc3e18218e9ae74420c

github-actions[bot] avatar Sep 09 '22 10:09 github-actions[bot]

Gradle Check (Jenkins) Run Completed with:

  • RESULT: SUCCESS :white_check_mark:
  • URL: https://build.ci.opensearch.org/job/gradle-check/3005/
  • CommitID: a136c1719ebb3bd082c62e49c5637cdd15375dba

github-actions[bot] avatar Sep 13 '22 15:09 github-actions[bot]

Gradle Check (Jenkins) Run Completed with:

  • RESULT: SUCCESS :white_check_mark:
  • URL: https://build.ci.opensearch.org/job/gradle-check/3006/
  • CommitID: a268fd378c51ca5d96f22b5bb4ca49867df41b62

github-actions[bot] avatar Sep 13 '22 15:09 github-actions[bot]