OpenSearch
Optimize remote state stale file deletion
Description
Adds an async task, `AsyncStaleFileDeletion`, which is initialized only on master-eligible nodes. Once the task is initialized in `RemoteClusterStateService.start()`, it schedules a cleanup after a configurable interval. The interval is exposed as a new dynamic setting, `cluster.remote_store.state.cleanup_interval`, with a default of 5 minutes.
Cleanup proceeds when more than 10 successful state updates have occurred since the last cleanup. After each cleanup attempt, the task is scheduled again after the configured interval.
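For readers who want a picture of the scheduling behaviour described above, here is a minimal, self-contained sketch. It is not the code in this PR: the class `StaleFileCleanupSketch`, its methods, and the `deleteStaleFiles` callback are hypothetical names, and it assumes the "more than 10 state updates" rule acts as a gate that each scheduled attempt checks before deleting anything.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch only; not the actual AsyncStaleFileDeletion implementation. */
final class StaleFileCleanupSketch {
    // Threshold taken from the description: more than 10 successful state updates.
    static final long MIN_STATE_UPDATES = 10;

    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private final AtomicLong successfulUpdates = new AtomicLong();
    private final long intervalMillis;       // stands in for the cleanup interval setting
    private final Runnable deleteStaleFiles; // placeholder for the real deletion logic
    private long updatesAtLastCleanup = 0;

    StaleFileCleanupSketch(long intervalMillis, Runnable deleteStaleFiles) {
        this.intervalMillis = intervalMillis;
        this.deleteStaleFiles = deleteStaleFiles;
    }

    /** Call after every successful remote state upload. */
    void onStateUploaded() {
        successfulUpdates.incrementAndGet();
    }

    /** One cleanup attempt; returns true only if deletion actually ran. */
    synchronized boolean runOnce() {
        long current = successfulUpdates.get();
        if (current - updatesAtLastCleanup <= MIN_STATE_UPDATES) {
            return false; // not enough new state updates yet, skip this attempt
        }
        deleteStaleFiles.run();
        updatesAtLastCleanup = current;
        return true;
    }

    /** Schedules the first attempt; each attempt reschedules the next one. */
    void start() {
        scheduleNext();
    }

    private void scheduleNext() {
        scheduler.schedule(() -> {
            try {
                runOnce();
            } finally {
                // Reschedule whether or not this attempt deleted anything.
                if (!scheduler.isShutdown()) {
                    scheduleNext();
                }
            }
        }, intervalMillis, TimeUnit.MILLISECONDS);
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

Rescheduling from inside the task, rather than using a fixed-rate schedule, mirrors the "after trying clean up once, we schedule the task again" wording and keeps a slow cleanup from overlapping with the next one.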
Related Issues
Resolves #12889
Resolves #12798 (that test case is removed in this PR)
Check List
- [x] New functionality includes testing.
- [x] All tests pass
- [x] New functionality has been documented.
- [x] New functionality has javadoc added
- [x] Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
- [x] Commits are signed per the DCO using --signoff
- [x] Commit changes are listed out in CHANGELOG.md file (See: Changelog)
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.
Compatibility status:
Checks if related components are compatible with change 3b7464c
Incompatible components
Skipped components
Compatible components: [https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/flow-framework.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/sql.git]
:x: Gradle check result for c320b67d8ab5c9dd571a82f0b48a8f6908ead3f3: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for 3b7464c571bebf91587c4c475ca350278775ba7b: FAILURE
Removing the "Storage:Remote" label, as this is purely cluster manager state. @shwetathareja @Bukhtawar should we create a separate label for remote cluster state, to avoid overlap between storage for data and storage for cluster state? Maybe "Storage:Remote" can be used for data and "Storage:RemoteState" for cluster state?
:x: Gradle check result for ad17589fd0072ff63cae8de0d00e46bab47e0113: FAILURE
:x: Gradle check result for d4f09e2c2c90ead111b56e66153f82bbfa3c6b04: FAILURE
:x: Gradle check result for 3e2601eb9edb6022e7d731a72ba7c8987b7a5db3: FAILURE
:x: Gradle check result for de616128e22784996df9d21e8147437cff5a2802: null
:x: Gradle check result for 3408bd70664aed36a53cbb422754b446b086a9ac: FAILURE
:x: Gradle check result for 00a36ab4168be3d7fe076676e5b8a22532f71680: FAILURE
:x: Gradle check result for 1febb6816e3a628e7826b7c502961b1dc6c7d4be: FAILURE
:x: Gradle check result for 8f5d7d7f1977b9e7685521175648195b014c2c38: FAILURE
:x: Gradle check result for 4e3c9e92856247fe56242aea4ad3b7b876e68727: FAILURE
:x: Gradle check result for 5ea5dbb565dee48ac276950afb7b5191e54bc8d1: FAILURE
[Storage Triage - attendees 1 2 3 4 5 6 7 8 9 10 11 12 13]
@shiv0408 Thanks for taking this up. Let's add a release target label to this PR.
Looks good. Minor comments
:x: Gradle check result for 04140ff03ca9d4764fc0421b3a3cd1e0811a8653: FAILURE
:x: Gradle check result for 1a6940f45adb59a69bbd11177a68b62631433709: FAILURE
The Gradle build is failing; please fix the build.
:x: Gradle check result for 2f2719e06b9853596c58a950a7db445193f1ed4b: FAILURE
org.opensearch.http.SearchRestCancellationIT.testAutomaticCancellationMultiSearchDuringQueryPhase
- New issue created for this flaky test #13674
org.opensearch.http.SearchRestCancellationIT.testAutomaticCancellationDuringFetchPhase
- Already identified as flaky in #5426
org.opensearch.remotemigration.RemoteReplicaRecoveryIT.testReplicaRecovery
- flaky #13473
:x: Gradle check result for 3ff82b335a88779e95357800f1309a56d1e9e1db: FAILURE
:x: Gradle check result for bb09f56d552cc000c660903812a9a6de8f6cc953: FAILURE
:grey_exclamation: Gradle check result for 12632fcd9d8f186ba06fe938abda6d25d9534199: UNSTABLE
- TEST FAILURES:
1 org.opensearch.repositories.azure.AzureBlobContainerRetriesTests.testReadRangeBlobWithRetries
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.
Codecov Report
Attention: Patch coverage is 77.83784%, with 41 lines in your changes missing coverage. Please review.
Project coverage is 71.60%. Comparing base (b15cb0c) to head (2f8d2e1). Report is 315 commits behind head on main.
Additional details and impacted files
| Coverage Diff | main | #13131 | +/- |
|---|---|---|---|
| Coverage | 71.42% | 71.60% | +0.18% |
| Complexity | 59978 | 61331 | +1353 |
| Files | 4985 | 5064 | +79 |
| Lines | 282275 | 288089 | +5814 |
| Branches | 40946 | 41715 | +769 |
| Hits | 201603 | 206288 | +4685 |
| Misses | 63999 | 64794 | +795 |
| Partials | 16673 | 17007 | +334 |
:x: Gradle check result for 38259512faef6e7d0a227d0b0b9326833bb3db8d: null
:x: Gradle check result for b604b090ad5d5298644099b650d0fe67f51ca4f5: FAILURE
:x: Gradle check result for 507941508007d983af16d16d6f2e55ec7389d4a2: FAILURE
:x: Gradle check result for 7bd079fb227ca661a297653ebe53cb282fdac74f: FAILURE