Fix assertion failure while closing remoteStore

Open sachinpkale opened this issue 2 years ago • 9 comments

Description

  • When an instance of Store is created, a shard lock is acquired; it is released when that Store instance is closed.
  • Currently, we create two instances of Store for remote store backed indices: store and remoteStore.
  • Because only one shard lock can be acquired for a given shard, the lock is shared between store and remoteStore.
  • This causes a problem when deleting an index, because deletion closes both store and remoteStore. The first close releases the shared lock, so when the second Store instance is closed we get an assertion error saying the shard is not locked.
  • Ideally, we should be closing the remoteStore, but until we work on CompositeStore (https://github.com/opensearch-project/OpenSearch/issues/3719), we mitigate the test failures by closing the remoteDirectory instead.
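
The failure mode described above can be sketched with a small, self-contained model. This is not the OpenSearch implementation: `SketchShardLock`, `SketchDirectory`, `SketchStore`, and `SharedShardLockSketch` are hypothetical stand-ins, and the check in `close()` only approximates the real assertion. It assumes a store's `close()` verifies that the shard lock is still held before releasing it, which is what the description implies.

```java
import java.io.Closeable;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a per-shard lock; not the real OpenSearch class.
final class SketchShardLock implements Closeable {
    private final AtomicBoolean held = new AtomicBoolean(true);

    boolean isHeld() {
        return held.get();
    }

    @Override
    public void close() {
        held.set(false); // releasing the lock
    }
}

// Hypothetical stand-in for the directory wrapped by a store.
final class SketchDirectory implements Closeable {
    private boolean open = true;

    boolean isOpen() {
        return open;
    }

    @Override
    public void close() {
        open = false;
    }
}

// Hypothetical stand-in for Store: closing it closes the directory and releases the lock.
final class SketchStore implements Closeable {
    private final SketchDirectory directory;
    private final SketchShardLock shardLock;

    SketchStore(SketchDirectory directory, SketchShardLock shardLock) {
        this.directory = directory;
        this.shardLock = shardLock;
    }

    SketchDirectory directory() {
        return directory;
    }

    @Override
    public void close() {
        // Assumed check: the store expects to still own the shard lock when it closes.
        if (!shardLock.isHeld()) {
            throw new AssertionError("shard is not locked");
        }
        directory.close();
        shardLock.close();
    }
}

final class SharedShardLockSketch {
    // Closes both stores that share one lock; returns true if the second close trips the check.
    static boolean closeBothStores() {
        SketchShardLock lock = new SketchShardLock();
        SketchStore store = new SketchStore(new SketchDirectory(), lock);
        SketchStore remoteStore = new SketchStore(new SketchDirectory(), lock);
        store.close(); // releases the shared lock
        try {
            remoteStore.close(); // lock already released
            return false;
        } catch (AssertionError e) {
            return true;
        }
    }

    // Mitigation from the description: close only the remote directory, then the local store.
    static boolean closeRemoteDirectoryOnly() {
        SketchShardLock lock = new SketchShardLock();
        SketchStore store = new SketchStore(new SketchDirectory(), lock);
        SketchStore remoteStore = new SketchStore(new SketchDirectory(), lock);
        remoteStore.directory().close(); // does not touch the shard lock
        store.close();                   // the single release of the shared lock
        return !remoteStore.directory().isOpen() && !lock.isHeld();
    }

    public static void main(String[] args) {
        System.out.println("double close trips the check: " + closeBothStores());
        System.out.println("mitigation closes cleanly: " + closeRemoteDirectoryOnly());
    }
}
```

In this model the lock is released exactly once, by whichever store closes first, which is why closing the remote directory directly avoids the second release.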

Check List

  • [ ] New functionality includes testing.
    • [ ] All tests pass
  • [ ] New functionality has been documented.
    • [ ] New functionality has javadoc added
  • [x] Commits are signed per the DCO using --signoff
  • [ ] Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • [ ] Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.

sachinpkale avatar Oct 16 '23 03:10 sachinpkale

Compatibility status:

Checks if related components are compatible with change d4c2011

Incompatible components

Incompatible components: [https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/neural-search.git]

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/notifications.git]

github-actions[bot] avatar Oct 16 '23 04:10 github-actions[bot]

Gradle Check (Jenkins) Run Completed with:

  • RESULT: FAILURE :x:
  • URL: https://build.ci.opensearch.org/job/gradle-check/27907/
  • CommitID: d4c20113234c54cfd177f2d122cfe84c9a85906e

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Oct 16 '23 04:10 github-actions[bot]

Gradle Check (Jenkins) Run Completed with:

  • RESULT: SUCCESS :white_check_mark:
  • URL: https://build.ci.opensearch.org/job/gradle-check/27914/
  • CommitID: d4c20113234c54cfd177f2d122cfe84c9a85906e

github-actions[bot] avatar Oct 16 '23 06:10 github-actions[bot]

This PR is stalled because it has been open for 30 days with no activity.

This still needs a test to get merged.

dblock avatar Nov 26 '23 20:11 dblock

This PR is stalled because it has been open for 30 days with no activity.

Hi @sachinpkale, the PR is stalled. Is this still being worked on?

ticheng-aws avatar Jan 07 '24 17:01 ticheng-aws

This PR is stalled because it has been open for 30 days with no activity.

@sachinpkale Any update on adding tests around this change?

sohami avatar Feb 14 '24 20:02 sohami

Closing this for now; will revisit later.

sachinpkale avatar Feb 26 '24 10:02 sachinpkale