Gaurav Bafna
> I'm not convinced we should move forward with this pull request, it looks like a small bandaid on widely impacting scenario, please help me better understand why this change...
> @gbbafna Thanks for your thoughts. In the issue @harishbhakuni mentioned an approach to execute the deletes in batches rather than individual tasks. This approach sounds easier to deliver in...
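
To make the batching idea mentioned above concrete, here is a minimal sketch in plain Java. It is not the code from the referenced fix: the class `BatchedDeleteSketch`, the methods `partition` and `deleteInBatches`, and the batch size are all hypothetical names chosen for illustration. It only shows the general shape of grouping stale paths into a few bulk delete calls instead of scheduling one task per path.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration only; none of these names come from OpenSearch.
public class BatchedDeleteSketch {

    // Splits items into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    // Hands each batch to the supplied deleter (assumed to perform one bulk
    // delete per call) and returns how many delete calls were issued.
    static int deleteInBatches(List<String> paths, int batchSize, Consumer<List<String>> batchDeleter) {
        int calls = 0;
        for (List<String> batch : partition(paths, batchSize)) {
            batchDeleter.accept(batch);
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        List<String> paths = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            paths.add("stale-file-" + i);
        }
        // A stand-in deleter that only logs; a real one would call the repository.
        int calls = deleteInBatches(paths, 4, batch -> System.out.println("deleting " + batch.size() + " files"));
        System.out.println("delete calls: " + calls);
    }
}
```

With ten paths and a batch size of four, the sketch issues three delete calls rather than ten individual tasks, which is the trade-off the batching approach is after.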
Closing this issue as we are relying on @harishbhakuni's fix https://github.com/opensearch-project/OpenSearch/pull/12319. Will revisit later if we see any issues.
We need to rethink the `PublicApi` annotation. Let's talk about the `IndexShard` constructor, which, although `public`, is effectively private to OpenSearch and is not meant...
> It would help collect those for release notes and generally call attention to more eyes that can review a breaking change, including marking existing classes such as IndexShard as @InternalApi....
> > @dblock, we can't mark them as @InternalApi as of now, as `gradle check` itself will fail. If we allow that on maintainer's judgement and not enforce...
"that was causing an issue where if shard md upload to snapshot repository fails, it will not release the lock file from S3." - How is that happening ? https://github.com/opensearch-project/OpenSearch/blob/829215c4a4660ee65026c40a0a01ebe100b3ee2c/server/src/main/java/org/opensearch/snapshots/SnapshotShardsService.java#L438-L462...
Ran this for 1k iterations and didn't get a failure. Will continue running it until a failure is observed.
While running locally, I am getting a failure in 500 runs:
```
[2025-03-03T19:09:56,894][ERROR][o.o.i.t.t.TranslogTransferManager] [node_t0] [target][1] Exception occurred while cleaning translog at path=[W01011001000101][kJ9uOs4eSG2EhNaSFiwaVA][1][translog][data]
java.io.IOException: access denied: /local/home/gbbafna/git/OpenSearch/server/build/testrun/internalClusterTest/temp/org.opensearch.action.admin.indices.create.RemoteSplitIndexIT_CC16184ED99D441C-001/tempDir-002/repos/tbnAiwbhZD/W01011001000101/kJ9uOs4eSG2EhNaSFiwaVA/1/translog/data/2/translog-45.tlog
	at org.apache.lucene.tests.mockfile.WindowsFS.checkDeleteAccess(WindowsFS.java:117) ~[lucene-test-framework-10.1.0.jar:10.1.0 884954006de769dc43b811267230d625886e6515 - 2024-12-17 16:15:44]
...
```
Closed by https://github.com/opensearch-project/OpenSearch/pull/18329