OpenSearch icon indicating copy to clipboard operation
OpenSearch copied to clipboard

Enabling Support for Conditional Multi-Part Upload

Open x-INFiN1TY-x opened this issue 8 months ago • 16 comments

Enabling Support for Conditional Multi-Part Upload (#18093)

At a Glance:
Implement both synchronous and asynchronous conditional-upload methods in S3BlobContainer, backed by logical functions in AsyncTransferManager

Method Location Purpose/Functionality
executeMultipartUploadConditionally S3BlobContainer Conditional multipart upload (sync)
asyncBlobUploadConditionally S3BlobContainer Conditional streamed upload (async)
uploadObjectConditionally AsyncTransferManager Core async conditional upload implementation
Helpers in AsyncTransferManager   uploadInOneChunkConditionally
Helpers in AsyncTransferManager   uploadInPartsConditionally
Helpers in AsyncTransferManager   doUploadInPartsConditionally
Helpers in AsyncTransferManager   handleConditionalException

Summary of Changes

  • Add executeMultipartUploadConditionally(...) to S3BlobContainer

  • Add asyncBlobUploadConditionally(...) to S3BlobContainer

  • Implement uploadObjectConditionally(...) in AsyncTransferManager

  • Introduce helper routines: uploadInOneChunkConditionally, uploadInPartsConditionally, doUploadInPartsConditionally, and handleConditionalException

  • Define ConditionalWriteOptions to express “If-Match” and “If-None-Match” conditions

  • Define ConditionalWriteResponse to wrap returned ETag or propagate failures

  • Extend tests to cover conditional success, precondition failures, data integrity, and size boundaries


Core Workflow

  1. Validate part size (≥ 5 MB) and total upload size (≤ 5 TB).

  2. Initialize upload by creating a multipart request with bucket, key, ACL, storage class, metadata, and SSE (AES256 or KMS).

  3. Transfer Data

    • Sync: iterate parts sequentially using AWS SDK, buffering if retries are enabled.

    • Async: delegate to AsyncTransferManager.uploadObjectConditionally, which chooses single-chunk or multipart flow via its helper methods.

  4. Complete Conditionally by setting either If-Match or If-None-Match: * on the complete-upload request based on ConditionalWriteOptions.

  5. Respond with ConditionalWriteResponse.success(etag) on success or translate failures:

    • HTTP 412OpenSearchException("stale_primary_shard") + abort multipart.

    • Other errorsIOException (sync) or failed CompletableFuture/listener (async) + abort if uploadId exists.

    • Abort failures are logged but do not mask the original exception.


Test Coverage

executeMultipartUploadConditionally Test Suite

  • testExecuteMultipartUploadConditionallyWithEtagMatchSuccess

  • testExecuteMultipartUploadConditionallyWithMetadataAndSSE

  • testExecuteMultipartUploadConditionallyContentIntegrity

  • testExecuteMultipartUploadConditionallySizeValidation

  • testExecuteMultipartUploadConditionallyPreconditionFailed

AsyncTransferManager Conditional Upload Tests

  • testConditionalOneChunkUpload

  • testConditionalOneChunkUploadPreconditionFailed

  • testConditionalOneChunkUploadCorruption

  • testConditionalMultipartUploadPreconditionFailed

  • testConditionalMultipartUploadCorruption

Container-Level and Retry Tests

  • testFailureWhenLargeFileRedirectedConditional

  • testLargeFileRedirectedConditional

  • testLargeFileRedirectedConditionalPreconditionFailed

  • testLargeFileRedirectedConditionalConflict

  • testWriteBlobByStreamsConditionalFailure


Related Issue

Concerned RFC : RFC #17763 Parent Meta Issue : Meta #17859

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Flow Diagram for executeMultipartUploadConditionally:

Mermaid Chart - Create complex, visual diagrams with text  A smarter way of creating diagrams -2025-05-29-085837

x-INFiN1TY-x avatar Apr 28 '25 00:04 x-INFiN1TY-x

:x: Gradle check result for 2c31c83de99b630c9954a837fef6aa9da97cbd7a: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Apr 28 '25 01:04 github-actions[bot]

Flaky Test :: https://github.com/opensearch-project/OpenSearch/issues/17540 | https://github.com/opensearch-project/OpenSearch/issues/17551

  • classMethod – org.opensearch.repositories.s3.S3BlobContainerRetriesTests

x-INFiN1TY-x avatar Apr 28 '25 04:04 x-INFiN1TY-x

:x: Gradle check result for eb3778c24ef94aecb7606d64d6d47e5d88d51fc6: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Apr 28 '25 22:04 github-actions[bot]

We also support async multipart upload(AsyncTransferManager has the logic). Do you want to extend that capability as well? cc: @ashking94

Check https://github.com/opensearch-project/OpenSearch/pull/18064#pullrequestreview-2814267845. I had the same question. Will let this be addressed at both places.

ashking94 avatar May 12 '25 06:05 ashking94

:x: Gradle check result for 9fdefaa757dce30c41d2f252a6345597b3ff95b4: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar May 28 '25 13:05 github-actions[bot]

:x: Gradle check result for d8d8f9aa93d903ba103ac55c95599c768c36943c: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar May 28 '25 14:05 github-actions[bot]

Flaky Test :: https://github.com/opensearch-project/OpenSearch/issues/17551

classMethod – org.opensearch.repositories.s3.S3BlobContainerRetriesTests

x-INFiN1TY-x avatar May 29 '25 05:05 x-INFiN1TY-x

:x: Gradle check result for 7a7e240f6f0f47f78bb20079489ca46e03e4381c: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar May 29 '25 07:05 github-actions[bot]

:x: Gradle check result for 5cfeb6e2fbf26e6f69873a2f10943899622b6227: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar May 29 '25 09:05 github-actions[bot]

:x: Gradle check result for cbc018847651c6154509515421e06294426be931: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar May 29 '25 10:05 github-actions[bot]

:x: Gradle check result for 0eedd2cdc061237b1bac19b089ac8ec605834ec1: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar May 29 '25 12:05 github-actions[bot]

Flaky Test :: https://github.com/opensearch-project/OpenSearch/issues/14293

ResourceAwareTasksTests #14293

x-INFiN1TY-x avatar May 30 '25 06:05 x-INFiN1TY-x

We also support async multipart upload(AsyncTransferManager has the logic). Do you want to extend that capability as well? cc: @ashking94

Check #18064 (review). I had the same question. Will let this be addressed at both places.

Hi @Bukhtawar (and thanks again @ashking94 for the nudge)—I’ve now implemented the full async conditional-upload support alongside the sync flow:

  • S3BlobContainer: introduced asyncBlobUploadConditionally(...) as the async entry point.
  • AsyncTransferManager: implemented uploadObjectConditionally(...), dynamically selecting single-chunk or multipart flows.
  • Helper Methods in AsyncTransferManager: added uploadInOneChunkConditionally, uploadInPartsConditionally, doUploadInPartsConditionally, and handleConditionalException to encapsulate the core logic and error translation.

Comprehensive tests have been added to validate every path:

Test suite

  • testConditionalOneChunkUpload
  • testConditionalOneChunkUploadPreconditionFailed
  • testConditionalOneChunkUploadCorruption
  • testConditionalMultipartUploadPreconditionFailed
  • testConditionalMultipartUploadCorruption
  • testFailureWhenLargeFileRedirectedConditional
  • testLargeFileRedirectedConditional
  • testLargeFileRedirectedConditionalPreconditionFailed
  • testLargeFileRedirectedConditionalConflict
  • testWriteBlobByStreamsConditionalFailure

CC : @ashking94 @Bukhtawar @astute-decipher

Mermaid Chart - Create complex, visual diagrams with text  A smarter way of creating diagrams -2025-05-28-172238

x-INFiN1TY-x avatar May 30 '25 06:05 x-INFiN1TY-x

:x: Gradle check result for 656a3a1c522d023b60deeedd5d7400c0c9e6ed2b: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Jun 11 '25 11:06 github-actions[bot]

:white_check_mark: Gradle check result for 11dc26466ec5c5185d5bb9fca4107f93456a6cc6: SUCCESS

github-actions[bot] avatar Jun 12 '25 11:06 github-actions[bot]

Codecov Report

Attention: Patch coverage is 69.04762% with 104 lines in your changes missing coverage. Please review.

Project coverage is 72.75%. Comparing base (fe4a98d) to head (11dc264). Report is 49 commits behind head on main.

Files with missing lines Patch % Lines
...ch/repositories/s3/async/AsyncTransferManager.java 58.33% 38 Missing and 17 partials :warning:
...rg/opensearch/repositories/s3/S3BlobContainer.java 77.35% 28 Missing and 8 partials :warning:
.../opensearch/common/blobstore/ConditionalWrite.java 76.92% 9 Missing :warning:
...bstore/AsyncMultiStreamEncryptedBlobContainer.java 0.00% 3 Missing :warning:
...ommon/blobstore/AsyncMultiStreamBlobContainer.java 0.00% 1 Missing :warning:
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #18093      +/-   ##
============================================
+ Coverage     72.60%   72.75%   +0.14%     
- Complexity    67682    67813     +131     
============================================
  Files          5497     5499       +2     
  Lines        311819   312089     +270     
  Branches      45265    45300      +35     
============================================
+ Hits         226409   227047     +638     
+ Misses        66941    66528     -413     
- Partials      18469    18514      +45     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov[bot] avatar Jun 12 '25 11:06 codecov[bot]

This PR is stalled because it has been open for 30 days with no activity.