HDDS-13963. Atomic Create-If-Not-Exists
What changes were proposed in this pull request?
Extends the expectedDataGeneration logic in Ozone Manager to support atomic "create-if-not-exists" semantics.
* Enables passing -1 as the expectedDataGeneration.
* When set to -1, the validateAtomicRewrite logic (in both Create and Commit phases) strictly enforces that the target key must not exist, throwing KEY_ALREADY_EXISTS otherwise.
* This establishes the core OM support required for conditional "If-None-Match" requests, allowing upper layers (like S3 Gateway) to implement these features with minimal changes to the underlying protocol.
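The validation rule described above could look roughly like the following. This is a minimal sketch, not the actual Ozone Manager code: the class name, the `Result` enum, and `GENERATION_MISMATCH` are hypothetical names chosen for illustration, and a `null` generation is assumed to mean "key does not exist" or "no precondition supplied".

```java
/** Illustrative sketch only; these are not the actual Ozone Manager classes. */
final class AtomicRewriteValidationSketch {
    /** Sentinel from the PR description: the target key must not exist. */
    static final long EXPECT_KEY_ABSENT = -1L;

    /** GENERATION_MISMATCH is a hypothetical name for the ordinary rewrite failure. */
    enum Result { OK, KEY_ALREADY_EXISTS, GENERATION_MISMATCH }

    /**
     * @param existingGeneration generation of the committed key, or null if the key is absent
     * @param expectedDataGeneration precondition from the request, or null if none was set
     */
    static Result validate(Long existingGeneration, Long expectedDataGeneration) {
        if (expectedDataGeneration == null) {
            return Result.OK; // unconditional write, nothing to check
        }
        if (expectedDataGeneration == EXPECT_KEY_ABSENT) {
            // create-if-not-exists: any committed key is a conflict
            return existingGeneration == null ? Result.OK : Result.KEY_ALREADY_EXISTS;
        }
        // ordinary atomic rewrite: the key must exist at exactly the expected generation
        return expectedDataGeneration.equals(existingGeneration)
            ? Result.OK
            : Result.GENERATION_MISMATCH;
    }
}
```

The same check is intended to run in both the Create and Commit phases, so one helper would serve both call sites.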
The following flow shows how an S3 PUT request with an If-None-Match header can leverage this:
- S3 Gateway Layer
  - Parse If-None-Match: *.
  - Set expectedDataGeneration = -1.
  - Pass to RpcClient.rewriteKey().
- OM Create Phase
  - Validate expectedDataGeneration == -1.
  - If key exists → throw KEY_ALREADY_EXISTS.
  - Store -1 in open key metadata.
- OM Commit Phase
  - Check expectedDataGeneration == -1 from open key.
  - If key now exists (race condition) → throw KEY_ALREADY_EXISTS.
  - Commit key.
Race Condition Handling: Using -1 ensures atomicity. If a concurrent write (Client B) commits between Client A's Create and Commit, Client A's commit fails the -1 validation check (key now exists), preserving strict create-if-not-exists semantics.
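To make the interleaving concrete, here is a small in-memory model of the Create and Commit phases. It is a sketch under stated assumptions, not Ozone code: `CreateIfNotExistsRaceSketch`, its maps, and its method names are all hypothetical, and the real OM keeps open keys and committed keys in its own tables.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the two-phase write; all names here are hypothetical. */
final class CreateIfNotExistsRaceSketch {
    static final long EXPECT_KEY_ABSENT = -1L;

    static final class KeyAlreadyExistsException extends RuntimeException {
        KeyAlreadyExistsException(String key) { super("KEY_ALREADY_EXISTS: " + key); }
    }

    private final Map<String, String> keyTable = new HashMap<>(); // committed keys
    private final Map<String, Long> openKeys = new HashMap<>();   // client/key -> stored precondition

    /** Create phase: validate, then remember the precondition with the open key. */
    synchronized void createKey(String client, String key, long expectedDataGeneration) {
        check(key, expectedDataGeneration);
        openKeys.put(client + "/" + key, expectedDataGeneration);
    }

    /** Commit phase: re-validate against the precondition stored at create time. */
    synchronized void commitKey(String client, String key, String data) {
        Long expected = openKeys.remove(client + "/" + key);
        if (expected == null) {
            throw new IllegalStateException("no open key for " + client + "/" + key);
        }
        check(key, expected);
        keyTable.put(key, data);
    }

    synchronized String get(String key) { return keyTable.get(key); }

    private void check(String key, long expectedDataGeneration) {
        if (expectedDataGeneration == EXPECT_KEY_ABSENT && keyTable.containsKey(key)) {
            throw new KeyAlreadyExistsException(key);
        }
    }
}
```

If clients A and B both pass `createKey` with -1 and B commits first, A's later `commitKey` throws `KeyAlreadyExistsException` and B's data stays in place.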
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-13963
How was this patch tested?
TODO
@ivandika3 Could you take a look at whether this API change makes sense? Since rewrite is intended for existing keys, allowing a semantic like “don’t create if existed” can be confusing. On the other hand, adding another flag to the create-key request in the proto feels redundant. Do you have a better suggestion?
Thanks @peterxcli for the patch, I'll take a look when I have time.
Involving @sodonnel since he's the original creator of Atomic rewrite.
> Could you take a look at whether this API change makes sense? Since rewrite is intended for existing keys, allowing a semantic like “don’t create if existed” can be confusing. On the other hand, adding another flag to the create-key request in the proto feels redundant. Do you have a better suggestion?
I think this depends on whether atomic rewrite can be cleanly reused for S3 conditional requests. However, after another look, I think adding new optional KeyArgs attributes (e.g. allowOverwrite) might be better, since the atomic rewrite use case depends only on the update ID, while S3 conditional requests will need to check the ETag (which might need another KeyArgs attribute).
FYI, the concept of "generation" was loosely taken from GCP (https://docs.cloud.google.com/storage/docs/request-preconditions), which supports request preconditions based on both generation (GCP specific) and ETag (S3 compatible).
> whether atomic rewrite can be cleanly reused for S3 conditional requests.
@ivandika3 I opened https://github.com/apache/ozone/pull/9334 and included a design doc.
Main idea:
- For the If-None-Match header: issue a rewrite-key request with expectedDataGeneration = -1 to provide “CREATE IF NOT EXISTS” semantics.
- For the If-Match header: fetch key info from OM, validate the ETag at S3G, then set expectedDataGeneration to the fetched generation. This lets S3G perform optimistic concurrency control by leveraging OM’s expectedDataGeneration support during the rewrite.
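The header-to-precondition mapping in the two bullets above could be sketched as follows. This is only an illustration of the idea, not the design in the linked PR: `ConditionalPutSketch`, `KeyInfo`, `PreconditionFailedException`, and `resolveExpectedGeneration` are hypothetical names, and the ETag comparison is a plain string equality that ignores any quoting rules.

```java
import java.util.OptionalLong;

/** Hypothetical S3G-side helper; not part of the actual S3 Gateway code. */
final class ConditionalPutSketch {
    static final long EXPECT_KEY_ABSENT = -1L;

    /** Minimal stand-in for key info fetched from OM. */
    static final class KeyInfo {
        final String eTag;
        final long generation;
        KeyInfo(String eTag, long generation) {
            this.eTag = eTag;
            this.generation = generation;
        }
    }

    static final class PreconditionFailedException extends RuntimeException { }

    /**
     * @param ifNoneMatch value of the If-None-Match header, or null if absent
     * @param ifMatch value of the If-Match header, or null if absent
     * @param current key info fetched from OM, or null if the key does not exist
     * @return the expectedDataGeneration to send, or empty for an unconditional PUT
     */
    static OptionalLong resolveExpectedGeneration(String ifNoneMatch, String ifMatch, KeyInfo current) {
        if ("*".equals(ifNoneMatch)) {
            // create-if-not-exists: OM enforces absence at create and commit time
            return OptionalLong.of(EXPECT_KEY_ABSENT);
        }
        if (ifMatch != null) {
            // validate the ETag at S3G, then pin the write to the fetched generation
            if (current == null || !ifMatch.equals(current.eTag)) {
                throw new PreconditionFailedException();
            }
            return OptionalLong.of(current.generation);
        }
        return OptionalLong.empty();
    }
}
```

With If-Match, the ETag check happens once at S3G, and the generation passed to OM guards against the key changing between that check and the commit.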
Thanks @peterxcli, left some comments in https://github.com/apache/ozone/pull/9334, let's discuss the design there.