
HDDS-10527. KeyOverwrite with optimistic locking

Open sodonnel opened this issue 11 months ago • 34 comments

What changes were proposed in this pull request?

This change introduces the ability to re-create / overwrite a key in Ozone using an optimistic locking technique.

Suppose you want to replace a key with a version that has additional data inserted somewhere in it, or change its replication type from Ratis to EC. To do this, you can read the current key data and write a new key with the same name; on commitKey, the new key version becomes visible.

However, another client may delete or rewrite the original key at the same time, which can result in lost updates.

To replace a key in this way, the proposal is to use the existing objectID and updateID on the key to ensure the key has not changed since it was read. The flow would be:

  1. Get the keyInfo for the current key.
  2. Call the new bucket.overWriteKey() method, passing the details of the existing key.
  3. This call adjusts the keyArgs to pass two new fields, overwriteObjectID and updateObjectID, which are taken from the objectID and updateID of the existing key.
  4. When OM receives the open key request, it checks that an existing key is present with the passed key name, objectID and updateID. If not, an error is returned. Otherwise the key is added to the openKeyTable, storing the overwrite IDs.
  5. The data is written to the key as usual.
  6. On key commit, the values stored in the openKey table for the overwrite IDs are checked against the current key. If the current key is absent, or its IDs have changed, the commit will fail and an error is returned. Otherwise the key is committed as usual.
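The flow above can be sketched with a small in-memory model. This is only an illustration under stated assumptions, not the OM implementation: the class `OptimisticOverwriteSketch`, its methods, and the way IDs are allocated are all hypothetical, and the two maps merely stand in for the keyTable and openKeyTable. The `OpenKey` fields are named after the two new fields in step 3.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model of the flow; none of these are Ozone classes. */
class OptimisticOverwriteSketch {

  /** A committed key. objectID plus updateID identify one exact version. */
  static final class KeyInfo {
    final long objectID;
    final long updateID;
    final String data;

    KeyInfo(long objectID, long updateID, String data) {
      this.objectID = objectID;
      this.updateID = updateID;
      this.data = data;
    }
  }

  /** An open key, remembering the IDs of the version it intends to replace. */
  static final class OpenKey {
    final long overwriteObjectID;
    final long updateObjectID;

    OpenKey(long overwriteObjectID, long updateObjectID) {
      this.overwriteObjectID = overwriteObjectID;
      this.updateObjectID = updateObjectID;
    }
  }

  private final Map<String, KeyInfo> keyTable = new HashMap<>();
  private final Map<Long, OpenKey> openKeyTable = new HashMap<>();
  private long nextID = 1; // assumption: a simple counter stands in for OM's ID source

  /** Step 1: read the current key (null if absent). */
  KeyInfo get(String name) {
    return keyTable.get(name);
  }

  /** An ordinary, unconditional write by any client; it gets fresh IDs. */
  void put(String name, String data) {
    long id = nextID++;
    keyTable.put(name, new KeyInfo(id, id, data));
  }

  void delete(String name) {
    keyTable.remove(name);
  }

  private boolean matches(String name, long objectID, long updateID) {
    KeyInfo cur = keyTable.get(name);
    return cur != null && cur.objectID == objectID && cur.updateID == updateID;
  }

  /** Steps 2-4: open the key only if the version the client read still exists. */
  long openForOverwrite(String name, KeyInfo expected) {
    if (!matches(name, expected.objectID, expected.updateID)) {
      throw new IllegalStateException("key changed before open: " + name);
    }
    long clientID = nextID++;
    openKeyTable.put(clientID, new OpenKey(expected.objectID, expected.updateID));
    return clientID;
  }

  /** Step 6: re-check the stored IDs against the current key, then commit. */
  void commit(String name, long clientID, String data) {
    OpenKey open = openKeyTable.remove(clientID);
    if (open == null) {
      throw new IllegalStateException("no open key for client " + clientID);
    }
    if (!matches(name, open.overwriteObjectID, open.updateObjectID)) {
      throw new IllegalStateException("key changed before commit: " + name);
    }
    // Assumption: the committed key keeps its objectID and gets a new updateID.
    keyTable.put(name, new KeyInfo(open.overwriteObjectID, nextID++, data));
  }
}
```

In this model, a client that calls `get`, then `openForOverwrite`, then `commit` succeeds if nobody touched the key in between. If another client calls `put` or `delete` on the same name after the open, the later `commit` throws and the concurrent change is preserved. No lock is held while the data is being written.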

This technique is similar to optimistic locking used in relational databases, to avoid holding a lock on an object for a long period of time.

Notably, no additional locks are needed on OM, and no additional calls or RocksDB reads are required to implement this: passing and storing the IDs in the openKey table is all that is required. The overwrite IDs don't need to be stored in the keyTable.

For now, this change adds the feature only for Object Store buckets.

Additionally, there is an open question about what to do with metadata and ACLs: should they be copied from the existing key, or passed from the client?

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-10527

How was this patch tested?

New integration and unit tests added.

sodonnel, Mar 15 '24 11:03