Add support to change storage class (e.g. to warm-up data)

Open aawsome opened this issue 1 year ago • 1 comments

OpenDAL already supports specifying a storage class, e.g. for AWS S3:

default_storage_class = "DEEP_ARCHIVE"

Sometimes you want or need to change the storage class. In this example, before you can access data in AWS Glacier Deep Archive, you need to warm up the data by requesting a change of the storage class to "STANDARD".

It would be very nice if the OpenDAL service directly supported this kind of storage class change, e.g. something like:

let mut builder = S3::default();
builder.bucket("test");
builder.default_storage_class("DEEP_ARCHIVE");
let op: Operator = Operator::new(builder)?.finish();

op.change_state("hello.txt", "STANDARD").await?;

There are some cases to consider:

  • The change might need or support extra options. E.g. it could be a temporary storage class transition that lasts only for a given time period.
  • The actual transition might take some time (especially when warming up data), so the command might only return the service's response to the request rather than wait for the transition to finish.
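
To make the two cases above concrete, here is a minimal sketch of what the inputs and outputs of such a call might look like. Everything in it is hypothetical: `TransitionOptions`, `TransitionStatus`, and `request_transition` are illustrative names and not part of OpenDAL, and the rule that leaving an archive class is slow is a simplifying assumption rather than the behavior of any real service.

```rust
use std::time::Duration;

/// Hypothetical options for a storage class change (not an OpenDAL type).
#[derive(Debug, Clone, PartialEq)]
pub struct TransitionOptions {
    /// Target storage class, e.g. "STANDARD".
    pub target_class: String,
    /// If set, the transition is temporary and lasts only this long.
    pub keep_for: Option<Duration>,
}

/// Hypothetical status reported back by the service.
#[derive(Debug, Clone, PartialEq)]
pub enum TransitionStatus {
    /// The object is already available in the target class.
    Completed,
    /// The request was accepted, but the object is not readable yet.
    InProgress,
}

/// Toy stand-in for a service call. Assumption: moving an object out of
/// an archive class takes time, every other change completes immediately.
pub fn request_transition(current_class: &str, opts: &TransitionOptions) -> TransitionStatus {
    let archived = matches!(current_class, "GLACIER" | "DEEP_ARCHIVE");
    if archived && current_class != opts.target_class {
        TransitionStatus::InProgress
    } else {
        TransitionStatus::Completed
    }
}
```

With this shape, a caller that warms up data would get `InProgress` back and poll or retry the read later, instead of blocking inside the call.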

As a reference, where such things are useful, see https://github.com/rustic-rs/rustic/discussions/692.

aawsome, Mar 03 '24 06:03

Hi, @aawsome, thank you for bringing up the issue. I've been considering this for some time but haven't found a way to integrate this feature into OpenDAL.

Here are my concerns:

  • Modifying object states seems more like a control plane operation than a data plane one. Perhaps users have better ways to manage those objects, such as jobs and lifecycles?

  • This feature is too specific to S3, making it challenging to design consistently across different services.

  • The change_state API overlaps with COPY. Using op.change_state("hello.txt", "STANDARD").await achieves the same result as op.copy("hello.txt") with the default storage class set to STANDARD.

  • Introducing this API would expose the storage class in our public API, which prevents users from writing uniform code across services.


Could we consider a restore API?

let res = op.restore("archived.tar.zst").await?;

We need further research: explore s3, gcs, and azblob, and compare the similarities and differences between these services.
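
As a rough illustration of why a restore-style call could stay uniform across services, the sketch below models it as a trait whose callers never name a storage class. It is synchronous and uses only the standard library to stay self-contained; the `Restore` trait, `RestoreStatus`, `RestoreError`, and both toy backends are hypothetical and do not exist in OpenDAL.

```rust
use std::collections::HashMap;

/// Hypothetical outcome of a restore request (not an OpenDAL type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum RestoreStatus {
    /// The object was readable already; nothing had to be done.
    AlreadyAvailable,
    /// The service accepted the request; the object becomes readable later.
    Requested,
}

/// Hypothetical error cases for the sketch.
#[derive(Debug, Clone, PartialEq)]
pub enum RestoreError {
    NotFound,
    /// The backend has no notion of archived objects.
    Unsupported,
}

/// Hypothetical service-agnostic interface: no storage class in the signature.
pub trait Restore {
    fn restore(&self, path: &str) -> Result<RestoreStatus, RestoreError>;
}

/// Toy backend with an archive tier: maps each path to "is archived".
pub struct ArchivingBackend {
    pub objects: HashMap<String, bool>,
}

impl Restore for ArchivingBackend {
    fn restore(&self, path: &str) -> Result<RestoreStatus, RestoreError> {
        match self.objects.get(path) {
            None => Err(RestoreError::NotFound),
            Some(true) => Ok(RestoreStatus::Requested),
            Some(false) => Ok(RestoreStatus::AlreadyAvailable),
        }
    }
}

/// Toy backend without any archive tier.
pub struct PlainBackend;

impl Restore for PlainBackend {
    fn restore(&self, _path: &str) -> Result<RestoreStatus, RestoreError> {
        Err(RestoreError::Unsupported)
    }
}
```

Each backend decides internally which storage class or rehydration mechanism applies, so caller code stays the same across services; whether s3, gcs, and azblob can actually be mapped onto one such interface is exactly the open research question.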

Xuanwo, Mar 04 '24 05:03