[Feature Request]: list-objects-v2 allow unordered support.

Open snosratiershad opened this issue 3 years ago • 6 comments

In Ceph storage's S3 interface, there is a flag named "allow-unordered" that lets RGW (the gateway) avoid forcing the cluster to sort all objects for list-objects-v2 requests. There is also a command option in radosgw-admin (the gateway command-line utility) named "--allow-unordered", which provides unordered listing too. It's not officially supported by AWS, but it is included in S3 services that are based on Ceph storage (for example, DigitalOcean Spaces, Red Hat Ceph Storage, Alibaba Cloud OSS, Sotoon S3, ...).

Is there any plan to support unordered list in s3cmd?

Update: I found a way to extend aws-cli by adding this file to ~/.aws/models/s3/2006-03-01/, but nothing similar is available in s3cmd.

snosratiershad avatar Jul 23 '22 12:07 snosratiershad

@snosratiershad Are you sure that it is also supported for list-objects?

In the Ceph documentation, I only see this option on the BucketOps doc page and not on the ObjectOps page.

Were you able to test this parameter with DigitalOcean?

fviard avatar Aug 07 '22 18:08 fviard

@fviard, yes! It works on list-objects and list-objects-v2 (and I recently added list-objects-v2 support to the boto3 (and aws) Ceph extension PR). I will search the Ceph documentation, but I'm pretty sure about that, and I use it every day, maintaining 3 Ceph clusters with PBs of objects. As for DigitalOcean, I haven't tested it myself, but as it's based on Ceph, I guess it should handle it too.

Update: I couldn't find more in the Ceph documentation; I only verified that it's included in the BucketOps documentation and in the RGW source code (src/rgw/rgw_rest_s3.cc:1592).

snosratiershad avatar Aug 07 '22 18:08 snosratiershad

Ok, good news.
I can look at adding that; it should not cost much, I think.

So, just to be sure I understand correctly: you only have to add "allow-unordered" to the query parameters of the list-objects* requests to get this behavior?

Btw, it has been a long time since I had a look at aws-cli, but I would not recommend using that option with their sync, as they rely on keys being ordered to decide which files to synchronize.

fviard avatar Aug 07 '22 20:08 fviard

@fviard

> So, just to be sure I understand correctly: you only have to add "allow-unordered" to the query parameters of the list-objects* requests to get this behavior?

yes!
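
As a minimal sketch of what that means on the wire, the snippet below builds an unsigned, path-style ListObjectsV2 URL with the extra query parameter. The helper name `build_list_url`, the endpoint, the bucket name, and the value `"true"` are assumptions for illustration; a real request also needs to be signed, and this is not s3cmd's actual code.

```python
from urllib.parse import urlencode


def build_list_url(endpoint, bucket, prefix="", allow_unordered=False):
    """Build an unsigned, path-style ListObjectsV2 URL (illustration only)."""
    # "list-type=2" selects the v2 listing API.
    params = {"list-type": "2"}
    if prefix:
        params["prefix"] = prefix
    if allow_unordered:
        # Ceph RGW extension, not part of the AWS S3 API.
        # The value "true" is an assumption in this sketch.
        params["allow-unordered"] = "true"
    return f"{endpoint.rstrip('/')}/{bucket}?{urlencode(params)}"


# Hypothetical endpoint and bucket names.
print(build_list_url("https://rgw.example.com", "my-bucket",
                     prefix="logs/", allow_unordered=True))
```

A service that does not know the parameter may ignore or reject it, so it seems safest to send it only when the user asks for it explicitly.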

> Btw, it has been a long time since I had a look at aws-cli, but I would not recommend using that option with their sync, as they rely on keys being ordered to decide which files to synchronize.

As aws-cli uses boto3, I think changing behavior via AWS config modules as an extension would be reliable, but I will double-check if you think it's not safe.

snosratiershad avatar Aug 07 '22 23:08 snosratiershad

> I can look at adding that; it should not cost much, I think.

So, I'm going to add the --allow-unordered flag to listing in s3cmd in a PR. I will notify you ASAP.

snosratiershad avatar Aug 07 '22 23:08 snosratiershad

Great if you submit a PR :-)

For aws-cli, my point is that with boto, if you use it as an API, that should be ok.
But if you do a "sync", it will compare the file lists expecting them to be in alphabetical order so that it uses less memory, which has drawbacks like this one.
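
To illustrate the drawback, here is a minimal sketch of a single-pass merge comparison over two key lists. It is not aws-cli's actual code, and `diff_sorted` is a hypothetical helper; it only shows the general technique of walking two sorted listings side by side, which keeps memory low but gives wrong answers when one side arrives unordered.

```python
def diff_sorted(local_keys, remote_keys):
    """Compare two key listings in one pass.

    Correct only if both iterables are sorted; returns
    (keys only in local, keys only in remote).
    """
    only_local, only_remote = [], []
    local_it, remote_it = iter(local_keys), iter(remote_keys)
    l = next(local_it, None)
    r = next(remote_it, None)
    while l is not None or r is not None:
        if r is None or (l is not None and l < r):
            # Local key sorts first, so it is assumed absent remotely.
            only_local.append(l)
            l = next(local_it, None)
        elif l is None or r < l:
            # Remote key sorts first, so it is assumed absent locally.
            only_remote.append(r)
            r = next(remote_it, None)
        else:
            # Same key on both sides: advance both.
            l = next(local_it, None)
            r = next(remote_it, None)
    return only_local, only_remote


local = ["a", "b", "c"]
print(diff_sorted(local, ["a", "b", "c"]))  # sorted remote listing
print(diff_sorted(local, ["c", "a", "b"]))  # unordered remote listing
```

With the sorted remote listing both result lists are empty; with the unordered one, "a" and "b" are reported as missing on both sides even though the two listings hold the same keys.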

fviard avatar Aug 07 '22 23:08 fviard

Merged!

fviard avatar Aug 28 '22 00:08 fviard