s5cmd
feature request: stop using AWS_REGION and "region section of aws profile" for source bucket region
Currently, s5cmd uses the following order for cross region detection:
https://github.com/peak/s5cmd/blob/83ce8bc6a1016bcea46da48e9090f8e761478149/README.md?plain=1#L425-L433
While using AWS_REGION or the "Region section of AWS profile" makes sense for the destination bucket, it makes less sense for the source bucket.
I frequently run into this when copying data across regions as part of an AWS Batch job, where AWS_REGION is automatically set to the region the job runs in, which is also the region the job copies data to. The job always fails with:
ERROR "cp s3://mybucket/myprefix/mypath.ext output/file.ext": BucketRegionError: incorrect region, the bucket is not in 'us-west-2' region at endpoint '', bucket is in 'us-east-1' region status code: 301, request id: AAAAAAAAAAAAAAAA, host id: abCdEFghIjkL/1234567=
This is because, even though s5cmd can figure out the source bucket region, it assumes that AWS_REGION is set to the source region, which is not the case here. As a workaround, I am adding unset AWS_REGION at the top of each of my scripts.
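For anyone hitting the same error, here is a minimal sketch of a narrower version of that workaround: instead of unsetting AWS_REGION for the whole script, it removes the variable only for a single command. The helper name without_aws_region is hypothetical, and the sketch assumes no region is configured in the AWS profile, since s5cmd would otherwise still pick that one up.

```shell
# Hypothetical helper (not part of s5cmd): run one command with AWS_REGION
# removed from its environment, leaving the calling script's value untouched.
without_aws_region() {
  # The parentheses start a subshell, so the unset does not leak out.
  ( unset AWS_REGION; "$@" )
}

# Intended usage, with the paths from the error message above:
#   without_aws_region s5cmd cp s3://mybucket/myprefix/mypath.ext output/file.ext
```

The rest of the job still sees AWS_REGION, which matters if other tools in the same script rely on it. Passing the source region explicitly via the cp command's region flag should also work, if the region is known up front.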
- In my opinion, we should not use AWS_REGION or the "Region section of AWS profile" for the source bucket region, and should only use these for the destination bucket region.
- Going a step further, perhaps s5cmd should just always detect the region unless the user provides it via CLI flags? Relying on AWS_REGION and the "Region section of AWS profile" is ambiguous. I think it more conventionally applies to the destination, but if there's an easy way to always detect it, maybe that makes more sense?
- the title should probably also reflect "region section of aws profile" to make it clear it's not just AWS_REGION
- see also https://github.com/peak/s5cmd/issues/520 which would also be fixed by not considering AWS_REGION or "Region section of AWS profile" for source bucket region