cloudpathlib icon indicating copy to clipboard operation
cloudpathlib copied to clipboard

Add boto3 extra_args field for upload operation

Open benbemoh opened this issue 3 years ago • 4 comments

There are many use cases where uploading unencrypted objects to an Amazon S3 bucket is denied. For example an S3 bucket policy which enforces the encryption requirement when users upload objects, unless they are using client-side or server-side encryption (see this aws article for details).

This PR addresses the case of using a server-side encryption when uploading objects to an Amazon S3 bucket.

Below is a python code example to illustrate how to use the new changes:

extra_args = {"ServerSideEncryption": "AES256"}
client = S3Client(boto3_upload_extra_args=extra_args)
client.set_as_default_client()

@pikulmar: Thank you for the suggestions and for reviewing the PR

benbemoh avatar Aug 17 '22 16:08 benbemoh

Thanks @benbemoh. Can you file a related issue where discussion of this change can happen? I see the value, but I want to make sure we cover all the potential ExtraArgs usages (not just upload) in one go, and we've got a clean API for passing them.

pjbull avatar Aug 18 '22 00:08 pjbull

@pjbull : Thanks for your reply. Two use-cases that comes to my mind which I know that they require also ExtraArgs encryption:

  1. When generating a presigned-s3-url for HTTP post method in order to upload file (generate_presigned_post). I see that there is a related issue opened here.
# Example of code to generate an s3 presigned url for HTTP POST to upload a file 
fields = {"x-amz-server-side-encryption": "AES256"}
conditions = {"x-amz-server-side-encryption": "AES256"}
 info = {
      "Bucket": bucket_name,
      "Key": object_name,
      "Fields": fields,
      "Conditions": conditions,
      "ExpiresIn": expiration,
  }
  try:
      response: Dict = s3_client.generate_presigned_post(**info)
  except ClientError as e:
      extra_info = f"Error in generate_presigned_post with: {info}."
      raise RuntimeError(e, extra_info)
  1. When copying S3 objects using these two methods S3.Object.copy_from() and S3.Object.copy(). I think the cloudpathlib code here can be updated like this:
def _move_file(self, src: S3Path, dst: S3Path, remove_src: bool = True) -> S3Path:
    # just a touch, so "REPLACE" metadata
    if src == dst:
        o = self.s3.Object(src.bucket, src.key)
        o.copy_from(
            CopySource={"Bucket": src.bucket, "Key": src.key},
            Metadata=self._get_metadata(src).get("extra", {}),
            MetadataDirective="REPLACE",
            ServerSideEncryption=self.boto3_upload_extra_args.get("ServerSideEncryption"),
        )

    else:
        target = self.s3.Object(dst.bucket, dst.key)
        target.copy(
            CopySource={"Bucket": src.bucket, "Key": src.key},
            ExtraArgs=self.boto3_upload_extra_args.copy(),
        )

        if remove_src:
            self._remove(src)
    return dst

benbemoh avatar Aug 18 '22 06:08 benbemoh

@pjbull : Could you please give us a feed-back regarding the two points mentioned above?

benbemoh avatar Aug 23 '22 12:08 benbemoh

Thanks for the patience @benbemoh. @gaisensei has filed a related issue, #254 where we can discuss this.

pjbull avatar Sep 02 '22 18:09 pjbull