"ExtraArgs" option is not available while uploading a file/folder to S3 via CloudPath
Users who want to add user-defined metadata tags while uploading to S3 should be able to do so, but I currently can't find a way to do this with the CloudPath function upload_from(). It would be great to have this implemented, and I can contribute to enabling this feature if needed.
Thanks for filing @gaisensei! @benbemoh has a sample implementation here which is relevant for the discussion.
I think we'd like to support passing the ExtraArgs, but I want to think a bit about the API design we want. Let's use this issue to explore our options.
A few open questions/thoughts on this topic:
- upload_from and download_to are implemented generically (not per service), so we don't want to change the kwargs to be service specific
- ExtraArgs can be specified per-operation, whereas implementations like #253 set it at the S3Client level. It may make sense to set defaults there, but we may also want to support passing these args through to functions explicitly.
- If we support setting the default ExtraArgs in Client instantiation, is there a way to avoid having a set of long named kwargs like boto3_upload_extra_args for every potential boto3 function? Seems potentially painful to maintain and support if the internals of the clients get refactored
- Do we want to support just ExtraArgs or other potential kwargs as well? I could see just generically collecting sdk_kwargs in upload_from and download_to
- As pointed out in #253, not all the boto3 functions we use accept ExtraArgs in the same way.
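To make the trade-offs in the list above concrete, here is a minimal sketch of one possible shape: defaults set once on the client, optional per-call overrides, and a filter so each operation only receives keys it accepts. The names `SketchClient`, `resolve`, and `ALLOWED_KEYS` are hypothetical and are not part of cloudpathlib or boto3; the allow-lists are a small illustrative subset, not the SDK's full lists.

```python
from typing import Any, Dict, Optional

# Hypothetical, abbreviated allow-lists used only for illustration.
ALLOWED_KEYS = {
    "upload": {"Metadata", "ContentType", "ServerSideEncryption", "RequestPayer"},
    "download": {"VersionId", "RequestPayer"},
}


class SketchClient:
    """Hypothetical client holding default extra args shared by all operations."""

    def __init__(self, extra_args: Optional[Dict[str, Any]] = None) -> None:
        self.extra_args = dict(extra_args or {})

    def resolve(self, operation: str, **sdk_kwargs: Any) -> Dict[str, Any]:
        """Merge client defaults with per-call kwargs for one operation."""
        merged = {**self.extra_args, **sdk_kwargs}  # per-call values win
        allowed = ALLOWED_KEYS[operation]
        # Drop keys the target operation does not accept.
        return {key: value for key, value in merged.items() if key in allowed}
```

With this shape, a single `extra_args` dict on the client avoids one long-named kwarg per SDK function, while `**sdk_kwargs` leaves room for per-operation values without making the generic method signatures service specific.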
IMHO, the issue with ExtraArgs in the upload_from and download_to methods is that the extra args depend on the implementation: for instance, there are many AWS boto3 methods for uploading a file to S3 (e.g. boto3.client('s3').upload_file(), boto3.client('s3').upload_fileobj(), etc.). So if the implementation of upload_from and download_to changes in the future, these extra_args might need to change accordingly.
Besides, a similar case has already been addressed through the client instantiation: see boto3_transfer_config and content_type_method. I guess there should be a homogeneous way to deal with ExtraArgs.
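As a rough illustration of that implementation dependence: boto3's `upload_file` takes the options as a single `ExtraArgs` dict, whereas `put_object` takes them as top-level keyword arguments. The two helper functions below are hypothetical and only build the keyword arguments each call style would need; they do not call boto3.

```python
from typing import Any, Dict


def kwargs_for_upload_file(
    filename: str, bucket: str, key: str, extra_args: Dict[str, Any]
) -> Dict[str, Any]:
    """Keyword arguments in the style of upload_file: options nested in ExtraArgs."""
    return {
        "Filename": filename,
        "Bucket": bucket,
        "Key": key,
        "ExtraArgs": dict(extra_args),
    }


def kwargs_for_put_object(
    body: bytes, bucket: str, key: str, extra_args: Dict[str, Any]
) -> Dict[str, Any]:
    """Keyword arguments in the style of put_object: options flattened to top level."""
    return {"Bucket": bucket, "Key": key, "Body": body, **extra_args}
```

The same user-supplied dict ends up in a different place depending on which SDK call sits behind upload_from, which is why swapping the internal call could change how the args have to be passed.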
@benbemoh @gaisensei Thanks for the helpful discussion here. We've added the ability to pass extra args to the client in #307.
See the updates to the documentation for how to pass the args to the client: https://cloudpathlib.drivendata.org/stable/authentication/#other-s3-extraargs-in-boto3
You can test it in version 0.12.0, which is on PyPI now.
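For the original use case (user-defined metadata on upload), a minimal sketch could look like the following. It assumes the `extra_args` keyword on `S3Client` described in the linked documentation; the helper names `build_extra_args` and `upload_with_metadata` are hypothetical, and the upload itself needs valid AWS credentials and an existing bucket.

```python
from typing import Any, Dict, Mapping


def build_extra_args(metadata: Mapping[str, Any]) -> Dict[str, Any]:
    """Build a boto3-style ExtraArgs dict carrying user-defined metadata.

    S3 metadata keys and values are strings, so everything is converted.
    """
    return {"Metadata": {str(k): str(v) for k, v in metadata.items()}}


def upload_with_metadata(local_path: str, s3_uri: str, metadata: Mapping[str, Any]):
    """Upload local_path to s3_uri with metadata attached (hypothetical helper)."""
    # Imported here so build_extra_args can be used without cloudpathlib installed.
    from cloudpathlib import S3Client

    # Assumption: extra_args is accepted by S3Client as of version 0.12.0.
    client = S3Client(extra_args=build_extra_args(metadata))
    return client.CloudPath(s3_uri).upload_from(local_path)
```

A call such as `upload_with_metadata("report.csv", "s3://my-bucket/report.csv", {"owner": "data-team"})` would then attach the metadata to the uploaded object; the bucket and file names here are placeholders.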