smart_open
Allow passing additional low-level API keyword arguments to AWS S3 boto3 put_object
Problem description
Be sure your description clearly answers the following questions:
- What are you trying to achieve?
Currently, the open function calls the low-level boto3.s3_client.put_object API, and only the Bucket, Key, and Body parameters are used. I am trying to pass additional keyword arguments to the put_object method.
- What is the expected result?
I expect to be able to pass the arguments like this:
with smart_open.open("s3://bucket/file.txt", "w", low_level_kwargs=dict(Metadata=dict(owner="[email protected]"))) as f:
    f.write("hello world")
You can find the additional arguments at https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.put_object
- What is your suggestion?
I think you could add an optional low_level_kwargs dict parameter to your high-level open API, for compatible backends such as S3 and the local file system.
Steps/code to reproduce the problem
See the "What is the expected result?" section above.
Versions
I don't think it matters in this use case, but here it is:
Please provide the output of:
import platform, sys, smart_open
print(platform.platform())  # MacOS
print("Python", sys.version)  # 3.8.11
print("smart_open", smart_open.__version__)  # 5.2.4
A top-level keyword argument is overkill (we want to keep the signature of the open function simple).
Instead, we could pass them in the client_kwargs dict, as is done here: https://github.com/RaRe-Technologies/smart_open/blob/develop/howto.md#how-to-specify-the-request-payer-s3-only
Let me know if you're interested in making a PR.
I believe the functionality requested already exists, as you could specify the low level arguments using the following:
params = {'client_kwargs': {'S3.Client.put_object': {'Metadata': {'owner': "[email protected]"}}}}
The get_attr function within s3.py will pass in these low-level arguments when calling specific S3 functions like put_object.
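To make the mechanism described in this comment concrete, here is a minimal sketch of a wrapper that looks up extra keyword arguments by method name and binds them before the call. It is not smart_open's actual code: KwargsInjectingClient and FakeS3Client are hypothetical names invented for illustration, the fake client only records what it receives, and the owner value is a placeholder.

```python
import functools


class FakeS3Client:
    """Stand-in for a boto3 S3 client (hypothetical); records the kwargs it receives."""

    def __init__(self):
        self.calls = []

    def put_object(self, **kwargs):
        self.calls.append(kwargs)
        return {"ResponseMetadata": {"HTTPStatusCode": 200}}


class KwargsInjectingClient:
    """Hypothetical wrapper that binds extra kwargs to client methods by name."""

    def __init__(self, client, client_kwargs):
        self._client = client
        self._client_kwargs = client_kwargs

    def __getattr__(self, method_name):
        # Only reached for attributes not set in __init__, i.e. client methods.
        method = getattr(self._client, method_name)
        extra = self._client_kwargs.get("S3.Client.%s" % method_name, {})
        return functools.partial(method, **extra)


fake = FakeS3Client()
wrapped = KwargsInjectingClient(
    fake,
    # Placeholder metadata value, keyed by the method it should apply to.
    {"S3.Client.put_object": {"Metadata": {"owner": "someone@example.com"}}},
)
wrapped.put_object(Bucket="bucket", Key="file.txt", Body=b"hello world")
```

With a wrapper of this shape, the caller's Bucket, Key, and Body are merged with the injected Metadata, so the call site does not need to know about the extra arguments.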
I was wondering if this issue should be closed?
Yes, I think so.
@RachitSharma2001 @mpenkov actually it is not implemented yet.
Please take a look at the source code: https://github.com/RaRe-Technologies/smart_open/blob/develop/smart_open/s3.py#L1003
def close(self):
    if self._buf is None:
        return

    self._buf.seek(0)

    try:
        self._client.put_object(
            Bucket=self._bucket,
            Key=self._key,
            Body=self._buf,
        )
    except botocore.client.ClientError as e:
        raise ValueError(
            'the bucket %r does not exist, or is forbidden for access' % self._bucket) from e

    logger.debug("%s: direct upload finished", self)
    self._buf = None
params = {'client_kwargs': {'S3.Client.put_object': {'Metadata': {'owner': "[email protected]"}}}}
The client_kwargs dict is not used at all.
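For comparison, here is a rough sketch of what forwarding extra arguments directly from a close() method of this shape could look like. It assumes a plain client that does no argument injection of its own. SinglepartWriterSketch, RecordingClient, and the put_object_kwargs parameter are hypothetical names, not part of smart_open.

```python
import io


class RecordingClient:
    """Stand-in for a plain S3 client (hypothetical); records put_object calls."""

    def __init__(self):
        self.calls = []

    def put_object(self, **kwargs):
        self.calls.append(kwargs)


class SinglepartWriterSketch:
    """Hypothetical buffer-then-upload writer, loosely modelled on the close() quoted above."""

    def __init__(self, client, bucket, key, put_object_kwargs=None):
        self._client = client
        self._bucket = bucket
        self._key = key
        # Copy so later changes to the caller's dict do not affect the upload.
        self._put_object_kwargs = dict(put_object_kwargs or {})
        self._buf = io.BytesIO()

    def write(self, data):
        return self._buf.write(data)

    def close(self):
        if self._buf is None:
            return  # already closed
        self._buf.seek(0)
        # Extra kwargs are merged in here; a duplicate of Bucket, Key, or Body
        # would raise a TypeError.
        self._client.put_object(
            Bucket=self._bucket,
            Key=self._key,
            Body=self._buf,
            **self._put_object_kwargs,
        )
        self._buf = None


client = RecordingClient()
writer = SinglepartWriterSketch(
    client, "bucket", "file.txt",
    put_object_kwargs={"Metadata": {"owner": "someone@example.com"}},  # placeholder value
)
writer.write(b"hello world")
writer.close()
writer.close()  # second close is a no-op
```

The trade-off against a wrapper-based approach is that every call site that talks to the client has to remember to merge the extra arguments itself.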