cloudpathlib
cloudpathlib copied to clipboard
Overly specific asserts on S3Client._remove
Thanks a bunch for this library, I've been trying to make use of it with an s3-emulating service that is not 100% bit-for-bit compliant and I've run into issues in a couple places. Specifically I've noticed that in S3Client._remove, you assert for response==204, the clone-service I'm using responds with a 200 when you successfully delete instead. I'd love to submit a PR to fix this but I need some input. I think it should be acceptable to check if the response is a 2xx code and by doing so you're more flexible. In addition, asserts outside of unit testing are considered hazardous, instead this should raise an exception of some kind. Do you have a preferred exception to be raised here instead? or should it be left to boto to raise the client error?
@mcclurem Out of curiosity, which S3-emulating service are you using?
Happy to take a PR on this. Answers to specific questions below.
I think it should be acceptable to check if the response is a 2xx code and by doing so you're more flexible.
Yep, seems reasonable.
In addition, asserts outside of unit testing are considered hazardous, instead this should raise an exception of some kind.
Agreed that those should raise exceptions, not assert
Do you have a preferred exception to be raised here instead?
You can add a custom exception if there is not already one that makes sense here: https://github.com/drivendataorg/cloudpathlib/blob/master/cloudpathlib/exceptions.py
Hi, seems like checking for a 200 or 204 status code (instead of 204 alone) should be enough, since the action performed is a DELETE. About what to do with the exception, seems like any possible error should raise a ClientError
(which is already handled) except if the file does not exist, in which case it won't raise an exception (boto3 docs).
https://github.com/drivendataorg/cloudpathlib/blob/4cc0d33b8e80a7f7eef0fae0f9acb38b8b1f62bf/cloudpathlib/s3/s3client.py#L229-L237
But seems like that case was already covered as mentioned in this comment a couples line further:
https://github.com/drivendataorg/cloudpathlib/blob/4cc0d33b8e80a7f7eef0fae0f9acb38b8b1f62bf/cloudpathlib/s3/s3client.py#L249-L255
A possibility would be to remove both assertions and raise FileNotFoundError
when necessary (if missing_ok
is False
) in both sections. Would that be OK?
def _remove(self, cloud_path: S3Path, missing_ok: bool = True) -> None:
try:
obj = self.s3.Object(cloud_path.bucket, cloud_path.key)
# will throw if not a file
obj.load()
resp = obj.delete()
- assert resp.get("ResponseMetadata").get("HTTPStatusCode") == 204
+ # Assumes that a non-successful code means that the file does not exist
+ if resp.get("ResponseMetadata").get("HTTPStatusCode") not in (200, 204) and not missing_ok:
+ raise FileNotFoundError(f"File does not exist: {cloud_path}")
except ClientError:
# try to delete as a direcotry instead
bucket = self.s3.Bucket(cloud_path.bucket)
prefix = cloud_path.key
if prefix and not prefix.endswith("/"):
prefix += "/"
resp = bucket.objects.filter(Prefix=prefix).delete()
- # ensure directory deleted; if cloud_path did not exist at all
- # resp will be [], so no need to check success
- if resp:
- assert resp[0].get("ResponseMetadata").get("HTTPStatusCode") == 200
- else:
- if not missing_ok:
- raise FileNotFoundError(f"File does not exist: {cloud_path}")
+ # Assumes that a non-successful code means that the file does not exist
+ if not resp and not missing_ok:
+ raise FileNotFoundError(f"File does not exist: {cloud_path}")
Of course, this is just speculation, I would need to test the suggested implementation and confirm which statuses does the response get. This is just a draft ⚠️
By the way, when comparing the _remove
method for the S3, Azure and GS clients seems like the S3 version is quite different, so the exception has no equivalent in the other clients.
@aaossa Yep, that implementation looks good to me initially, thanks. Like you said, it will need some testing. Maybe update the message when you raise FileNotFoundError
to indicate we were trying to delete that file.