minio-py
Clarification for docs on `remove_objects()` API
https://github.com/minio/minio-py/blob/master/docs/API.md?plain=1#L1473-L1475
As per internal discussion, it is not clear from the docs that this API is "lazy": it only fires if the user iterates the returned errors.
This also raises a few questions:
- If there are no errors, does the call still require iterating the empty return value to fire?
- If there are some errors, does the API partially execute up until it hits the first error?
- Is there a way to direct the API to continue on error?
Based on that, we should update the docs for this API and for any others that require iterating the response to fire (i.e. "lazy" APIs).
@balamurugana your insight here would be helpful.
The DeleteObjects S3 API lets us delete multiple objects in one request, up to a maximum of 1000 objects. In minio-py, `remove_objects()` is a higher-level method that supports removing more than 1000 objects. It is implemented by sending multiple DeleteObjects S3 API requests sequentially, each carrying up to 1000 objects. Any errors the S3 server returns while executing these requests are `yield`ed. A consumer of these errors who prefers to stop the removal of the next batch can simply stop iterating.
Because we `yield` the errors, execution is lazy: the returned iterator must be read for the internal DeleteObjects S3 API requests to keep firing. If you need more control than `remove_objects()` provides, you can use the low-level `_delete_objects()` method, either by inheriting from the `Minio` class or by bypassing the warning about using a private method.
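To make the lazy behaviour described above concrete, here is a minimal simulation that does not use minio-py at all. `fake_remove_objects` and `send_batch` are hypothetical stand-ins written for this sketch, not the library's implementation; they only mirror the description: one request per batch of up to 1000 names, with errors yielded to the caller.

```python
from itertools import islice

BATCH_SIZE = 1000  # DeleteObjects accepts at most 1000 objects per request


def fake_remove_objects(names, send_batch):
    """Hypothetical generator: one request per batch, yielding reported errors."""
    iterator = iter(names)
    while True:
        batch = list(islice(iterator, BATCH_SIZE))
        if not batch:
            return
        # The "request" is only sent when the consumer asks for more items.
        for error in send_batch(batch):
            yield error


sent = []  # size of each simulated request, in order


def send_batch(batch):
    """Hypothetical stand-in for one DeleteObjects call; 'bad-*' names fail."""
    sent.append(len(batch))
    return [name for name in batch if name.startswith("bad-")]


names = [f"obj-{i}" for i in range(2500)]
names[1500] = "bad-1500"  # one failing object, located in the second batch

# Calling the generator function sends nothing yet.
errors = fake_remove_objects(names, send_batch)
requests_before_iteration = list(sent)

# Draining the iterator sends every batch, even after an error was yielded.
collected = list(errors)
requests_after_drain = list(sent)

# Stopping at the first error leaves the remaining batch unsent.
sent.clear()
first_error = next(fake_remove_objects(names, send_batch), None)
requests_until_first_error = list(sent)
```

In this sketch, an iterator that is never read sends no requests at all, a fully drained iterator sends all three batches, and a consumer that stops at the first error skips the third batch. Under the same assumptions, a run with no errors still has to be drained (for example with `list(...)`) for the requests to be sent.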
After `remove_objects` was called, no errors were returned and the object was not removed as expected.
My solution was to call `remove_object` for each object in `delete_object_list`:
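from minio.deleteobjects import DeleteObject

# client, MLFLOW_BUCKET, obj and DRY_RUN are defined elsewhere.
delete_object_list = map(
    lambda x: DeleteObject(x.object_name),
    client.list_objects(MLFLOW_BUCKET, obj.object_name, recursive=True),
)
for del_obj in delete_object_list:
    print(del_obj._name)  # _name is a private attribute of DeleteObject
    if not DRY_RUN:
        # One request per object instead of one batched request.
        client.remove_object(MLFLOW_BUCKET, del_obj._name)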
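Given the explanation earlier in the thread, a likely cause of "no errors and nothing removed" is that the iterator returned by `remove_objects()` was never read. A minimal sketch of the batched alternative is to drain that iterator explicitly. The helper name `remove_and_collect_errors` and the `StubClient` class are hypothetical and exist only for this illustration; with the real library, `client` would be a `Minio` instance and the items would be `DeleteObject` instances as in the snippet above.

```python
def remove_and_collect_errors(client, bucket_name, delete_objects):
    """Call remove_objects() and read the whole result so every batch is sent.

    Returns the reported errors as a list (empty if there were none).
    """
    return list(client.remove_objects(bucket_name, delete_objects))


class StubClient:
    """Hypothetical stand-in for a Minio client, used only in this demo."""

    def __init__(self):
        self.removed = []

    def remove_objects(self, bucket_name, delete_objects):
        # Generator function: this body runs only when the result is iterated.
        for item in delete_objects:
            self.removed.append(item)
        yield from ()  # reports no errors, but keeps the function lazy


stub = StubClient()

# Result ignored: the generator body never runs, so nothing is removed.
stub.remove_objects("bucket", ["a", "b"])
removed_without_iteration = list(stub.removed)

# Result drained: the removals happen and the (empty) error list comes back.
errors = remove_and_collect_errors(stub, "bucket", ["a", "b"])
```

With a real client, the same helper could be called with the `delete_object_list` built above, and any returned errors printed or logged. Compared with calling `remove_object` per object, this keeps the batching of up to 1000 objects per request.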