minio-py

Clarification for docs on `remove_objects()` API

Open ravindk89 opened this issue 2 years ago • 3 comments

https://github.com/minio/minio-py/blob/master/docs/API.md?plain=1#L1473-L1475

As per internal discussion, it's not clear from the docs that this API is "lazy", in that it only fires if the user iterates the returned errors.

It also raises a few questions:

  • If there are no errors, does the command still require iterating the empty return array to fire?
  • If there are some errors, does the API partially execute up until it hits the first error?
  • Is there a way to direct the API to continue on error?

Based on that, we should update the docs for this API and for any others that require iterating the response to fire (i.e. "lazy" APIs).

ravindk89, Dec 27 '22 17:12

@balamurugana your insight here would be helpful.

ravindk89, Dec 27 '22 17:12

The DeleteObjects S3 API deletes multiple objects in one request, up to a maximum of 1000 objects. In minio-py, remove_objects() is a higher-level method that supports removing more than 1000 objects. It does this by sending multiple DeleteObjects S3 API requests sequentially, each containing up to 1000 objects. Any errors the S3 server returns while executing these requests are yielded. A consumer who prefers to stop before the next batch is removed can simply stop iterating.

Because we yield the errors, execution is lazy: the returned iterator must be read for the internal DeleteObjects S3 API requests to keep firing. If you want more control than remove_objects() provides, you can use the low-level _delete_objects() method, either by subclassing Minio or by ignoring the warning about using a private method.
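The lazy behaviour described above can be illustrated with a plain Python generator. This is a minimal sketch and not minio-py's actual implementation: `fake_remove_objects`, `send_batch` and `BATCH_SIZE` are hypothetical stand-ins for remove_objects(), a single DeleteObjects request, and the 1000-object limit.

```python
from itertools import islice

BATCH_SIZE = 1000  # DeleteObjects accepts at most 1000 keys per request


def fake_remove_objects(names, send_batch):
    """Hypothetical stand-in for remove_objects(): a generator of errors.

    send_batch(batch) simulates one DeleteObjects request and returns
    the list of errors for that batch.
    """
    iterator = iter(names)
    while True:
        batch = list(islice(iterator, BATCH_SIZE))
        if not batch:
            return
        # No request is sent until the caller starts iterating.
        for error in send_batch(batch):
            yield error


sent = []  # records the size of every simulated request


def send_batch(batch):
    sent.append(len(batch))
    # Pretend that deleting "bad" fails and everything else succeeds.
    return [f"failed: {name}" for name in batch if name == "bad"]


names = [f"obj-{i}" for i in range(2500)]

# Calling the function alone sends nothing.
errors = fake_remove_objects(names, send_batch)
requests_before = list(sent)

# Consuming the iterator sends every batch, even though no error is yielded.
collected = list(errors)
requests_after = list(sent)

# Stopping at the first error leaves the later batches unsent.
sent.clear()
for error in fake_remove_objects(["bad"] + names, send_batch):
    break
requests_after_break = list(sent)
```

With the real client the same rule applies: loop over the value returned by remove_objects() (for example `for error in client.remove_objects(bucket, delete_list): print(error)`), even when no errors are expected.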

balamurugana, Dec 28 '22 02:12

After I called remove_objects, no errors were returned, but the object was not removed, contrary to what I expected.

My solution was to call remove_object for each object in delete_object_list:

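from minio.deleteobjects import DeleteObject

# client, MLFLOW_BUCKET, obj and DRY_RUN are defined elsewhere in the script.
delete_object_list = map(
    lambda x: DeleteObject(x.object_name),
    client.list_objects(MLFLOW_BUCKET, obj.object_name, recursive=True),
)

# Remove the objects one at a time; remove_object() runs immediately.
for del_obj in delete_object_list:
    print(del_obj._name)
    if not DRY_RUN:
        client.remove_object(MLFLOW_BUCKET, del_obj._name)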
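One possible explanation, given the lazy behaviour described earlier in this thread, is that the iterator returned by remove_objects() was never consumed. Below is a minimal sketch of a wrapper that always drains it. `remove_all` is a hypothetical helper, and `_FakeClient` exists only so the snippet runs without a server; with minio-py you would pass a real Minio client and an iterable of DeleteObject instead.

```python
def remove_all(client, bucket, delete_objects):
    """Hypothetical helper: call remove_objects() and drain its iterator.

    Returns the list of errors (empty if every deletion succeeded).
    """
    # list() consumes the lazy iterator, which is what triggers the requests.
    return list(client.remove_objects(bucket, delete_objects))


class _FakeClient:
    """Stand-in for a Minio client, for demonstration only."""

    def __init__(self):
        self.deleted = []

    def remove_objects(self, bucket, delete_objects):
        # Generator body: runs only while the caller iterates.
        for obj in delete_objects:
            self.deleted.append((bucket, obj))
        yield from ()  # no errors; the yield makes this a generator


client = _FakeClient()

# Not iterated, so nothing is deleted.
client.remove_objects("my-bucket", ["a", "b"])
deleted_without_iterating = list(client.deleted)

# Drained through the helper, so both objects are deleted.
errors = remove_all(client, "my-bucket", ["a", "b"])
```

Returning the errors as a list keeps the call site to a single line while still letting the caller inspect any failures.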

paramazo, Oct 18 '23 07:10