S3: Use list_objects_v2 to list objects

Open pjbull opened this issue 4 years ago • 0 comments

The v1 list_objects API can cause consistency problems. See this answer for context: https://stackoverflow.com/a/67412931/1692709

We currently use list_objects for non-recursive cases: https://github.com/drivendataorg/cloudpathlib/blob/80f7afdf85dfb4f3ad0406944a5d3cf28c727435/cloudpathlib/s3/s3client.py#L147

We use the bucket filter in recursive cases: https://github.com/drivendataorg/cloudpathlib/blob/80f7afdf85dfb4f3ad0406944a5d3cf28c727435/cloudpathlib/s3/s3client.py#L136

We should replace both code paths with self.client.get_paginator('list_objects_v2').

Here are the boto3 docs: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.list_objects_v2
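
A rough sketch of how one paginator could cover both cases, not the actual cloudpathlib implementation. The function name `list_dir_v2` and its parameters are hypothetical, and `client` is assumed to be a boto3 S3 client (e.g. `boto3.client("s3")`). The only difference between the two modes is whether `Delimiter="/"` is passed:

```python
from typing import Any, Iterator


def list_dir_v2(
    client: Any, bucket: str, prefix: str = "", recursive: bool = False
) -> Iterator[str]:
    """Yield s3:// URIs under `prefix` via the list_objects_v2 paginator.

    Hypothetical helper: `client` is assumed to be a boto3 S3 client.
    """
    # Treat the prefix as a "directory" so sibling keys are not matched.
    if prefix and not prefix.endswith("/"):
        prefix += "/"

    kwargs = {"Bucket": bucket, "Prefix": prefix}
    if not recursive:
        # With a delimiter, deeper keys are rolled up into CommonPrefixes.
        kwargs["Delimiter"] = "/"

    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(**kwargs):
        # CommonPrefixes only appears when a Delimiter was supplied.
        for common in page.get("CommonPrefixes", []):
            yield f"s3://{bucket}/{common['Prefix']}"
        # Contents is missing from pages that hold no objects.
        for obj in page.get("Contents", []):
            yield f"s3://{bucket}/{obj['Key']}"
```

Usage would look something like `list(list_dir_v2(boto3.client("s3"), "my-bucket", "some/prefix", recursive=True))`, where the bucket and prefix are placeholders.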

pjbull, Aug 02 '21 15:08