Use iterator for s3 object collection

Open dazza-codes opened this issue 5 years ago • 6 comments

>>> objects
s3.Bucket.objectsCollection(s3.Bucket(name='my-project'), s3.ObjectSummary)

>>> print(next(objects))
TypeError: 's3.Bucket.objectsCollection' object is not an iterator

https://wiki.python.org/moin/Iterator

It does support iter(objects) wrapping, e.g.

>>> obj_iter = iter(objects)
>>> obj_iter
<generator object ResourceCollection.__iter__ at 0x7f57efbff660>

But why is this necessary?

dazza-codes avatar Mar 13 '19 21:03 dazza-codes

The collection is not meant to be consumed directly; instead, a filter of some kind is intended to be put on the end:

https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Bucket.objects

Examples:

>>> objects = s3.Bucket(name='test').objects.filter(Prefix='file0.zip')
>>> for o in objects:
...     print(o)
...
s3.ObjectSummary(bucket_name='test', key='file0.zip')

Or all of them:

>>> objects = s3.Bucket(name='test').objects.all()
>>> for o in objects:
...     print(o)
...
s3.ObjectSummary(bucket_name='test', key='file0.zip')
s3.ObjectSummary(bucket_name='test', key='file1.zip')
s3.ObjectSummary(bucket_name='test', key='file3.zip')
s3.ObjectSummary(bucket_name='test', key='file4.zip')

stealthycoin avatar Mar 15 '19 17:03 stealthycoin

Not sure if the latest release already supports this, but it seems like all() should return an iterator, so that next() works on it directly:

objects = s3.Bucket(name='test').objects.all()
next(objects)  # expected: s3.ObjectSummary

The docs, e.g. https://boto3.amazonaws.com/v1/documentation/api/latest/guide/collections.html, explicitly state that "a collection provides an iterable interface to a group of resources."

It seems the distinction between an iterable and an iterator is important here [1], and the question is which one the collections are intended to be. If the intention is for them to be iterables and not iterators, this works as designed; feel free to close this issue.

[1] https://www.geeksforgeeks.org/python-difference-iterable-iterator/
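
For anyone following along, the distinction can be reproduced without AWS access. The sketch below uses a hypothetical stand-in class, FakeCollection (not part of boto3), whose __iter__ is a generator function, mirroring the "generator object ResourceCollection.__iter__" repr shown earlier in the thread. It is an illustration under that assumption, not boto3's actual implementation.

```python
class FakeCollection:
    """Hypothetical stand-in for a boto3 collection; not boto3 code."""

    def __init__(self, keys):
        self._keys = list(keys)

    def __iter__(self):
        # A generator function: each call returns a fresh iterator.
        for key in self._keys:
            yield key


objects = FakeCollection(["file0.zip", "file1.zip"])

# An iterable only promises __iter__; it has no __next__, so next() fails.
try:
    next(objects)
    next_failed = False
except TypeError:
    next_failed = True

# iter() asks the iterable for an iterator, which next() accepts.
first = next(iter(objects))

# Because every iter() call starts over, the collection can be looped twice.
first_pass = list(objects)
second_pass = list(objects)
```

In this sketch, the same object can be looped over repeatedly, which a one-shot iterator would not allow. That may be one reason to keep a collection iterable rather than making it an iterator.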

dazza-codes avatar Mar 21 '19 22:03 dazza-codes

Yeah, you should definitely be able to call next(). Marking as a feature request.

JordonPhillips avatar Apr 01 '19 16:04 JordonPhillips

Until this is natively supported, I think you can use next(x for x in objects). Reference: https://stackoverflow.com/a/2364277/3679900
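
One caveat: next() raises StopIteration when the iterable is empty, for example when a prefix matches no keys. Passing a default as the second argument avoids that. A minimal sketch, with plain lists standing in for the collection and a hypothetical helper name, first_or_none:

```python
def first_or_none(iterable):
    """Return the first item of any iterable, or None if it is empty."""
    return next(iter(iterable), None)


# Plain lists stand in for a boto3 collection here; any iterable works.
objects = ["file0.zip", "file1.zip"]
empty = []

first = first_or_none(objects)
missing = first_or_none(empty)
```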

y2k-shubham avatar Jun 04 '19 11:06 y2k-shubham

Alternatively, you may also use something like:

...
bucket_iter = iter(objects)
next(bucket_iter)
...

This works because all generators (which boto3 uses to yield API results) are iterators, and all iterators are iterables.
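
That relationship can be checked with the abstract base classes in the standard library. The sketch below uses a hypothetical generator function, fake_pages, in place of a real boto3 call, so it runs without credentials:

```python
from collections.abc import Generator, Iterable, Iterator
from itertools import islice


def fake_pages():
    """Hypothetical generator standing in for paginated API results."""
    yield "file0.zip"
    yield "file1.zip"
    yield "file3.zip"


gen = fake_pages()

# A generator object satisfies all three interfaces.
is_generator = isinstance(gen, Generator)
is_iterator = isinstance(gen, Iterator)
is_iterable = isinstance(gen, Iterable)

# The reverse does not hold: a list is iterable but not an iterator.
list_is_iterator = isinstance([], Iterator)

# islice takes the first n items and leaves the rest unconsumed.
first_two = list(islice(gen, 2))
remaining = list(gen)
```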

Aeternitaas avatar Aug 31 '19 07:08 Aeternitaas

map() should also be able to operate over all(), shouldn't it?
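
For what it is worth, map() accepts any iterable, not only iterators, so it should not depend on next() working on the collection itself. A small sketch with hypothetical stand-ins (FakeSummary and FakeCollection are illustrative names, not boto3 classes):

```python
from collections import namedtuple

# Hypothetical stand-in for s3.ObjectSummary; only two fields are modeled.
FakeSummary = namedtuple("FakeSummary", ["bucket_name", "key"])


class FakeCollection:
    """Iterable but not an iterator, like the collection in this issue."""

    def __iter__(self):
        yield FakeSummary("test", "file0.zip")
        yield FakeSummary("test", "file1.zip")


# map() calls iter() on its argument internally, so no explicit wrapping is needed.
keys = list(map(lambda summary: summary.key, FakeCollection()))
```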

MikeWhittakerRyff avatar Apr 09 '20 11:04 MikeWhittakerRyff